AI · 4 min read · April 21, 2026
Weak Labels Fail Across Time Even When Domain Transfer Works
A study of CRISPR experiments reveals that supervision drift, where the labeling mechanism itself shifts, causes model collapse under temporal transfer despite strong in-domain performance.
Source: arxiv/cs.LG · Mehrdad Shoeibi, Elias Hossain, Ivan Garibay, Niloofar Yousefi · open original ↗
Weak supervision works within a domain but fails over time because the labeling mechanism itself drifts, not just the data distribution.
- Supervision drift occurs when P(y|x,c) changes across contexts, distinct from standard covariate shift.
- CRISPR-Cas13d transcriptomics benchmark shows in-domain weak-label learning achieves R² ≈ 0.36, Spearman ρ ≈ 0.44.
- Cross-cell-line transfer partially succeeds (ρ ≈ 0.40), but temporal transfer collapses (R² = −0.15 to −0.32).
- Feature-label associations remain stable across cell lines but shift sharply over time.
- Feature stability diagnostics can flag non-transferability before deployment without retraining.
- Strong in-domain metrics mask downstream failure risk under temporal distribution shift.
- Externally recomputed labels and shift-score analysis confirm supervision drift, not model capacity limits.
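The feature-stability diagnostic above can be sketched in a few lines: compare each feature's correlation with the labels across two time periods and flag features whose association shifts sharply. This is a minimal illustrative sketch, not the paper's implementation; the function name, the use of Pearson correlation (the paper reports Spearman ρ), and the flagging threshold are all assumptions.

```python
# Hypothetical feature-stability diagnostic (illustrative, not the paper's code):
# flag features whose feature-label correlation shifts sharply between periods.
import numpy as np

def feature_stability(X_old, y_old, X_new, y_new, threshold=0.5):
    """Return per-feature correlations in each period and an instability flag.

    A feature is flagged unstable when its feature-label correlation moves by
    more than `threshold` between periods (e.g. a sign flip over time).
    """
    def per_feature_corr(X, y):
        # Pearson correlation of each column of X with y
        Xc = X - X.mean(axis=0)
        yc = y - y.mean()
        num = Xc.T @ yc
        den = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
        return num / den

    r_old = per_feature_corr(X_old, y_old)
    r_new = per_feature_corr(X_new, y_new)
    unstable = np.abs(r_old - r_new) > threshold
    return r_old, r_new, unstable

# Toy data: feature 0 keeps its association, feature 1 flips sign over time.
rng = np.random.default_rng(0)
X_old = rng.normal(size=(500, 2))
y_old = X_old[:, 0] + X_old[:, 1] + 0.1 * rng.normal(size=500)
X_new = rng.normal(size=(500, 2))
y_new = X_new[:, 0] - X_new[:, 1] + 0.1 * rng.normal(size=500)  # labeling rule drifted

r_old, r_new, unstable = feature_stability(X_old, y_old, X_new, y_new)
print(unstable)  # only feature 1 is flagged
```

Because the diagnostic uses only feature-label correlations, it can run on newly labeled data before any retraining, which matches the paper's claim that non-transferability can be flagged ahead of deployment.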
Frequently asked
- How does supervision drift differ from covariate shift? Supervision drift occurs when the relationship between features and labels—the labeling mechanism itself—changes across contexts, such as over time. Covariate shift refers only to changes in the feature distribution P(x). A model can handle covariate shift yet fail under supervision drift, because the target function P(y|x,c) is no longer stable and historical labels become unreliable guides for new data.
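The distinction can be made concrete with a toy simulation (my own illustrative construction, not from the paper): a linear model fit on old data still scores well when only P(x) moves, but its R² turns strongly negative when the labeling rule itself flips, mirroring the temporal collapse reported in the study.

```python
# Toy contrast between covariate shift and supervision drift (assumed setup):
# a model trained on y = 2x survives a shift in P(x) but collapses when the
# labeling rule changes to y = -2x.
import numpy as np

rng = np.random.default_rng(1)

def fit(X, y):
    # ordinary least squares with an intercept column
    A = np.c_[X, np.ones(len(X))]
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return w

def r2(X, y, w):
    pred = np.c_[X, np.ones(len(X))] @ w
    ss_res = ((y - pred) ** 2).sum()
    ss_tot = ((y - y.mean()) ** 2).sum()
    return 1 - ss_res / ss_tot

# Training data: labeling rule is y = 2x + noise
X_train = rng.normal(0, 1, size=(1000, 1))
y_train = 2 * X_train[:, 0] + 0.1 * rng.normal(size=1000)
w = fit(X_train, y_train)

# Covariate shift: P(x) moves, but the labeling rule is unchanged
X_cov = rng.normal(3, 2, size=(1000, 1))
y_cov = 2 * X_cov[:, 0] + 0.1 * rng.normal(size=1000)

# Supervision drift: same P(x), but the labeling rule has flipped
X_drift = rng.normal(0, 1, size=(1000, 1))
y_drift = -2 * X_drift[:, 0] + 0.1 * rng.normal(size=1000)

print(round(r2(X_cov, y_cov, w), 2))    # near 1.0: transfer succeeds
print(round(r2(X_drift, y_drift, w), 2))  # strongly negative: collapse
```

A negative R² here means the stale model predicts worse than simply guessing the mean of the new labels, which is exactly the failure mode the paper observes in temporal transfer.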