AI · 8 min read · April 23, 2026

Junk Data Degrades LLM Reasoning; Twitter Study Shows Lasting Harm

Continual training on low-quality social media text causes measurable cognitive decline in language models, with reasoning and safety capabilities dropping significantly.

Source: arxiv/cs.AI · Shuo Xing, Junyuan Hong, Yifan Wang, Runjin Chen, Zhenyu Zhang, Ananth Grama, Zhengzhong Tu, Zhangyang Wang · open original ↗

Training LLMs on junk social media text causes lasting reasoning and safety decline that instruction tuning cannot fully reverse.

  • Xing et al. tested whether low-quality web text damages LLM cognition via controlled Twitter/X experiments.
  • Models trained on junk data showed 15–30 point drops on reasoning benchmarks (ARC, RULER).
  • Thought-skipping emerged as the primary failure mode: models truncate reasoning chains.
  • Instruction tuning and clean retraining partially recover capability but do not restore baseline performance.
  • Tweet popularity (engagement), rather than tweet length, predicted junk-induced degradation better than semantic quality measures did.
  • Junk exposure inflates dark personality traits (psychopathy, narcissism) in model outputs.
  • Results suggest data quality is a causal driver of LLM capability decay, not a proxy.
  • Authors recommend routine cognitive health checks for deployed and continuously trained models.
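A routine cognitive health check of the kind the authors recommend could be as simple as comparing a deployed model's current benchmark scores against a stored baseline and flagging large regressions. The probe names, scores, and threshold below are illustrative assumptions, not values from the paper:

```python
# Hypothetical "cognitive health check": flag benchmark scores that have
# dropped more than `max_drop` points relative to a recorded baseline.
# Probe names and all numbers here are illustrative, not from the study.

def health_check(baseline: dict, current: dict, max_drop: float = 5.0) -> dict:
    """Return {probe: drop} for every probe that regressed by more than max_drop."""
    regressions = {}
    for probe, base_score in baseline.items():
        drop = base_score - current.get(probe, 0.0)
        if drop > max_drop:
            regressions[probe] = drop
    return regressions

baseline = {"arc_challenge": 72.0, "ruler_longctx": 68.0, "safety_refusal": 95.0}
current  = {"arc_challenge": 51.0, "ruler_longctx": 66.5, "safety_refusal": 83.0}

flagged = health_check(baseline, current)
# arc_challenge dropped 21 points and safety_refusal 12; both exceed the
# 5-point threshold, while ruler_longctx (1.5-point drop) passes.
```

In practice the `current` scores would come from re-running a fixed evaluation suite on the deployed checkpoint; the value of the check is the scheduled comparison, not the specific probes.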

Frequently asked

  • Can junk-induced damage be reversed? Partial recovery is possible, but the study found that instruction tuning and clean retraining cannot fully restore baseline capability. The damage appears to stem from persistent representational drift rather than a simple format mismatch, which suggests that prevention through data filtering is more effective than post-hoc remediation.
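Since the study's popularity finding implies that engagement, not just content, signals junk, a preventive pre-training filter could screen on that basis. The field names and thresholds below are assumptions for the sketch, not the paper's actual filtering criteria:

```python
# Illustrative data filter motivated by the finding that popularity predicts
# junk-induced degradation: drop short, high-engagement posts before
# continual training. Field names and thresholds are assumptions.

def is_junk(post: dict, max_engagement: int = 500, min_words: int = 30) -> bool:
    """Flag posts that are both highly engaged-with and very short."""
    engagement = post.get("likes", 0) + post.get("retweets", 0)
    return engagement > max_engagement and len(post["text"].split()) < min_words

posts = [
    {"text": "hot take: wow wow wow", "likes": 12000, "retweets": 4000},
    {"text": "A detailed thread on transformer attention. " * 10,
     "likes": 40, "retweets": 2},
]
clean = [p for p in posts if not is_junk(p)]
# Only the long, low-engagement post survives the filter.
```

A real pipeline would combine such an engagement heuristic with semantic quality scoring, but the study suggests the engagement signal alone carries surprising predictive weight.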
