AI · 4 min read · 20 April 2026

Interpretable Traces Don't Guarantee Better LLM Reasoning

Research shows Chain-of-Thought traces improve model performance but confuse users, and correctness of intermediate steps barely predicts final accuracy.

Source: arXiv/cs.AI · Siddhant Bhambri, Upasana Biswas, Subbarao Kambhampati · open original ↗

Correct reasoning traces don't reliably improve LLM accuracy, and verbose traces confuse users despite boosting model performance.

  • Correct intermediate reasoning steps predicted correct final answers only 28% of the time in controlled QA experiments.
  • Incorrect traces failed to consistently degrade model accuracy, suggesting trace semantics matter less than assumed.
  • Verbose DeepSeek R1 traces yielded the best model performance but scored lowest on user interpretability (3.39/5).
  • Decomposed, human-readable traces were rated more interpretable but did not match verbose trace performance gains.
  • High cognitive load (4.59/5) accompanied verbose traces despite their training effectiveness.
  • Current practice conflates model supervision objectives with end-user-facing explanation design.
  • Trace correctness and interpretability operate as separate, sometimes opposing, optimization targets.
  • Fine-tuning datasets with verifiably correct versus incorrect traces revealed the disconnect empirically.
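The headline statistic above is a conditional accuracy: of the answers preceded by a correct trace, how many were themselves correct. A minimal sketch of that measurement, using hypothetical labeled records rather than the study's data (the field names and sample values are illustrative assumptions):

```python
# Illustrative records: each pairs a trace-correctness label with an
# answer-correctness label, as in the paper's controlled QA setup.
# These values are made up for demonstration, not the study's data.
records = [
    {"trace_correct": True,  "answer_correct": True},
    {"trace_correct": True,  "answer_correct": False},
    {"trace_correct": False, "answer_correct": False},
    {"trace_correct": False, "answer_correct": False},
]

def conditional_accuracy(records, trace_correct):
    """Fraction of final answers that are correct, restricted to
    records whose trace-correctness label equals `trace_correct`."""
    subset = [r for r in records if r["trace_correct"] == trace_correct]
    if not subset:
        return 0.0
    return sum(r["answer_correct"] for r in subset) / len(subset)

print(conditional_accuracy(records, True))   # accuracy given a correct trace
print(conditional_accuracy(records, False))  # accuracy given an incorrect trace
```

If trace semantics drove final answers, the first number would approach 1.0 and the second 0.0; the study's reported 28% for correct traces is what makes the coupling look weak.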

Frequently asked questions

  • Do correct reasoning traces guarantee correct final answers? No. In the study, correct intermediate reasoning steps led to correct final answers only 28% of the time. This suggests that LLMs may rely on patterns or memorization rather than genuinely following the reasoning steps. Trace correctness and final accuracy are weakly coupled.
