AI · 8 min read · April 21, 2026

Chain-of-Thought Supervision Eliminates Sample Complexity Growth

New theoretical analysis shows intermediate reasoning steps remove dependence on generation length, while end-to-end learning scales unpredictably with sequence depth.

Source: arxiv/cs.LG · Steve Hanneke, Idan Mehalel, Shay Moran · open original ↗

Chain-of-Thought supervision decouples sample complexity from generation length; end-to-end learning exhibits variable scaling.

  • Autoregressive models learn by iterating a next-token generator T times; the final token is the output.
  • End-to-End supervision reveals only final outputs; Chain-of-Thought reveals all intermediate tokens.
  • End-to-End sample complexity can scale anywhere from constant to linear with generation length T.
  • Chain-of-Thought supervision makes sample complexity independent of T entirely.
  • Intermediate reasoning access eliminates the penalty of longer generation chains.
  • Analysis resolves open questions about how generation length affects learnability.
  • New combinatorial tools introduced to characterize this taxonomy of scaling behaviors.
  • Result applies to PAC-learning framework for next-token prediction systems.
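The generation and supervision setup described above can be sketched in a few lines (a minimal toy, assuming a deterministic next-token function over a small vocabulary; the function and variable names are illustrative, not from the paper):

```python
def generate(h, x, T):
    """Iterate a next-token generator h for T steps, starting from prompt token x."""
    seq = [x]
    for _ in range(T):
        seq.append(h(seq[-1]))
    return seq  # seq[1:] are the T generated tokens

def end_to_end_example(h, x, T):
    # End-to-End supervision: the learner observes only the final token.
    return (x, generate(h, x, T)[-1])

def chain_of_thought_example(h, x, T):
    # Chain-of-Thought supervision: the learner observes all T generated tokens.
    return (x, generate(h, x, T)[1:])

# Example: a generator that increments modulo 5.
h = lambda t: (t + 1) % 5
print(end_to_end_example(h, 0, 7))        # (0, 2)
print(chain_of_thought_example(h, 0, 7))  # (0, [1, 2, 3, 4, 0, 1, 2])
```

The same run of the generator thus yields very different training signals: one labeled pair under End-to-End supervision, versus a full trace of T labeled steps under Chain-of-Thought.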

Frequently asked questions

  • How do End-to-End and Chain-of-Thought supervision differ? End-to-End supervision provides only the final output token after a model generates T intermediate tokens, while Chain-of-Thought supervision reveals all T intermediate tokens produced during generation. The paper shows that access to intermediate tokens eliminates the penalty of longer generation chains on sample complexity, whereas end-to-end learning may require more data as chains grow.
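The intuition behind the difference can be made concrete with a toy construction (my own illustration, not the paper's combinatorial tools): a Chain-of-Thought trace yields one labeled (state, next-token) pair per step, so it constrains the generator directly, while End-to-End supervision only constrains the T-fold composition, which many distinct generators may share.

```python
from itertools import product

STATES = [0, 1, 2]
# Hypothesis class: all functions from STATES to STATES (27 in total).
H = [dict(zip(STATES, vals)) for vals in product(STATES, repeat=3)]

def run(h, x, T):
    """Apply generator h for T steps and return the trace of visited tokens."""
    trace = []
    for _ in range(T):
        x = h[x]
        trace.append(x)
    return trace

target = {0: 1, 1: 2, 2: 0}   # the true generator
T = 3
trace = run(target, 0, T)     # the trace [1, 2, 0]

# End-to-End: keep hypotheses that match only the final token.
e2e = [h for h in H if run(h, 0, T)[-1] == trace[-1]]
# Chain-of-Thought: keep hypotheses that match every intermediate token.
cot = [h for h in H if run(h, 0, T) == trace]

print(len(H), len(e2e), len(cot))  # 27 hypotheses; 11 survive E2E, 1 survives CoT
```

A single trace pins down the generator exactly under Chain-of-Thought supervision, while eleven hypotheses remain consistent with the End-to-End label; as T grows, intermediate tokens keep paying for themselves while final-token labels do not.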
