AI · 4 min read · April 24, 2026
Cross-Entropy Loss Drives Neural Probe Performance, Not Architecture
Pre-registered study shows cross-entropy training inflates logit norms 15x, accounting for most K-way energy probe gains over softmax baselines.
Cross-entropy loss, not bidirectional inference, drives K-way energy probe performance gains; logit scaling explains two-thirds of the effect.
- The K-way energy probe reduction depends critically on cross-entropy at the output layer.
- Removing cross-entropy halves the probe-softmax gap; MSE training produces 15x smaller logit norms.
- Bidirectional predictive coding shows a probe advantage but lacks the expected increase in latent movement.
- Temperature scaling removes 66% of the probe-softmax gap; the remaining 34% reflects representation ranking quality (see the sketch after this list).
- The study was pre-registered to test the sensitivity of the theoretical reduction to architectural changes.
- Standard predictive coding with MSE replicates the negative result; bPC shows a positive but mechanistically unclear result.
- Logit-scale effects dominate; scale-invariant ranking effects are secondary.
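To illustrate the temperature-scaling analysis behind the 66%/34% split, here is a minimal sketch (not the study's code): a single scalar temperature is fit on held-out logits by minimizing negative log-likelihood, and any gap that survives rescaling is scale-invariant by construction. The names `val_logits`, `val_labels`, and `test_logits` are placeholders, not names from the study.

```python
# Minimal temperature-scaling sketch (assumed setup, not the study's code).
import torch
import torch.nn.functional as F

def fit_temperature(val_logits: torch.Tensor, val_labels: torch.Tensor) -> float:
    """Fit a scalar temperature T that minimizes NLL of softmax(logits / T)."""
    log_t = torch.zeros(1, requires_grad=True)  # optimize log T so T stays positive
    opt = torch.optim.LBFGS([log_t], lr=0.1, max_iter=50)

    def closure():
        opt.zero_grad()
        loss = F.cross_entropy(val_logits / log_t.exp(), val_labels)
        loss.backward()
        return loss

    opt.step(closure)
    return log_t.exp().item()

# Usage (placeholder tensors): rescaling removes the logit-scale component of the
# probe-softmax gap; whatever gap remains reflects ranking quality alone.
# T = fit_temperature(val_logits, val_labels)
# calibrated_probs = F.softmax(test_logits / T, dim=-1)
```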
Frequently asked questions
- Why does cross-entropy training produce much larger logit norms than MSE or bPC? Cross-entropy keeps penalizing any probability mass assigned to incorrect classes no matter how confident the model already is, so the only way to keep driving the loss down is to produce larger-magnitude outputs. MSE and bidirectional predictive coding do not apply this pressure, which results in smaller logit scales. This is a direct consequence of the loss function's gradient structure, not of the model architecture.
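To make that gradient-pressure argument concrete, here is a toy sketch using synthetic, linearly separable data and a plain linear classifier (not the study's models or data): the same model trained with cross-entropy ends up with much larger logit norms than when trained with MSE against one-hot targets.

```python
# Toy comparison of logit scale under cross-entropy vs MSE (assumed synthetic setup).
import torch
import torch.nn.functional as F

torch.manual_seed(0)
X = torch.randn(512, 20)
W_true = torch.randn(20, 5)
y = (X @ W_true).argmax(dim=-1)  # linearly separable synthetic labels

def train(loss_name: str, steps: int = 2000) -> float:
    model = torch.nn.Linear(20, 5)
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    for _ in range(steps):
        logits = model(X)
        if loss_name == "ce":
            # Cross-entropy keeps penalizing residual mass on wrong classes,
            # so on separable data the logit scale keeps growing.
            loss = F.cross_entropy(logits, y)
        else:
            # MSE on one-hot targets is satisfied once logits reach roughly 0/1,
            # so there is no pressure toward large logit magnitudes.
            loss = F.mse_loss(logits, F.one_hot(y, num_classes=5).float())
        opt.zero_grad()
        loss.backward()
        opt.step()
    with torch.no_grad():
        return model(X).norm(dim=-1).mean().item()

print("mean logit norm, cross-entropy:", round(train("ce"), 2))
print("mean logit norm, MSE          :", round(train("mse"), 2))
```

The exact ratio depends on the data and optimizer, so this sketch only shows the direction of the effect, not the 15x figure reported in the study.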