AI · 8 min read · April 24, 2026
Supervised Learning Has Built-In Geometric Blindness
A mathematical proof shows that empirical risk minimization must preserve sensitivity to label-correlated but test-irrelevant features: a structural constraint of the supervised objective, not a training bug.
- Empirical risk minimization (ERM) necessarily induces Jacobian sensitivity in directions that correlate with training labels but are irrelevant at test time.
- This single constraint unifies four empirical phenomena usually treated separately: non-robust features, texture bias, corruption fragility, and the robustness-accuracy tradeoff.
- The Trajectory Deviation Index (TDI) measures this blind spot directly; standard metrics such as the Jacobian Frobenius norm miss it.
- PGD adversarial training achieves high Jacobian magnitude but poor clean-input geometry (TDI 1.336, vs. 0.904 for PMH).
- The blind spot worsens in larger language models (ratio 0.860 → 0.742 from 66M to 340M parameters).
- Task-specific ERM fine-tuning amplifies the blind spot by 54%; PMH repairs it 11x with a single Gaussian-form training term.
- The defect appears at foundation-model scale across vision, NLP, and multimodal architectures (CLIP, DINO, SAM, ViT-B/16).
- Proposition 5 proves the repair term is the unique perturbation law that uniformly penalizes the encoder Jacobian.
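The article does not reproduce PMH's exact training term, but the "Gaussian-form perturbation law that uniformly penalizes the encoder Jacobian" claim rests on a standard identity: for Gaussian perturbations ε ~ N(0, I), the expected squared response E‖f(x + σε) − f(x)‖²/σ² approaches the squared Frobenius norm of the encoder's Jacobian as σ → 0, weighting every direction equally. A minimal numpy sketch with a toy linear encoder (the encoder `f` and all sizes below are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 6))        # toy linear encoder: f(x) = x @ A.T

def f(x):
    # Works on a single input row or a batch of rows.
    return x @ A.T

x = rng.normal(size=6)
sigma, n = 1e-3, 20_000
eps = rng.normal(size=(n, 6))      # Gaussian-form perturbations

# Monte Carlo estimate of E || f(x + sigma*eps) - f(x) ||^2 / sigma^2.
est = np.mean(np.sum((f(x + sigma * eps) - f(x)) ** 2, axis=1)) / sigma**2

# For this linear f, the Jacobian is A everywhere, so the penalty should
# track ||A||_F^2 -- every Jacobian direction is penalized uniformly.
frob2 = np.sum(A ** 2)
assert abs(est - frob2) / frob2 < 0.05
```

Used as a training regularizer, such a term penalizes sensitivity in all input directions at once, including the label-correlated but test-irrelevant ones that, per the bullets above, standard objectives leave untouched.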
Frequently asked
- Why does the blind spot exist? It is a mathematical necessity of empirical risk minimization: any encoder trained to minimize a supervised loss must retain non-zero sensitivity (a non-zero Jacobian component) in directions that correlate with training labels but are irrelevant at test time. This is not a bug in current methods but a structural property of the supervised objective itself, proven in Theorem 1.
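The flavor of that necessity can be seen in a toy example (an illustration under simplifying assumptions, not the paper's Theorem 1): when two noisy features both correlate with the training label, the risk-minimizing linear predictor provably assigns weight to both, so its Jacobian stays sensitive to the spurious one even if that feature is decorrelated from the label at test time.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5_000
y = rng.normal(size=n)                    # latent label
x_core = y + 0.5 * rng.normal(size=n)     # causal feature, still valid at test
x_spur = y + 0.5 * rng.normal(size=n)     # correlates with y only in training

X = np.stack([x_core, x_spur], axis=1)
w, *_ = np.linalg.lstsq(X, y, rcond=None) # ERM: ordinary least squares

# The Jacobian of the linear predictor is just w. ERM assigns substantial
# weight (~0.44 in expectation here) to the spurious feature because doing so
# lowers training risk -- sensitivity it cannot shed without raising that risk.
assert abs(w[1]) > 0.3
```

Dropping the spurious weight to zero would strictly increase training risk, which is why this sensitivity is a structural consequence of the objective rather than an artifact of any particular optimizer or architecture.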