Artificial Intelligence · 8 min read · 17 April 2026
INT4 Quantization Fails After FP32 Convergence in Predictable Phases
Post-training quantization assumes converged models are ready to compress, but INT4 quantization collapses in a three-phase pattern tied to weight updates, not learning rate decay.
INT4 quantization fails after FP32 convergence in three phases: improvement, plateau, then explosive divergence caused by post-convergence weight updates.
- Three-phase divergence: rapid learning, a metastable plateau lasting ~70k steps, then explosive growth of the INT4 gap (from 11% to 517%).
- Divergence onset correlates with FP32 perplexity convergence, not with the learning-rate decay schedule.
- INT8 quantization remains robust across all phases; the failure is specific to the coarseness of INT4's 16-level grid.
- Weight-outlier accumulation is ruled out via kurtosis measurement; the mechanism lies elsewhere in the weight-distribution shift.
- An Oscillatory Lock-In schedule reduces the INT4 gap by 2.2 percentage points; SGDR accelerates divergence uniformly.
- The study audits all 154 public Pythia-160m checkpoints with a calibration-free per-group INT4 probe.
- Post-convergence weight updates, not decay magnitude alone, are the proximate cause of quantization collapse.
- Schedule amplitude calibration determines whether perturbation helps or hurts quantization robustness.
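The calibration-free per-group INT4 probe mentioned above can be sketched roughly as follows. This is a minimal illustration, not the study's implementation: it assumes symmetric round-to-nearest quantization with a per-group scale taken from each group's own max-abs weight (hence no calibration data), and a hypothetical group size of 128.

```python
import numpy as np

def quantize_dequantize(w, bits=4, group_size=128):
    """Symmetric per-group round-to-nearest quantization probe.

    Calibration-free: each group's scale comes from its own max-abs
    weight, so no calibration dataset is required. Assumes w.size is
    divisible by group_size.
    """
    levels = 2 ** (bits - 1) - 1                 # 7 for INT4, 127 for INT8
    flat = w.reshape(-1, group_size)
    scale = np.abs(flat).max(axis=1, keepdims=True) / levels
    scale = np.where(scale == 0, 1.0, scale)     # guard all-zero groups
    q = np.clip(np.round(flat / scale), -levels, levels)
    return (q * scale).reshape(w.shape)

# Reconstruction error on synthetic Gaussian weights: INT4's coarse
# grid produces a visibly larger gap than INT8's.
rng = np.random.default_rng(0)
w = rng.normal(size=(256, 256)).astype(np.float32)
err4 = np.abs(w - quantize_dequantize(w, bits=4)).mean()
err8 = np.abs(w - quantize_dequantize(w, bits=8)).mean()
```

Running such a probe against each of the 154 checkpoints and comparing quantized vs. FP32 perplexity is what yields the per-phase INT4 gap; the sketch here only measures weight reconstruction error, which is the simplest proxy.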
Frequently asked questions
- Why does INT4 collapse after convergence while INT8 does not? Post-convergence weight updates shift the weight distribution in ways that exceed the resolution of INT4's 16-level quantization grid. The divergence is not caused by learning-rate decay magnitude alone, but by the specific pattern of weight changes after FP32 perplexity stops improving. INT8's finer grid (256 levels) remains robust, suggesting the failure is tied to INT4's coarseness.
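The outlier-accumulation hypothesis was ruled out via kurtosis measurement. A minimal sketch of that diagnostic, under the assumption that excess kurtosis of the flattened weight tensor is the statistic tracked across checkpoints (near 0 for Gaussian-like weights, large and positive for outlier-heavy tails):

```python
import numpy as np

def excess_kurtosis(w):
    """Excess kurtosis of a weight tensor (~0 for Gaussian weights).

    Outlier accumulation would drive this upward across checkpoints;
    a flat trajectory argues against the outlier hypothesis.
    """
    x = np.asarray(w, dtype=np.float64).ravel()
    x = x - x.mean()
    m2 = np.mean(x ** 2)
    m4 = np.mean(x ** 4)
    return m4 / m2 ** 2 - 3.0

# Sanity check on synthetic data: Gaussian vs. heavy-tailed samples.
rng = np.random.default_rng(0)
gauss = rng.normal(size=100_000)
heavy = rng.standard_t(df=5, size=100_000)  # heavy-tailed stand-in for outlier-rich weights
```

If this statistic stays flat while the INT4 gap explodes, the distribution shift must involve something subtler than growing tails, e.g. movement of mass relative to the fixed quantization grid.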