Artificial Intelligence · 4 min read · 20 April 2026
Neural CTMC decouples discrete diffusion into timing and direction
A new parameterization for discrete diffusion models separates when tokens jump from where they jump, aligning training with the underlying mathematical structure of the process.
Source: arxiv/cs.LG · Jingyuan Li, Xiaoyi Jiang, Fukang Wen, Wei Liu, Renqian Luo, Yi Zhu, Zuoqiang Shi, Pipi Hu
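Neural CTMC splits the reverse process of discrete diffusion into an exit rate and a jump distribution, mirroring how a continuous-time Markov chain (CTMC) is defined by Poisson-governed jump timing and a categorical jump direction.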
- Existing discrete diffusion models treat the reverse rate matrix as a single unit; Neural CTMC splits it into two components.
- An exit-rate network learns when to jump; a jump-distribution network learns where to jump in token space.
- The ELBO training objective factors into a Poisson KL term for timing and a categorical KL term for direction, decoupling their optimization.
- A theoretical proof shows that the conditional surrogate preserves the gradients and minimizers of the marginal reverse-process objective.
- The framework handles masked and GIDD-style noise schedules within the same decomposed structure.
- A uniform forward process with Neural CTMC outperforms mask-based methods on OpenWebText without special masking.
- Pretrained weights are released on Hugging Face for reproducibility and downstream use.
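The split between timing and direction described in the points above can be checked numerically on a toy example. The sketch below is not the paper's training code: it takes the standard per-state KL between two sets of CTMC jump rates and verifies that it equals a Poisson KL on the exit rates plus an exit-rate-weighted categorical KL on the jump distributions. The function names and the example rates are illustrative assumptions, and the paper's actual objective may weight or condition these terms differently.

```python
import math

def decompose(rates):
    """Split the off-diagonal rates out of one state into (exit_rate, jump_probs)."""
    exit_rate = sum(rates)                       # total rate of leaving the state
    jump_probs = [r / exit_rate for r in rates]  # where the jump lands, given that it happens
    return exit_rate, jump_probs

def rate_kl(r, q):
    """Per-state KL between two rate vectors, treated as one monolithic object."""
    return sum(ri * math.log(ri / qi) - ri + qi for ri, qi in zip(r, q))

def poisson_kl(lam, mu):
    """KL(Poisson(lam) || Poisson(mu)): the timing term."""
    return lam * math.log(lam / mu) - lam + mu

def categorical_kl(p, q):
    """KL between two categorical distributions: the direction term."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical rates out of a single state toward three other states.
true_rates = [0.5, 1.0, 1.5]
model_rates = [0.8, 0.6, 1.0]

lam, p = decompose(true_rates)
mu, q = decompose(model_rates)

monolithic = rate_kl(true_rates, model_rates)
factored = poisson_kl(lam, mu) + lam * categorical_kl(p, q)

print(f"monolithic = {monolithic:.6f}, factored = {factored:.6f}")
```

The two printed values agree to floating-point precision, which is what allows the timing and direction errors to be penalized by separate terms.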
Frequently asked questions
- A continuous-time Markov chain (CTMC) models state transitions over continuous time. It is fully determined by two quantities: a Poisson process that governs jump timing (when transitions occur) and a categorical distribution that governs jump direction (which state to jump to). Neural CTMC exploits this structure by training a separate network for each quantity rather than learning a monolithic rate matrix.
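As a rough illustration of the answer above, a CTMC can be simulated from exactly those two ingredients: draw an exponential holding time from the current state's exit rate, then draw the next state from its jump distribution. The sketch below uses a hypothetical three-state chain with fixed, hand-picked rates; in Neural CTMC both quantities would come from learned networks and would vary with diffusion time, which this toy example does not model.

```python
import random

# Hypothetical toy chain: exit rate per state and jump distribution per state.
EXIT_RATE = {0: 1.0, 1: 2.0, 2: 0.5}
JUMP_PROBS = {
    0: {1: 0.7, 2: 0.3},
    1: {0: 0.5, 2: 0.5},
    2: {0: 0.9, 1: 0.1},
}

def simulate_ctmc(x0, t_end, rng):
    """Sample one path on [0, t_end) as a list of (jump_time, new_state) pairs."""
    t, x = 0.0, x0
    path = [(t, x)]
    while True:
        # Timing: the holding time in state x is exponential with rate EXIT_RATE[x].
        t += rng.expovariate(EXIT_RATE[x])
        if t >= t_end:
            return path
        # Direction: the next state is drawn from the categorical jump distribution.
        states = list(JUMP_PROBS[x].keys())
        weights = list(JUMP_PROBS[x].values())
        x = rng.choices(states, weights=weights)[0]
        path.append((t, x))

rng = random.Random(0)  # seeded for repeatable runs
path = simulate_ctmc(x0=0, t_end=5.0, rng=rng)
for time, state in path:
    print(f"t = {time:.3f}  state = {state}")
```

Because timing and direction are sampled in two independent steps, either one can be swapped out without touching the other, which is the same separation the decomposed parameterization relies on.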