Artificial Intelligence · 4 min read · 20 April 2026

Neural CTMC decouples discrete diffusion into timing and direction

A new parameterization for discrete diffusion models separates when tokens jump from where they jump, aligning training with the mathematical structure of continuous-time Markov chains.

Source: arxiv/cs.LG · Jingyuan Li, Xiaoyi Jiang, Fukang Wen, Wei Liu, Renqian Luo, Yi Zhu, Zuoqiang Shi, Pipi Hu

Neural CTMC splits the reverse process of discrete diffusion into an exit rate and a jump distribution, matching the Poisson-process structure that underlies a continuous-time Markov chain.

  • Existing discrete diffusion models treat the reverse rate matrix as a single unit; Neural CTMC splits it into two components.
  • An exit rate network learns when to jump; a jump distribution network learns where to jump in token space.
  • The ELBO training objective factors into a Poisson KL term for timing and a categorical KL term for direction, which decouples the two optimization problems.
  • A theoretical proof shows that the conditional surrogate preserves the gradients and minimizers of the marginal reverse-process objective.
  • The framework handles masked and GIDD-style noise schedules within the same decomposed structure.
  • A uniform forward process with Neural CTMC outperforms mask-based methods on OpenWebText without special masking.
  • Pretrained weights are released on Hugging Face for reproducibility and downstream use.
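
The factorization described in the bullets above can be checked numerically on a toy example. The sketch below is not the paper's objective or code. It assumes a hand-picked set of "true" and "model" jump rates out of a single state and uses a standard identity: the generalized KL divergence between two sets of outgoing rates equals a Poisson KL on the exit rates plus the true exit rate times a categorical KL on the jump distributions. All function names and numbers are hypothetical.

```python
import math

def decompose_rates(rates):
    """Split outgoing jump rates {target: rate} into (exit_rate, jump_distribution)."""
    exit_rate = sum(rates.values())
    jump_dist = {y: r / exit_rate for y, r in rates.items()}
    return exit_rate, jump_dist

def poisson_kl(lam_true, lam_model):
    """KL divergence between Poisson(lam_true) and Poisson(lam_model)."""
    return lam_true * math.log(lam_true / lam_model) - lam_true + lam_model

def categorical_kl(p, q):
    """KL divergence between two categorical distributions on the same targets."""
    return sum(p[y] * math.log(p[y] / q[y]) for y in p if p[y] > 0)

def rate_divergence(true_rates, model_rates):
    """Generalized KL between two sets of outgoing rates, summed over targets."""
    return sum(
        t * math.log(t / model_rates[y]) - t + model_rates[y]
        for y, t in true_rates.items()
    )

# Hypothetical outgoing rates from one state to targets "b", "c", "d".
true_rates = {"b": 0.6, "c": 0.3, "d": 0.1}
model_rates = {"b": 0.5, "c": 0.5, "d": 1.0}

lam_true, p_true = decompose_rates(true_rates)
lam_model, p_model = decompose_rates(model_rates)

# Monolithic view: compare the rate rows entry by entry.
joint = rate_divergence(true_rates, model_rates)
# Decomposed view: timing term plus rate-weighted direction term.
split = poisson_kl(lam_true, lam_model) + lam_true * categorical_kl(p_true, p_model)

assert math.isclose(joint, split, rel_tol=1e-9)
```

In the decomposed form the timing term depends only on the two exit rates and the direction term depends only on the two jump distributions, which is the sense in which the two parts can be optimized separately.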

Frequently asked questions

  • A continuous-time Markov chain (CTMC) models state transitions over continuous time. It is fully determined by two quantities: a Poisson process that governs jump timing (when transitions occur) and a categorical distribution that governs jump direction (which state to jump to). Neural CTMC exploits this structure by training a separate network for each quantity rather than learning a monolithic rate matrix.
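
The timing and direction view of a CTMC described above can be sketched as a small simulation. This is a generic textbook-style sampler, not the paper's sampler: it assumes fixed, hand-written exit rates and jump distributions in place of neural networks, and the three-state chain and all names are hypothetical. The holding time in each state is drawn from an exponential distribution with that state's exit rate, and the next state is then drawn from that state's categorical jump distribution.

```python
import random

def simulate_ctmc(exit_rate, jump_dist, start, t_end, rng):
    """Simulate one CTMC path on [0, t_end).

    exit_rate: {state: rate of leaving that state}   (timing)
    jump_dist: {state: {target: probability}}        (direction)
    Returns a list of (jump_time, new_state) pairs, starting with (0.0, start).
    """
    t, state = 0.0, start
    path = [(t, state)]
    while True:
        lam = exit_rate[state]
        if lam <= 0.0:
            break  # absorbing state: no further jumps
        # Timing: exponential holding time with rate lam.
        t += rng.expovariate(lam)
        if t >= t_end:
            break
        # Direction: categorical draw over the possible targets.
        targets = list(jump_dist[state])
        weights = [jump_dist[state][y] for y in targets]
        state = rng.choices(targets, weights=weights, k=1)[0]
        path.append((t, state))
    return path

# Hypothetical three-state chain; "C" is absorbing.
exit_rate = {"A": 2.0, "B": 1.0, "C": 0.0}
jump_dist = {
    "A": {"B": 0.7, "C": 0.3},
    "B": {"A": 0.5, "C": 0.5},
    "C": {},
}

path = simulate_ctmc(exit_rate, jump_dist, "A", 5.0, random.Random(0))
for time, state in path:
    print(f"t={time:.3f} -> {state}")
```

Because the two dictionaries are consulted independently, either one could be swapped for a learned model without touching the other, which mirrors the separation into two networks.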
