Artificial Intelligence · 5 min read · 25 April 2026

Frequency-Forcing: Guiding Image Generation via Soft Auxiliary Streams

A new approach to flow-matching models uses lightweight learnable wavelets to guide pixel generation toward coarse structure first, improving image synthesis without hard constraints.

Source: arxiv/cs.AI · Weitao Du

Frequency-Forcing guides image generation through soft auxiliary low-frequency streams instead of hard frequency constraints, improving synthesis quality.

  • Flow-matching models benefit from generating coarse structure before fine detail, mimicking natural image formation.
  • K-Flow enforces frequency ordering by reinterpreting frequency scaling as time; Latent Forcing uses semantic auxiliary flows.
  • Frequency-Forcing combines both paradigms: soft guidance via an auxiliary low-frequency stream that matures earlier.
  • The self-forcing signal is derived from learnable wavelet packet transforms applied to the data itself, avoiding external pretrained encoders.
  • On ImageNet-256, the method outperforms pixel-space and latent-space baselines, and it composes with semantic streams for further gains.
  • Forcing-based ordering preserves the core flow coordinate system, offering modularity over hard constraint rewrites.
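
To make the idea of a low-frequency auxiliary signal concrete, here is a minimal sketch of extracting a coarse band from an image. It is an illustration only: the paper describes learnable wavelet packet transforms, whereas this example swaps in a fixed orthonormal Haar low-pass filter, and the function name `haar_lowpass` is hypothetical.

```python
import numpy as np


def haar_lowpass(img: np.ndarray, levels: int = 1) -> np.ndarray:
    """Return the Haar approximation (LL) band of an image.

    Assumption: a fixed Haar filter stands in for the learnable
    wavelet packet transform mentioned in the article.
    Works on (H, W) or (H, W, C) arrays.
    """
    out = img.astype(np.float64)
    for _ in range(levels):
        h, w = out.shape[:2]
        if h % 2 or w % 2:
            raise ValueError("height and width must be even at every level")
        # Orthonormal 2-D Haar LL coefficient: sum of each 2x2 block, divided by 2.
        out = (
            out[0::2, 0::2]
            + out[0::2, 1::2]
            + out[1::2, 0::2]
            + out[1::2, 1::2]
        ) / 2.0
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.standard_normal((8, 8, 3))
    coarse = haar_lowpass(image, levels=2)
    print(coarse.shape)  # spatial size shrinks by 2 per level
```

Each level halves the spatial resolution and keeps only the coarse structure, which is the kind of signal an auxiliary low-frequency stream would carry. A learnable transform would replace the fixed 2x2 averaging weights with trained filters.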

Frequently asked questions

  • K-Flow imposes a hard frequency constraint by reinterpreting frequency scaling as flow time and operating in transformed amplitude space. Frequency-Forcing achieves the same frequency-ordered generation through soft guidance: an auxiliary low-frequency stream matures earlier and guides the main pixel flow without rewriting the core flow coordinate system. This makes Frequency-Forcing more modular and composable.
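
The "matures earlier" behaviour described above can be sketched as a pair of interpolation paths that share one clock. This is a rough illustration under stated assumptions, not the paper's formulation: the linear noise-to-data path is the standard flow-matching interpolant, while the `aux_time` schedule, the `lead` parameter, and the function names are hypothetical choices made for this example.

```python
import numpy as np


def aux_time(t: float, lead: float = 0.5) -> float:
    """Hypothetical schedule: the auxiliary stream finishes when t == lead."""
    if not 0.0 < lead <= 1.0:
        raise ValueError("lead must be in (0, 1]")
    return min(1.0, t / lead)


def interpolate(noise: np.ndarray, data: np.ndarray, t: float) -> np.ndarray:
    """Standard linear flow-matching path from noise (t=0) to data (t=1)."""
    return (1.0 - t) * noise + t * data


def paired_state(x_noise, x_data, low_noise, low_data, t, lead=0.5):
    """Return (pixel state, auxiliary low-frequency state) at flow time t.

    The pixel path keeps its usual time coordinate; only the auxiliary
    stream runs on the accelerated clock, so the guidance is soft.
    """
    x_t = interpolate(x_noise, x_data, t)
    low_t = interpolate(low_noise, low_data, aux_time(t, lead))
    return x_t, low_t


if __name__ == "__main__":
    x0, x1 = np.zeros(4), np.ones(4)
    l0, l1 = np.zeros(2), 2.0 * np.ones(2)
    for t in (0.25, 0.5, 0.9):
        x_t, low_t = paired_state(x0, x1, l0, l1, t)
        print(t, x_t[0], low_t[0])
```

With `lead=0.5`, the auxiliary stream is already clean data by the midpoint of the flow, while the pixel stream is still half noise. A model that sees both states can lean on the finished coarse structure when predicting the remaining pixel detail, and the pixel path itself is left untouched.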
