arXiv Open Access 2025

AutoRefiner: Improving Autoregressive Video Diffusion Models via Reflective Refinement Over the Stochastic Sampling Path

Zhengyang Yu, Akio Hayakawa, Masato Ishii, Qingtao Yu, Takashi Shibuya, +2 others

Abstract

Autoregressive video diffusion models (AR-VDMs) show strong promise as scalable alternatives to bidirectional VDMs, enabling real-time and interactive applications. Yet there remains room for improvement in their sample fidelity. A promising solution is inference-time alignment, which optimizes the noise space to improve sample fidelity without updating model parameters. However, optimization- or search-based methods are computationally impractical for AR-VDMs. Recent text-to-image (T2I) works address this via feedforward noise refiners that modulate sampled noise in a single forward pass. Can such noise refiners be extended to AR-VDMs? We identify the failure of naively extending T2I noise refiners to AR-VDMs and propose AutoRefiner, a noise refiner tailored for AR-VDMs with two key designs: pathwise noise refinement and a reflective KV-cache. Experiments demonstrate that AutoRefiner serves as an efficient plug-in for AR-VDMs, effectively enhancing sample fidelity by refining noise along stochastic denoising paths.

Topics & Keywords

cs.CV

Authors (7)

Zhengyang Yu

Akio Hayakawa

Masato Ishii

Qingtao Yu

Takashi Shibuya

Jing Zhang

Yuki Mitsufuji

Citation Format (APA)

Yu, Z., Hayakawa, A., Ishii, M., Yu, Q., Shibuya, T., Zhang, J., & Mitsufuji, Y. (2025). AutoRefiner: Improving Autoregressive Video Diffusion Models via Reflective Refinement Over the Stochastic Sampling Path. https://arxiv.org/abs/2512.11203

Journal Information

Publication Year: 2025
Language: en
Source Database: arXiv
Access: Open Access ✓