arXiv Open Access 2025

AutoRefiner: Improving Autoregressive Video Diffusion Models via Reflective Refinement Over the Stochastic Sampling Path

Zhengyang Yu, Akio Hayakawa, Masato Ishii, Qingtao Yu, Takashi Shibuya, +2 others

Abstract

Autoregressive video diffusion models (AR-VDMs) show strong promise as scalable alternatives to bidirectional VDMs, enabling real-time and interactive applications. Yet there remains room for improvement in their sample fidelity. A promising solution is inference-time alignment, which optimizes the noise space to improve sample fidelity without updating model parameters. However, optimization- or search-based methods are computationally impractical for AR-VDMs. Recent text-to-image (T2I) works address this via feedforward noise refiners that modulate sampled noise in a single forward pass. Can such noise refiners be extended to AR-VDMs? We identify the failure of naively extending T2I noise refiners to AR-VDMs and propose AutoRefiner, a noise refiner tailored for AR-VDMs, with two key designs: pathwise noise refinement and a reflective KV-cache. Experiments demonstrate that AutoRefiner serves as an efficient plug-in for AR-VDMs, effectively enhancing sample fidelity by refining noise along stochastic denoising paths.

Authors (7)

Zhengyang Yu

Akio Hayakawa

Masato Ishii

Qingtao Yu

Takashi Shibuya

Jing Zhang

Yuki Mitsufuji

Citation Format

Yu, Z., Hayakawa, A., Ishii, M., Yu, Q., Shibuya, T., Zhang, J. et al. (2025). AutoRefiner: Improving Autoregressive Video Diffusion Models via Reflective Refinement Over the Stochastic Sampling Path. https://arxiv.org/abs/2512.11203

Journal Information
Publication Year
2025
Language
en
Database Source
arXiv
Access
Open Access ✓