arXiv Open Access 2025

Dress&Dance: Dress up and Dance as You Like It - Technical Preview

Jun-Kun Chen, Aayush Bansal, Minh Phuoc Vo, Yu-Xiong Wang

Abstract

We present Dress&Dance, a video diffusion framework that generates high-quality, 5-second, 24 FPS virtual try-on videos at 1152x720 resolution, showing a user wearing desired garments while moving in accordance with a given reference video. Our approach requires a single user image and supports a range of tops, bottoms, and one-piece garments, as well as simultaneous try-on of tops and bottoms in a single pass. Key to our framework is CondNet, a novel conditioning network that leverages attention to unify multi-modal inputs (text, images, and videos), thereby enhancing garment registration and motion fidelity. CondNet is trained on heterogeneous data, combining limited video data with a larger, more readily available image dataset, in a multistage progressive manner. Dress&Dance outperforms existing open-source and commercial solutions and enables a high-quality, flexible try-on experience.

Topics & Keywords

cs.CV cs.LG

Authors (4)

Jun-Kun Chen

Aayush Bansal

Minh Phuoc Vo

Yu-Xiong Wang

Citation Format

Chen, J.-K., Bansal, A., Vo, M. P., & Wang, Y.-X. (2025). Dress&Dance: Dress up and Dance as You Like It - Technical Preview. https://arxiv.org/abs/2508.21070

Journal Information

Publication Year: 2025
Language: en
Source Database: arXiv
Access: Open Access ✓