arXiv Open Access 2025

A Self-supervised Motion Representation for Portrait Video Generation

Qiyuan Zhang, Chenyu Wu, Wenzhang Sun, Huaize Liu, Donglin Di, +2 others

Abstract

Recent advancements in portrait video generation have been noteworthy. However, existing methods rely heavily on human priors and pre-trained generative models. Motion representations based on human priors may introduce unrealistic motion, while methods relying on pre-trained generative models often suffer from inefficient inference. To address these challenges, we propose Semantic Latent Motion (SeMo), a compact and expressive motion representation. Leveraging this representation, our approach achieves both high-quality visual results and efficient inference. SeMo follows an effective three-step framework: Abstraction, Reasoning, and Generation. First, in the Abstraction step, we use a carefully designed Masked Motion Encoder, which leverages a self-supervised learning paradigm to compress the subject's motion state into a compact and abstract latent motion (1D token). Second, in the Reasoning step, we efficiently generate motion sequences based on the driving audio signal. Finally, in the Generation step, the motion dynamics serve as conditional information to guide the motion decoder in synthesizing realistic transitions from the reference frame to the target video. Thanks to the compact and expressive nature of Semantic Latent Motion, our method achieves efficient motion representation and high-quality video generation. User studies demonstrate that our approach surpasses state-of-the-art models with an 81% win rate in realism. Extensive experiments further highlight its strong compression capability, reconstruction quality, and generative potential.

Authors (7)

Qiyuan Zhang
Chenyu Wu
Wenzhang Sun
Huaize Liu
Donglin Di
Wei Chen
Changqing Zou

Citation Format

Zhang, Q., Wu, C., Sun, W., Liu, H., Di, D., Chen, W. et al. (2025). A Self-supervised Motion Representation for Portrait Video Generation. https://arxiv.org/abs/2503.10096

Journal Information
Publication Year: 2025
Language: en
Source Database: arXiv
Access: Open Access