arXiv Open Access 2024

D$^2$-World: An Efficient World Model through Decoupled Dynamic Flow

Haiming Zhang, Xu Yan, Ying Xue, Zixuan Guo, Shuguang Cui, +2 others

Abstract

This technical report summarizes the second-place solution for the Predictive World Model Challenge held at the CVPR-2024 Workshop on Foundation Models for Autonomous Systems. We introduce D$^2$-World, a novel World model that effectively forecasts future point clouds through Decoupled Dynamic flow. Specifically, the past semantic occupancies are obtained via existing occupancy networks (e.g., BEVDet). Following this, the occupancy results serve as the input for a single-stage world model, generating future occupancy in a non-autoregressive manner. To further simplify the task, dynamic voxel decoupling is performed in the world model. The model generates future dynamic voxels by warping the existing observations through voxel flow, while remaining static voxels can be easily obtained through pose transformation. As a result, our approach achieves state-of-the-art performance on the OpenScene Predictive World Model benchmark, securing second place, and trains more than 300% faster than the baseline model. Code is available at https://github.com/zhanghm1995/D2-World.
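The decoupling described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the representation of occupied voxels as an (N, 3) array of voxel centers, and the assumption that the predicted flow is already expressed as a displacement into the future ego frame are all hypothetical choices made for illustration. Static voxels are moved only by the ego-pose transform, while dynamic voxels are moved only by their per-voxel flow.

```python
import numpy as np


def transform_static(points: np.ndarray, ego_transform: np.ndarray) -> np.ndarray:
    """Apply a 4x4 rigid transform (current ego frame -> future ego frame)
    to an (N, 3) array of static voxel centers."""
    ones = np.ones((points.shape[0], 1))
    homogeneous = np.hstack([points, ones])          # (N, 4)
    return (homogeneous @ ego_transform.T)[:, :3]    # back to (N, 3)


def warp_dynamic(points: np.ndarray, flow: np.ndarray) -> np.ndarray:
    """Displace dynamic voxel centers by their per-voxel flow.
    Assumption: the flow already maps into the future ego frame."""
    return points + flow


def forecast_voxels(points, is_dynamic, flow, ego_transform):
    """Hypothetical one-step forecast: static voxels by pose transform,
    dynamic voxels by flow warping. Returns static voxels first, then dynamic."""
    static_future = transform_static(points[~is_dynamic], ego_transform)
    dynamic_future = warp_dynamic(points[is_dynamic], flow[is_dynamic])
    return np.vstack([static_future, dynamic_future])
```

For example, with an ego transform that translates by -1 m along x (the ego vehicle drives 1 m forward), a static voxel at x = 1 m lands at the origin of the future frame, while a dynamic voxel is shifted only by whatever flow is predicted for it. In a real pipeline the warped centers would still need to be re-quantized into the voxel grid, which this sketch leaves out.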

Authors (7)

Haiming Zhang

Xu Yan

Ying Xue

Zixuan Guo

Shuguang Cui

Zhen Li

Bingbing Liu

Citation Format

Zhang, H., Yan, X., Xue, Y., Guo, Z., Cui, S., Li, Z. et al. (2024). D$^2$-World: An Efficient World Model through Decoupled Dynamic Flow. https://arxiv.org/abs/2411.17027

Journal Information
Publication year: 2024
Language: en
Source database: arXiv
Access: Open Access ✓