arXiv Open Access 2025

AstraNav-World: World Model for Foresight Control and Consistency

Jintao Chen, Junjun Hu, Haochen Bai, Minghua Luo, Xinda Xue, and 9 others
Abstract

Embodied navigation in open, dynamic environments demands accurate foresight of how the world will evolve and how actions will unfold over time. We propose AstraNav-World, an end-to-end world model that jointly reasons about future visual states and action sequences within a unified probabilistic framework. Our framework integrates a diffusion-based video generator with a vision-language policy, enabling synchronized rollouts in which predicted scenes and planned actions are updated simultaneously. Training optimizes two complementary objectives: generating action-conditioned multi-step visual predictions and deriving trajectories conditioned on those predicted visuals. This bidirectional constraint makes visual predictions executable and keeps decisions grounded in physically consistent, task-relevant futures, mitigating the cumulative errors common in decoupled "envision-then-plan" pipelines. Experiments across diverse embodied navigation benchmarks show improved trajectory accuracy and higher success rates. Ablations confirm the necessity of tight vision-action coupling and unified training: removing either branch degrades both prediction quality and policy reliability. In real-world testing, AstraNav-World demonstrated exceptional zero-shot capabilities, adapting to previously unseen scenarios without any real-world fine-tuning. These results suggest that AstraNav-World captures transferable spatial understanding and planning-relevant navigation dynamics, rather than merely overfitting to simulation-specific data distributions. Overall, by unifying foresight vision and control within a single generative model, we move closer to reliable, interpretable, and general-purpose embodied agents that operate robustly in open-ended real-world settings.
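
The two complementary training objectives named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the 1-D "frames", the linear gains, the function names (`rollout_frames`, `plan_actions`, `joint_loss`), and the `action_weight` term are all assumptions made for illustration, standing in for the diffusion-based video generator and the vision-language policy. The sketch only shows the coupling pattern: future states are predicted from actions, actions are then derived from those predicted states, and both errors are summed into one loss.

```python
from typing import List, Sequence


def mse(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared error between two equal-length sequences."""
    assert len(a) == len(b) and len(a) > 0
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def rollout_frames(start: float, actions: Sequence[float], step_gain: float) -> List[float]:
    """Toy 'visual' branch (hypothetical): predict future 1-D states from actions."""
    frames, state = [], start
    for action in actions:
        state = state + step_gain * action
        frames.append(state)
    return frames


def plan_actions(start: float, frames: Sequence[float], policy_gain: float) -> List[float]:
    """Toy 'action' branch (hypothetical): derive actions from predicted states."""
    actions, prev = [], start
    for frame in frames:
        actions.append(policy_gain * (frame - prev))
        prev = frame
    return actions


def joint_loss(
    start: float,
    expert_actions: Sequence[float],
    true_frames: Sequence[float],
    step_gain: float,
    policy_gain: float,
    action_weight: float = 1.0,  # assumed weighting, not taken from the paper
) -> float:
    """Sum of a visual-prediction term and an action term computed on predicted frames."""
    pred_frames = rollout_frames(start, expert_actions, step_gain)
    pred_actions = plan_actions(start, pred_frames, policy_gain)
    visual_term = mse(pred_frames, true_frames)
    action_term = mse(pred_actions, expert_actions)
    return visual_term + action_weight * action_term


if __name__ == "__main__":
    expert = [1.0, 1.0, -0.5]
    truth = [1.0, 2.0, 1.5]
    print(joint_loss(0.0, expert, truth, step_gain=1.0, policy_gain=1.0))
    print(joint_loss(0.0, expert, truth, step_gain=0.5, policy_gain=1.0))
```

In this toy setting, a mis-specified visual branch (`step_gain=0.5`) raises both terms at once, because the action branch consumes the predicted frames. That is the kind of coupling the abstract contrasts with decoupled "envision-then-plan" pipelines.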

Authors (14)

Jintao Chen
Junjun Hu
Haochen Bai
Minghua Luo
Xinda Xue
Botao Ren
Chengyu Bai
Shichao Xie
Ziyi Chen
Fei Liu
Zedong Chu
Xiaolong Wu
Mu Xu
Shanghang Zhang

Citation Format

Chen, J., Hu, J., Bai, H., Luo, M., Xue, X., Ren, B. et al. (2025). AstraNav-World: World Model for Foresight Control and Consistency. https://arxiv.org/abs/2512.21714

Journal Information
Publication Year: 2025
Language: en
Source Database: arXiv
Access: Open Access