RoboBrain 2.5: Depth in Sight, Time in Mind
Abstract
We introduce RoboBrain 2.5, a next-generation embodied AI foundation model that advances general perception, spatial reasoning, and temporal modeling through extensive training on high-quality spatiotemporal supervision. Building upon its predecessor, RoboBrain 2.5 introduces two major capability upgrades. Specifically, it unlocks Precise 3D Spatial Reasoning by shifting from 2D pixel-relative grounding to depth-aware coordinate prediction and absolute metric constraint comprehension, generating complete 3D manipulation traces as ordered keypoint sequences under physical constraints. Complementing this spatial precision, the model establishes Dense Temporal Value Estimation, which provides step-aware progress prediction and execution state understanding across varying viewpoints, producing stable feedback signals for downstream learning. Together, these upgrades extend the framework toward more physically grounded and execution-aware embodied intelligence for complex, fine-grained manipulation. The code and checkpoints are available at the project website: https://superrobobrain.github.io
Authors (35)
Huajie Tan
Enshen Zhou
Zhiyu Li
Yijie Xu
Yuheng Ji
Xiansheng Chen
Cheng Chi
Pengwei Wang
Huizhu Jia
Yulong Ao
Mingyu Cao
Sixiang Chen
Zhe Li
Mengzhen Liu
Zixiao Wang
Shanyu Rong
Yaoxu Lyu
Zhongxia Zhao
Peterson Co
Yibo Li
Yi Han
Shaoxuan Xie
Guocai Yao
Songjing Wang
Leiduo Zhang
Xi Yang
Yance Jiao
Donghai Shi
Kunchang Xie
Shaokai Nie
Chunlei Men
Yonghua Lin
Zhongyuan Wang
Tiejun Huang
Shanghang Zhang
Quick Access
- Publication Year: 2026
- Language: en
- Source Database: arXiv
- Access: Open Access ✓