Semantic Scholar Open Access 2023 14 citations

3D Video Object Detection with Learnable Object-Centric Global Optimization

Jiawei He Yuntao Chen Naiyan Wang Zhaoxiang Zhang

Abstract

We explore long-term temporal visual correspondence-based optimization for 3D video object detection in this work. Visual correspondence refers to one-to-one mappings for pixels across multiple images. Correspondence-based optimization is the cornerstone for 3D scene reconstruction but is less studied in 3D video object detection, because moving objects violate multi-view geometry constraints and are treated as outliers during scene reconstruction. We address this issue by treating objects as first-class citizens during correspondence-based optimization. In this work, we propose BA-Det, an end-to-end optimizable object detector with object-centric temporal correspondence learning and featuremetric object bundle adjustment. Empirically, we verify the effectiveness and efficiency of BA-Det for multiple baseline 3D detectors under various setups. Our BA-Det achieves SOTA performance on the large-scale Waymo Open Dataset (WOD) with only marginal computation cost. Our code is available at https://github.com/jiaweihe1996/BA-Det.

Topics & Keywords

Computer Science

Authors (4)

Jiawei He

Yuntao Chen

Naiyan Wang

Zhaoxiang Zhang

Citation Format

He, J., Chen, Y., Wang, N., & Zhang, Z. (2023). 3D Video Object Detection with Learnable Object-Centric Global Optimization. https://doi.org/10.1109/CVPR52729.2023.00494

Journal Information

Publication Year: 2023
Language: en
Total Citations: 14
Database Source: Semantic Scholar
DOI: 10.1109/CVPR52729.2023.00494
Access: Open Access ✓