Semantic Scholar Open Access 2024 24 citations

PPF-Det: Point-Pixel Fusion for Multi-Modal 3D Object Detection

Guotao Xie Zhiyuan Chen Ming Gao Manjiang Hu Xiaohui Qin

Abstract

Multi-modal fusion can take advantage of both LiDAR and camera data to boost the robustness and performance of 3D object detection. However, great challenges remain in comprehensively exploiting image information and in performing accurate interaction and fusion of diverse features. In this paper, we propose a novel multi-modal framework, namely Point-Pixel Fusion for Multi-Modal 3D Object Detection (PPF-Det). PPF-Det consists of three submodules that address these problems: Multi Pixel Perception (MPP), Shared Combined Point Feature Encoder (SCPFE), and Point-Voxel-Wise Triple Attention Fusion (PVW-TAF). First, MPP makes full use of image semantic information to mitigate the resolution mismatch between the point cloud and the image. In addition, we propose SCPFE to preliminarily extract point cloud features and point-pixel features simultaneously, reducing time consumption in 3D space. Lastly, we propose PVW-TAF, a fine alignment fusion strategy that generates multi-level voxel-fused features based on an attention mechanism. Extensive experiments on KITTI benchmarks, conducted on September 24, 2023, demonstrate that our method shows excellent performance.
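
The abstract does not spell out how points and pixels are associated, so the following is only a minimal sketch of the generic idea behind point-pixel fusion, not the authors' MPP, SCPFE, or PVW-TAF modules. It assumes the points are already expressed in the camera coordinate frame, that a 3x4 projection matrix is available, and that a nearest-pixel lookup followed by simple concatenation stands in for the fusion step. The function name `fuse_point_pixel` and all parameter names are hypothetical.

```python
import numpy as np

def fuse_point_pixel(points, point_feats, image_feats, proj):
    """Attach the nearest image feature to each 3D point (illustrative only).

    points:      (N, 3) coordinates, assumed to be in the camera frame
    point_feats: (N, C_p) per-point features
    image_feats: (H, W, C_i) per-pixel feature map
    proj:        (3, 4) camera projection matrix (assumed given)
    Returns the fused features (M, C_p + C_i) for the M points that land
    inside the image, plus a boolean mask of shape (N,) marking those points.
    """
    n = points.shape[0]
    homo = np.hstack([points, np.ones((n, 1))])   # homogeneous coordinates (N, 4)
    uvw = homo @ proj.T                           # projected coordinates (N, 3)
    depth = uvw[:, 2]
    in_front = depth > 1e-6                       # discard points behind the camera
    safe_depth = np.where(in_front, depth, 1.0)   # avoid division by zero
    u = np.round(uvw[:, 0] / safe_depth).astype(int)  # pixel column
    v = np.round(uvw[:, 1] / safe_depth).astype(int)  # pixel row
    h, w = image_feats.shape[:2]
    valid = in_front & (u >= 0) & (u < w) & (v >= 0) & (v < h)
    pixel_feats = image_feats[v[valid], u[valid]]     # nearest-pixel lookup
    fused = np.concatenate([point_feats[valid], pixel_feats], axis=1)
    return fused, valid
```

Because a LiDAR sweep is far sparser than an image, this one-pixel-per-point lookup uses only a small fraction of the available pixels, which illustrates the resolution mismatch the abstract refers to.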

Topics & Keywords

Computer Science

Authors (5)

Guotao Xie

Zhiyuan Chen

Ming Gao

Manjiang Hu

Xiaohui Qin

Citation Format

Xie, G., Chen, Z., Gao, M., Hu, M., & Qin, X. (2024). PPF-Det: Point-Pixel Fusion for Multi-Modal 3D Object Detection. https://doi.org/10.1109/TITS.2023.3347078

Quick Access

PDF not directly available

Check the original source →

View at Source doi.org/10.1109/TITS.2023.3347078

Journal Information

Publication Year: 2024
Language: en
Total Citations: 24
Source Database: Semantic Scholar
DOI: 10.1109/TITS.2023.3347078
Access: Open Access ✓