DOAJ Open Access 2024

MFSA-Net: Semantic Segmentation With Camera-LiDAR Cross-Attention Fusion Based on Fast Neighbor Feature Aggregation

Yijian Duan, Liwen Meng, Yanmei Meng, Jihong Zhu, Jiacheng Zhang, +2 others

Abstract

Given the inherent limitations of camera-only and LiDAR-only methods for semantic segmentation in large-scale complex environments, multimodal information fusion has become a focal point of contemporary research. However, significant disparities between modalities often leave existing fusion-based methods with low segmentation accuracy and limited efficiency in such environments. To address these challenges, we propose a semantic segmentation network with camera–LiDAR cross-attention fusion based on fast neighbor feature aggregation (MFSA-Net), which is better suited to large-scale semantic segmentation in complex environments. First, we propose a dual-distance attention feature aggregation module based on rapid 3-D nearest neighbor search. This module applies a sliding window to the perspective projection of the point cloud for fast proximity search, and efficiently combines feature distance and Euclidean distance information to learn more distinctive local features. This improves segmentation accuracy while preserving computational efficiency. Furthermore, we propose a residual-based two-stream network with cross-attention fusion, which integrates camera information into the LiDAR data stream more effectively, enhancing both accuracy and robustness. Extensive experiments on the large-scale point cloud datasets SemanticKITTI and nuScenes demonstrate that the proposed algorithm outperforms comparable algorithms in semantic segmentation of large-scale complex environments.
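
The sliding-window neighbor search and the dual-distance weighting described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The abstract gives no formulas, so everything below is an assumption made for illustration: the point cloud is taken to be already projected into an organized H x W image (`xyz_img`, with a `valid` mask for empty pixels), the window half-width `half` and neighbor count `k` are hypothetical parameters, and the softmax over the negated sum of the two distances is a stand-in for whatever attention the paper actually learns.

```python
import numpy as np

def window_neighbors(xyz_img, valid, row, col, k=4, half=1):
    """Up to k nearest valid neighbors of pixel (row, col), searched only
    inside a (2*half+1) x (2*half+1) window of the projected image.
    Returns a list of (euclidean_distance, r, c), nearest first."""
    H, W, _ = xyz_img.shape
    center = xyz_img[row, col]
    candidates = []
    for dr in range(-half, half + 1):
        r = row + dr
        if r < 0 or r >= H:          # no wrap-around in elevation
            continue
        for dc in range(-half, half + 1):
            c = (col + dc) % W       # assumed: azimuth wraps (needs W > 2*half)
            if (dr == 0 and dc == 0) or not valid[r, c]:
                continue
            d = float(np.linalg.norm(xyz_img[r, c] - center))
            candidates.append((d, r, c))
    candidates.sort()                # sort by 3-D distance, then by index
    return candidates[:k]

def dual_distance_aggregate(feat_img, xyz_img, valid, row, col, k=4, half=1):
    """Aggregate neighbor features with weights that depend on both the
    Euclidean distance and the feature distance (illustrative weighting)."""
    nbrs = window_neighbors(xyz_img, valid, row, col, k, half)
    if not nbrs:                     # isolated point: keep its own feature
        return feat_img[row, col].copy()
    d_euc = np.array([d for d, _, _ in nbrs])
    f_nb = np.stack([feat_img[r, c] for _, r, c in nbrs])
    d_feat = np.linalg.norm(f_nb - feat_img[row, col], axis=1)
    logits = -(d_euc + d_feat)       # hypothetical: closer in both -> larger weight
    w = np.exp(logits - logits.max())
    w /= w.sum()                     # softmax over the k neighbors
    return w @ f_nb
```

Because the search only visits a fixed window of pixels per point, its cost is independent of the total number of points, which is the efficiency argument the abstract makes for searching in the projection instead of running a full 3-D nearest-neighbor query.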

Authors (7)

Yijian Duan
Liwen Meng
Yanmei Meng
Jihong Zhu
Jiacheng Zhang
Jinlai Zhang
Xin Liu

Citation Format

Duan, Y., Meng, L., Meng, Y., Zhu, J., Zhang, J., Zhang, J. et al. (2024). MFSA-Net: Semantic Segmentation With Camera-LiDAR Cross-Attention Fusion Based on Fast Neighbor Feature Aggregation. https://doi.org/10.1109/JSTARS.2024.3472751

Journal Information
Publication Year: 2024
Source Database: DOAJ
DOI: 10.1109/JSTARS.2024.3472751
Access: Open Access ✓