DOAJ Open Access 2025

Multi-Level Contextual and Semantic Information Aggregation Network for Small Object Detection in UAV Aerial Images

Zhe Liu Guiqing He Yang Hu

Abstract

In recent years, generic object detection methods have made significant progress. However, because aerial images contain large numbers of small objects, mainstream detectors struggle to achieve satisfactory detection performance. The challenges of small object detection in aerial images are primarily twofold: (1) Insufficient feature representation: the limited visual information available for small objects makes it difficult for models to learn discriminative feature representations. (2) Background confusion: abundant background information introduces noise and interference, so the features of small objects are easily confused with the background. To address these issues, we propose a Multi-Level Contextual and Semantic Information Aggregation Network (MCSA-Net). MCSA-Net includes three key components: a Spatial-Aware Feature Selection Module (SAFM), a Multi-Level Joint Feature Pyramid Network (MJFPN), and an Attention-Enhanced Head (AEHead). The SAFM employs a sequence of dilated convolutions to extract multi-scale local context features and uses a spatial selection mechanism to merge them adaptively, capturing the critical local context each object requires and enriching the feature representation of small objects. The MJFPN introduces multi-level connections and weighted fusion to fully exploit the spatial detail features of small objects during feature fusion, and further enhances the fused features through a feature aggregation network. Finally, the AEHead is constructed by incorporating a sparse attention mechanism into the detection head. The sparse attention mechanism efficiently models long-range dependencies by computing attention between the most relevant regions in the image while suppressing background interference, thereby enhancing the model’s ability to perceive targets and improving detection performance.
Extensive experiments on four datasets, VisDrone, UAVDT, MS COCO, and DOTA, demonstrate that the proposed MCSA-Net achieves an excellent detection performance, particularly in small object detection, surpassing several state-of-the-art methods.
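
The abstract does not say how the MJFPN weights its inputs during fusion. As a rough illustration only, the sketch below assumes a normalized weighted sum of same-shaped feature maps, in the style of the fast normalized fusion used by BiFPN. The function name `weighted_fusion`, the `eps` parameter, and the nested-list feature maps are hypothetical choices made for this example and are not taken from the paper.

```python
def weighted_fusion(features, weights, eps=1e-4):
    """Fuse same-shaped 2D feature maps with normalized, non-negative weights.

    Assumption: this mirrors BiFPN-style fast normalized fusion; the weighting
    scheme actually used by MJFPN is not specified in the abstract.
    """
    if len(features) != len(weights):
        raise ValueError("need exactly one weight per feature map")

    # Clip negative weights to zero so each input contributes non-negatively.
    clipped = [max(w, 0.0) for w in weights]
    # eps keeps the division stable when all weights are close to zero.
    total = sum(clipped) + eps

    rows, cols = len(features[0]), len(features[0][0])
    fused = [[0.0] * cols for _ in range(rows)]
    for feat, w in zip(features, clipped):
        scale = w / total
        for r in range(rows):
            for c in range(cols):
                fused[r][c] += scale * feat[r][c]
    return fused


# Hypothetical inputs: a fine (shallow) map and a coarse (deep) map that has
# already been resized to the same resolution.
shallow = [[1.0, 1.0], [1.0, 1.0]]
deep = [[3.0, 3.0], [3.0, 3.0]]
fused_map = weighted_fusion([shallow, deep], weights=[1.0, 1.0], eps=0.0)
```

In a trained network the weights would be learnable parameters, so the model could learn to favor the high-resolution levels that carry the spatial detail of small objects; here they are fixed numbers for clarity.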

Topics & Keywords

Motor vehicles. Aeronautics. Astronautics

Citation

Liu, Z., He, G., & Hu, Y. (2025). Multi-Level Contextual and Semantic Information Aggregation Network for Small Object Detection in UAV Aerial Images. https://doi.org/10.3390/drones9090610

Journal Information

Publication Year: 2025
Source Database: DOAJ
DOI: 10.3390/drones9090610
Access: Open Access