AeroYOLO: Efficient Multiscale and Attention-Augmented YOLOv8s for Robust Aerial Object Detection
Abstract
Aerial object detection suffers from scale and spatial imbalance, which significantly reduces detection accuracy on drone-based datasets. We propose three progressively enhanced YOLOv8s-based models, AeroYOLO-Fusion, AeroYOLO-Attn, and AeroYOLO-Lite, which address these imbalance problems and efficiency challenges through multiscale fusion, an attention mechanism, and a lightweight architecture. To improve multiscale feature fusion, AeroYOLO-Fusion integrates a bidirectional feature pyramid network with multiscale depth-wise convolution. To enhance adaptive spatial attention, AeroYOLO-Attn introduces receptive field attention convolution into the standard C2f module. AeroYOLO-Lite further reduces computational complexity with a lightweight shared group convolutional detection head. Extensive experiments on the VisDrone, UAVDT, CARPK, and DIOR datasets demonstrate significant performance improvements over the baseline YOLOv8s, with AeroYOLO-Lite achieving AP increases of 2.8% on VisDrone, 4.3% on UAVDT, 4.1% on CARPK, and 1.0% on DIOR. An inference latency of 13.7 ms demonstrates the model's capability to meet real-time detection requirements. Comparative analyses confirm AeroYOLO-Lite's superior accuracy relative to state-of-the-art methods, while ablation studies validate the contribution of each proposed module to the balance between computational efficiency and detection performance.
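To make the efficiency argument behind a shared, grouped detection head concrete, the following is a minimal sketch of the parameter arithmetic only. It is not the authors' implementation: the channel width, group count, number of detection scales, and the helper `conv_params` are all hypothetical values chosen for illustration, since the abstract does not state them.

```python
def conv_params(c_in, c_out, k, groups=1, bias=True):
    """Parameter count of a 2D convolution with a square k x k kernel."""
    if c_in % groups or c_out % groups:
        raise ValueError("channels must be divisible by groups")
    # Each output channel only sees c_in / groups input channels.
    weights = (c_in // groups) * c_out * k * k
    return weights + (c_out if bias else 0)


# Hypothetical settings, not taken from the paper.
NUM_SCALES = 3   # assumed number of detection scales
CHANNELS = 128   # assumed head width
KERNEL = 3       # assumed kernel size
GROUPS = 4       # assumed group count

# One dense 3x3 convolution per scale, with no weight sharing.
separate_dense = NUM_SCALES * conv_params(CHANNELS, CHANNELS, KERNEL)

# A single grouped 3x3 convolution whose weights are reused at every scale.
shared_grouped = conv_params(CHANNELS, CHANNELS, KERNEL, groups=GROUPS)

reduction = separate_dense / shared_grouped
print(f"separate dense: {separate_dense} parameters")
print(f"shared grouped: {shared_grouped} parameters")
print(f"reduction:      {reduction:.2f}x")
```

Under these assumed settings, grouping divides the weight count of one layer by the group count, and sharing across scales removes the per-scale duplication, so the two savings multiply. The actual savings in AeroYOLO-Lite depend on the real head design reported in the paper.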
Authors (1)
Huiyao Zhang
- Year Published
- 2025
- Source Database
- DOAJ
- DOI
- 10.1109/ACCESS.2025.3610617
- Access
- Open Access ✓