Comparison of Neural Network Architectures for Spatial Orientation of Unmanned Aerial Vehicles and Ground Drones
Abstract
Introduction. Spatial orientation of unmanned aerial vehicles (UAVs) and ground drones is one of the key tasks in modern robotics and autonomous systems. Traditional methods, such as SLAM or GNSS-IMU integration, have significant limitations in complex environments where sensor data is noisy or satellite signals are unavailable. In such cases, deep learning approaches, in particular neural network architectures for segmentation and object detection, are promising because they enable drones to interpret their surroundings at the semantic level.
Purpose. This work presents a comparative analysis of segmentation architectures (U-Net, DeepLabV3+) and object detection architectures (YOLO, SSD, Faster R-CNN) with respect to their applicability to real-time spatial orientation of UAVs and ground drones. It also identifies the advantages and disadvantages of both approaches depending on the operating environment and the available computational resources.
Methods. The study took an experimental approach using the UAVid dataset and synthetic aerial imagery. The selected architectures were evaluated on accuracy (mIoU, mAP), speed (FPS), and resource requirements (number of parameters, computational complexity). Special attention was given to using segmentation models to construct semantic maps of the environment and object detectors to localize reference landmarks.
Results. The experiments demonstrated that segmentation architectures (U-Net, DeepLabV3+) represent object shapes more precisely and allow the construction of highly detailed maps, which is critical in environments with irregular obstacles (forests, mountainous areas). Object detectors (YOLO, SSD) showed significantly higher real-time performance, making them more suitable for systems with limited computational resources. Faster R-CNN achieved higher accuracy but lagged in processing speed. Segmentation models also make it possible to estimate traversability between objects and to classify surface types, tasks that bounding-box detectors cannot perform.
Conclusions. Segmentation architectures provide drones with richer semantic information for spatial orientation but require more computational resources and run at lower inference speed. Object detection architectures are capable of real-time operation but at the cost of reduced environmental detail. A combined approach that applies both methods depending on the navigation task and resource constraints is considered promising. Future research should focus on developing hybrid multi-output models that combine the advantages of segmentation and detection.
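The mIoU indicator named in the Methods can be illustrated with a small sketch. The abstract does not describe the study's evaluation code, so the function name `mean_iou`, the toy 2x2 masks, and the class ids (0 = ground, 1 = obstacle) below are all assumptions for illustration. The sketch averages per-class intersection-over-union over the classes that appear in either mask.

```python
from typing import Sequence


def mean_iou(
    pred: Sequence[Sequence[int]],
    target: Sequence[Sequence[int]],
    num_classes: int,
) -> float:
    """Mean intersection-over-union over classes present in pred or target.

    Hypothetical helper for illustration; not the evaluation code of the study.
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == t:
                # Pixel agrees: counts once in both intersection and union.
                intersection[p] += 1
                union[p] += 1
            else:
                # Pixel disagrees: it enlarges the union of both classes.
                union[p] += 1
                union[t] += 1
    # Skip classes that occur in neither mask to avoid division by zero.
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


# Toy 2x2 masks with assumed class ids: 0 = ground, 1 = obstacle.
predicted = [[0, 0], [1, 1]]
reference = [[0, 1], [1, 1]]
score = mean_iou(predicted, reference, num_classes=2)
print(f"mIoU = {score:.4f}")
```

In this toy case one of four pixels is misclassified, giving an IoU of 1/2 for class 0 and 2/3 for class 1. Averaging over classes rather than over pixels means that a small class, such as a narrow obstacle, weighs as much as a large one, which matters for the irregular-obstacle environments the abstract mentions.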
Authors (1)
Oleksandr Suslenko
Quick Access
- Publication Year
- 2025
- Source Database
- DOAJ
- DOI
- 10.34229/2707-451X.25.4.9
- Access
- Open Access ✓