DOAJ Open Access 2026

Geometric Prior-Guided Multimodal Spatiotemporal Adaptive Motion Estimation for Monocular Vision-Based MAVs

Yu Luo, Hao Cha, Hongwei Fu, Tingting Fu, Bin Tian, and 1 other

Abstract

Estimating the relative position and velocity of micro aerial vehicles (MAVs) from visual signals is critical to many tasks. However, traditional relative motion estimation algorithms are severely affected by non-Gaussian noise and have limited observability, so they struggle to meet the practical requirements of complex dynamic scenarios. To address this problem, this paper proposes a Multimodal Decoupled Spatiotemporal Adaptive Network (MDSAN). Designed for air-to-air scenarios, MDSAN estimates the relative pose and velocity of dynamic MAVs with high precision while overcoming the observability limitations of traditional algorithms. MDSAN is composed of two core sub-modules: Modality-Specific Convolutional Normalization (MSCN) blocks and Spatiotemporal Adaptive State (STAS) blocks. MSCN uses custom convolution kernels tailored to three modalities (visual, physical, and geometric) to separate their features, which prevents interference between modalities and reduces non-Gaussian noise. STAS, built on a state-space model, combines two functions: it tracks long-term MAV motion trends over time, and it strengthens the synergy between different modal features across space. Adaptive weights balance these two functions, enabling stable estimation even when traditional methods struggle with low observability. Furthermore, MDSAN adopts a full-vision multimodal fusion scheme, which eliminates the dependence on wireless communication and reduces hardware costs. Extensive experiments show that MDSAN achieves the best performance in all tested scenarios, significantly outperforming existing motion estimation algorithms. It offers a new technical path that balances high precision, high robustness, and cost-effectiveness for technologies such as MAV swarm perception.
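
To make the ideas in the abstract concrete, here is a minimal toy sketch in plain Python. It is not the authors' MDSAN implementation. It only illustrates three generic ingredients the abstract names: a separate convolution kernel and normalization per modality, a linear state-space recurrence over time, and an adaptive weight that blends a temporal branch with a cross-modal branch. Every function name, kernel value, and parameter (`a`, `b`, `gate_logit`) is a hypothetical choice made for illustration, and the sketch assumes all modalities are 1D sequences of equal length with equal-length kernels.

```python
import math

def conv1d(signal, kernel):
    """Valid-mode 1D correlation of a sequence with a kernel."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def normalize(values, eps=1e-8):
    """Zero-mean, unit-variance normalization of one modality's features."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]

def modality_specific_features(inputs, kernels):
    """Hypothetical stand-in for per-modality processing:
    each modality gets its own kernel and its own normalization."""
    return {name: normalize(conv1d(inputs[name], kernels[name]))
            for name in inputs}

def temporal_state(sequence, a=0.9, b=0.1):
    """Scalar linear state-space recurrence h_t = a * h_{t-1} + b * x_t."""
    h, states = 0.0, []
    for x in sequence:
        h = a * h + b * x
        states.append(h)
    return states

def adaptive_blend(temporal, spatial, gate_logit):
    """Blend two branches with a sigmoid weight w in (0, 1).
    In a trained network the logit would be learned; here it is fixed."""
    w = 1.0 / (1.0 + math.exp(-gate_logit))
    return [w * t + (1.0 - w) * s for t, s in zip(temporal, spatial)]

def fuse(inputs, kernels, gate_logit=0.0):
    """Toy pipeline: per-modality features -> cross-modal average
    (spatial branch) -> recurrence (temporal branch) -> gated blend."""
    feats = modality_specific_features(inputs, kernels)
    names = sorted(feats)
    length = len(feats[names[0]])
    spatial = [sum(feats[n][t] for n in names) / len(names)
               for t in range(length)]
    temporal = temporal_state(spatial)
    return adaptive_blend(temporal, spatial, gate_logit)

if __name__ == "__main__":
    # Made-up input sequences for the three modalities named in the abstract.
    inputs = {
        "visual":    [0.0, 1.0, 2.0, 3.0, 4.0, 5.0],
        "physical":  [1.0, 1.0, 2.0, 2.0, 3.0, 3.0],
        "geometric": [5.0, 4.0, 3.0, 2.0, 1.0, 0.0],
    }
    # One illustrative kernel per modality (same length for all).
    kernels = {
        "visual":    [0.25, 0.5, 0.25],
        "physical":  [1.0, 0.0, -1.0],
        "geometric": [0.0, 1.0, 0.0],
    }
    print(fuse(inputs, kernels, gate_logit=0.0))
```

A larger `gate_logit` shifts the output toward the temporal branch, and a smaller one toward the cross-modal average. This is only an intuition for how a single adaptive weight can trade off the two functions.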

Authors (6)

Yu Luo

Hao Cha

Hongwei Fu

Tingting Fu

Bin Tian

Huatao Tang

Citation Format

Luo, Y., Cha, H., Fu, H., Fu, T., Tian, B., Tang, H. (2026). Geometric Prior-Guided Multimodal Spatiotemporal Adaptive Motion Estimation for Monocular Vision-Based MAVs. https://doi.org/10.3390/drones10020083

Journal Information
Publication Year
2026
Source Database
DOAJ
DOI
10.3390/drones10020083
Access
Open Access ✓