DOAJ Open Access 2025

Unified Multi-Modal Object Tracking Through Spatial–Temporal Propagation and Modality Synergy

Jiajia Wu, Haorui Zuo, Yuxing Wei, Meihui Li, Jianlin Zhang

Abstract

Multi-modal object tracking (MMOT) has received widespread attention for its ability to overcome single-sensor perception limitations. However, existing methods encounter several critical challenges. Representation learning and generalization capabilities of models are constrained by the inherent heterogeneity of cross-task multi-modal data and inter-modal synergy imbalance. Particularly, in dynamically changing complex scenarios, the reliability and stability of data significantly degrade, further exacerbating the difficulty in multi-modal consistent perception and aggregation. To tackle the above issues, we propose SMUTrack, a unified framework with global shared parameters integrating three downstream MMOT tasks. SMUTrack implements a batch merging-and-splitting alternating strategy, coupled with multi-task joint training, to establish latent correlations across inter- and intra-task modalities, effectively avoiding over-reliance on certain modalities. Concurrently, we design a hierarchical modality synergy and reinforcement (HMSR) module, and a gated fusion and context awareness (GFCA) module to enable progressive multi-modal information exchange and integration, yielding a more discriminative and robust multi-modal representation. More importantly, we introduce a spatial–temporal information propagation (SIP) mechanism, which synchronously learns object trajectory cues and appearance variations to effectively build contextual relationships in long-term video tracking. Experimental results definitively validate the outstanding performance of SMUTrack on mainstream MMOT datasets, exhibiting its powerful adaptability to various MMOT tasks.

Authors (5)

Jiajia Wu

Haorui Zuo

Yuxing Wei

Meihui Li

Jianlin Zhang

Citation Format

Wu, J., Zuo, H., Wei, Y., Li, M., Zhang, J. (2025). Unified Multi-Modal Object Tracking Through Spatial–Temporal Propagation and Modality Synergy. https://doi.org/10.3390/jimaging11120421

Journal Information

Publication year: 2025
Source database: DOAJ
DOI: 10.3390/jimaging11120421
Access: Open Access