Joint Inference of Image Enhancement and Object Detection via Cross-Domain Fusion Transformer
Abstrak
Underwater vision is fundamental to ocean exploration, yet it is frequently impaired by underwater degradation including low contrast, color distortion and blur, thereby presenting significant challenges for underwater object detection (UOD). Most existing methods employ underwater image enhancement as a preprocessing step to improve visual quality prior to detection. However, image enhancement and object detection are optimized for fundamentally different objectives, and directly cascading them leads to feature distribution mismatch. Moreover, prevailing dual-branch architectures process enhancement and detection independently, overlooking multi-scale interactions across domains and thus constraining the learning of cross-domain feature representation. To overcome these limitations, We propose an underwater cross-domain fusion Transformer detector (UCF-DETR). UCF-DETR jointly leverages image enhancement and object detection by exploiting the complementary information from the enhanced and original image domains. Specifically, an underwater image enhancement module is employed to improve visibility. We then design a cross-domain feature pyramid to integrate fine-grained structural details from the enhanced domain with semantic representations from the original domain. Cross-domain query interaction mechanism is introduced to model inter-domain query relationships, leading to accurate object localization and boundary delineation. Extensive experiments on the challenging DUO and UDD benchmarks demonstrate that UCF-DETR consistently outperforms state-of-the-art methods for UOD.
Topik & Kata Kunci
Penulis (2)
Bingxun Zhao
Yuan Chen
Akses Cepat
- Tahun Terbit
- 2026
- Sumber Database
- DOAJ
- DOI
- 10.3390/computers15010043
- Akses
- Open Access ✓