BITCC: A Bidirectional Image–Text Interaction Method for High-Resolution Remote Sensing Image Change Captioning
Abstract
High-resolution remote sensing image change captioning (RSICC) aims to understand the changes between bitemporal high-resolution remote sensing images and to generate captions that describe them. Because it presents change information in natural language, making that information more intuitive and easier to communicate, RSICC has garnered widespread attention. However, two challenges remain. First, most existing methods adopt a unidirectional interaction from images to text, which leaves the semantic alignment between images and text insufficient and limits performance. Second, interfering factors in remote sensing images, such as illumination and climate, cause overall differences between bitemporal images that hinder the recognition of change information. To address these challenges, this article proposes a bidirectional image–text interaction method for high-resolution RSICC (BITCC). BITCC first introduces a reconstruction-based text-to-image interaction component. Together with the caption generation component, it forms a bidirectional interaction that enhances the semantic correlation between the local change information in the images and the textual information. To address the global discrepancies between bitemporal images, a noise-based change extractor is designed, which reduces the model's focus on irrelevant factors by adding noise. Finally, the images-and-text interaction component constrains the global representations of both modalities through contrastive alignment, enhancing the global semantic consistency between image and text in the high-level representation. Experiments on two public datasets show that the proposed method outperforms current state-of-the-art methods.
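The abstract does not specify the form of the contrastive alignment, so the following is only an illustrative sketch of one common choice, a symmetric InfoNCE-style loss over a batch of global image and text embeddings. The function names, the `temperature` value, and the use of cosine similarity are assumptions made for this example and are not taken from the paper.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def log_softmax(row):
    """Numerically stable log-softmax over a list of scores."""
    m = max(row)
    log_z = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_z for x in row]

def contrastive_alignment_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric InfoNCE-style loss (an assumed form, not the paper's).

    image_embs[i] and text_embs[i] are the global representations of the
    i-th bitemporal image pair and its caption; all other pairings in the
    batch are treated as negatives.
    """
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    n = len(imgs)
    # sim[i][j]: temperature-scaled cosine similarity of image i and text j
    sim = [[sum(a * b for a, b in zip(imgs[i], txts[j])) / temperature
            for j in range(n)] for i in range(n)]
    # Image-to-text direction: each image should select its own caption.
    i2t = -sum(log_softmax(sim[i])[i] for i in range(n)) / n
    # Text-to-image direction: each caption should select its own image.
    sim_t = [list(col) for col in zip(*sim)]
    t2i = -sum(log_softmax(sim_t[j])[j] for j in range(n)) / n
    return 0.5 * (i2t + t2i)
```

Under these assumptions, the loss is small when each image embedding is closest to its own caption embedding and grows when the pairs are mismatched, which is the sense in which such a term encourages global semantic consistency between the two modalities.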
Topics & Keywords
Authors (6)
Yingjie Tang
Shou Feng
Yongqi Chen
Jinghe Zhang
Nan Su
Chunhui Zhao
- Publication Year: 2025
- Source Database: DOAJ
- DOI: 10.1109/JSTARS.2025.3629158
- Access: Open Access