DOAJ Open Access 2025

RingFormer-Seg: A Scalable and Context-Preserving Vision Transformer Framework for Semantic Segmentation of Ultra-High-Resolution Remote Sensing Imagery

Zhan Zhang, Daoyu Shu, Guihe Gu, Wenkai Hu, Ru Wang, +2 others

Abstract

Semantic segmentation of ultra-high-resolution remote sensing (UHR-RS) imagery plays a critical role in land use and land cover analysis, yet it remains computationally intensive due to the enormous input size and high spatial complexity. Existing studies have commonly employed strategies such as patch-wise processing, multi-scale model architectures, lightweight networks, and representation sparsification to reduce resource demands, but they have often struggled to maintain long-range contextual awareness and scalability for inputs of arbitrary size. To address this, we propose RingFormer-Seg, a scalable Vision Transformer framework that enables long-range context learning through multi-device parallelism in UHR-RS image segmentation. RingFormer-Seg decomposes the input into spatial subregions and processes them through a distributed three-stage pipeline. First, the Saliency-Aware Token Filter (STF) selects informative tokens to reduce redundancy. Next, the Efficient Local Context Module (ELCM) enhances intra-region features via memory-efficient attention. Finally, the Cross-Device Context Router (CDCR) exchanges token-level information across devices to capture global dependencies. Fine-grained detail is preserved through the residual integration of unselected tokens, and a hierarchical decoder generates high-resolution segmentation outputs. We conducted extensive experiments on three benchmarks covering UHR-RS images from 2048 × 2048 to 8192 × 8192 pixels. Results show that our framework achieves top segmentation accuracy while significantly improving computational efficiency across the DeepGlobe, Wuhan, and Guangdong datasets. RingFormer-Seg offers a versatile solution for UHR-RS image segmentation and demonstrates potential for practical deployment in nationwide land cover mapping, supporting informed decision-making in land resource management, environmental policy planning, and sustainable development.
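The abstract describes token filtering followed by residual integration of the unselected tokens. A minimal sketch of that idea is given below, under loud assumptions: the L2 norm stands in for the saliency score (the abstract does not say how STF scores tokens), "residual integration" is read as passing unselected tokens through unchanged, and the names `select_salient`, `filter_process_restore`, `keep_ratio`, and `process` are hypothetical, not taken from the paper.

```python
import math
from typing import Callable, List, Sequence, Tuple

Token = List[float]


def select_salient(tokens: Sequence[Token], keep_ratio: float) -> Tuple[List[int], List[int]]:
    """Split token indices into (kept, skipped) using a stand-in saliency score.

    Assumption: saliency is the L2 norm of each token. The actual STF
    scoring function is not given in the abstract.
    """
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    k = max(1, math.ceil(len(tokens) * keep_ratio))
    scores = [math.sqrt(sum(x * x for x in t)) for t in tokens]
    # Rank indices from most to least salient, then restore spatial order.
    ranked = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k]), sorted(ranked[k:])


def filter_process_restore(
    tokens: Sequence[Token],
    keep_ratio: float,
    process: Callable[[List[Token]], List[Token]],
) -> List[Token]:
    """Run `process` on salient tokens only, then merge with the rest.

    Assumption: unselected tokens are carried through unchanged, which is
    one plausible reading of "residual integration of unselected tokens".
    """
    kept, _ = select_salient(tokens, keep_ratio)
    processed = process([tokens[i] for i in kept])
    out = [list(t) for t in tokens]  # start from a copy of every token
    for i, new_token in zip(kept, processed):
        out[i] = list(new_token)  # overwrite only the selected positions
    return out
```

In this sketch `process` stands in for whatever the later stages (ELCM, CDCR) compute. The output has the same length and ordering as the input, so a decoder would still see every spatial position.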

Authors (7)

Zhan Zhang

Daoyu Shu

Guihe Gu

Wenkai Hu

Ru Wang

Xiaoling Chen

Bingnan Yang

Citation Format

Zhang, Z., Shu, D., Gu, G., Hu, W., Wang, R., Chen, X. et al. (2025). RingFormer-Seg: A Scalable and Context-Preserving Vision Transformer Framework for Semantic Segmentation of Ultra-High-Resolution Remote Sensing Imagery. https://doi.org/10.3390/rs17173064

View at source: doi.org/10.3390/rs17173064

Journal Information
Publication year: 2025
Source database: DOAJ
DOI: 10.3390/rs17173064
Access: Open Access