MGLI-Former: a multi-scale and global-local information interactive attention transformer for urban shantytown extraction
Abstract
Shantytowns, characterized by poor living conditions and simple houses, must be extracted and analyzed efficiently to support urban planning. This paper proposes a multi-scale and global-local information interactive attention transformer (MGLI-Former) for shantytown extraction from high-resolution remote sensing images. First, the multi-level feature fusion block (MLFFB) integrates neighborhood encoding features to prevent the loss of small-target shantytowns. Second, joint global and local information transformer blocks (JGLB) combine global and local features. Finally, the boundary and feature joint optimization loss (BF-Loss) refines the output using edges and high-level semantics. In experiments on Beijing and Shanghai, MGLI-Former achieved the best visual and quantitative extraction results. The F1-score, IoU, Precision, and Recall are 86.92%, 76.87%, 86.84%, and 87.01% for Beijing and 72.33%, 56.66%, 69.29%, and 75.65% for Shanghai. Furthermore, experiments on the UIS-Shenzhen datasets and fine-tuning experiments with mixed datasets validate the robustness and generalization capabilities of MGLI-Former. Moreover, the spatial and landscape patterns of shantytowns in Beijing and Shanghai reveal that (1) Beijing's shantytowns radiate uniformly from the old city center, whereas Shanghai's exhibit a multi-core diffusion pattern; and (2) Shanghai's shantytown distribution is clustered, while Beijing's is more uniform. MGLI-Former demonstrates potential for extracting shantytowns and has significant implications for urban planning and management.
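The four scores quoted in the abstract are related to one another, and a small sketch may help show how. The code below assumes the standard pixel-wise definitions for binary segmentation masks. The paper's own evaluation code is not shown here, and the function name `segmentation_metrics` and the toy masks are illustrative assumptions, not the authors' implementation.

```python
def segmentation_metrics(pred, truth):
    """Pixel-wise precision, recall, F1 and IoU for two binary masks.

    Both masks are lists of rows containing 0 (background) or 1 (shantytown).
    Standard definitions are assumed; this is not the paper's evaluation code.
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1          # predicted shantytown, truly shantytown
            elif p and not t:
                fp += 1          # predicted shantytown, truly background
            elif t and not p:
                fn += 1          # missed shantytown pixel

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    iou = tp / (tp + fp + fn) if (tp + fp + fn) else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "iou": iou}


# Hypothetical 3x4 masks, for illustration only.
pred = [[1, 1, 0, 0],
        [1, 0, 0, 0],
        [0, 0, 1, 1]]
truth = [[1, 1, 0, 0],
         [1, 1, 0, 0],
         [0, 0, 0, 1]]

metrics = segmentation_metrics(pred, truth)
print(metrics)
```

Under these definitions, F1 is the harmonic mean of precision and recall, and IoU equals F1 / (2 - F1). The first set of figures in the abstract is consistent with both identities: the harmonic mean of 86.84% and 87.01% is about 86.92%, and 0.8692 / (2 - 0.8692) is about 76.87%.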
Authors (7)
Shouhang Du
Shaoyu Wang
Yuhao Hua
Shu Peng
Fei Qin
Xue Li
Yufei Wu
- Year of publication: 2024
- Source database: DOAJ
- DOI: 10.1080/17538947.2024.2432522
- Access: Open Access