MSA-UNet: Multiscale Feature Aggregation with Attentive Skip Connections for Precise Building Extraction
Abstract
Accurate and reliable extraction of building structures from high-resolution (HR) remote sensing images is an important research topic in 3D cartography and smart city construction. Despite the strong overall performance of recent deep learning models, they still struggle with large variations in building scale and with complex architectural forms, which can lead to inaccurate boundaries or missed small and irregular structures. This study therefore proposes MSA-UNet, a semantic segmentation framework that uses multiscale feature aggregation and attentive skip connections for accurate extraction of building footprints. The framework is based on the U-Net architecture, with VGG16 replacing the original encoder to enhance its ability to capture low-discriminative features. To improve the representation of buildings of different scales and shapes, a serial coarse-to-fine feature aggregation mechanism is used. Additionally, a novel skip connection is built between the encoder and decoder layers to enable adaptive weighting. Furthermore, a dual-attention mechanism, implemented through the convolutional block attention module (CBAM), is integrated to strengthen the network's focus on building regions. Extensive experiments on the WHU and Inria building datasets validated the effectiveness of MSA-UNet. On the WHU dataset, the model achieved state-of-the-art performance, with a mean Intersection over Union (mIoU) of 94.26%, accuracy of 98.32%, F1-score of 96.57%, and mean Pixel Accuracy (mPA) of 96.85%, corresponding to a gain of 1.41% in mIoU over the baseline U-Net. On the more challenging Inria dataset, MSA-UNet achieved an mIoU of 85.92%, an improvement of up to 1.9% over the baseline U-Net. These results confirm that MSA-UNet markedly improves the accuracy and boundary integrity of building extraction from HR data, outperforming existing classic models in segmentation quality and robustness.
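The four metrics reported in the abstract (mIoU, accuracy, F1-score, and mPA) can be illustrated with a minimal sketch for binary building masks. This is not the authors' evaluation code: the function name `segmentation_metrics`, the label convention (0 = background, 1 = building), and the choice to compute F1 for the building class only are assumptions made for illustration, using the standard confusion-matrix definitions.

```python
def segmentation_metrics(y_true, y_pred, positive=1):
    """Compute mIoU, accuracy, F1, and mPA for flat binary label sequences.

    Assumed convention (not from the paper): 0 = background, 1 = building.
    """
    num_classes = 2
    # confusion[i][j] counts pixels whose true class is i and predicted class is j.
    confusion = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        confusion[t][p] += 1

    total = sum(sum(row) for row in confusion)
    ious, pixel_accs = [], []
    for c in range(num_classes):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp
        fp = sum(confusion[r][c] for r in range(num_classes)) - tp
        union = tp + fp + fn
        # Per-class IoU = TP / (TP + FP + FN).
        ious.append(tp / union if union else 0.0)
        # Per-class pixel accuracy = TP / (TP + FN).
        pixel_accs.append(tp / (tp + fn) if (tp + fn) else 0.0)

    # F1 for the positive (building) class, an assumed choice.
    tp = confusion[positive][positive]
    fn = sum(confusion[positive]) - tp
    fp = sum(confusion[r][positive] for r in range(num_classes)) - tp
    denom = 2 * tp + fp + fn

    return {
        "mIoU": sum(ious) / num_classes,
        "accuracy": sum(confusion[c][c] for c in range(num_classes)) / total,
        "F1": 2 * tp / denom if denom else 0.0,
        "mPA": sum(pixel_accs) / num_classes,
    }


# Toy example: six pixels, flattened ground-truth and predicted masks.
metrics = segmentation_metrics([0, 0, 1, 1, 1, 0], [0, 1, 1, 1, 0, 0])
print(metrics)
```

In practice the masks would be flattened image arrays accumulated over a whole test set; averaging IoU over both classes is what makes the metric a mean IoU rather than the building-class IoU alone.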
Authors (6)
Guobiao Yao
Yan Chen
Wenxiao Sun
Zeyu Zhang
Yifei Tang
Jingxue Bi
Quick Access
- Publication Year: 2025
- Source Database: DOAJ
- DOI: 10.3390/ijgi14120497
- Access: Open Access ✓