Abstract: Change detection, as a research hotspot in the field of remote sensing, has witnessed continuous development and progress. However, the discrimination of boundary details remains a significant bottleneck due to the complexity of the elements surrounding change areas and the background. The boundaries of large change areas tend to be misaligned, while the boundaries of small, closely spaced change targets tend to be erroneously connected. To address these issues, this paper proposes LRNet, a novel network based on a localization-then-refinement strategy. LRNet consists of two stages: localization and refinement. In the localization stage, a three-branch encoder simultaneously extracts the original image features and their differential features to interactively locate each change area. To minimize information loss during feature extraction, learnable optimal pooling (LOP) is proposed to replace the widely used max-pooling; because LOP is trainable, it also contributes to the overall optimization of the network. To effectively fuse features from different branches and accurately locate change areas of various sizes, the change alignment attention (C2A) and the hierarchical change alignment module (HCA) are proposed. In the refinement stage, the localization results are corrected by constraining the change areas and change edges through the edge-area alignment module (E2A). The decoder, combined with the difference features strengthened by C2A in the localization stage, then refines change areas of different sizes, ultimately achieving accurate boundary discrimination. The proposed LRNet outperforms 13 state-of-the-art methods in terms of comprehensive evaluation metrics and provides the most precise boundary discrimination results on the LEVIR-CD and WHU-CD datasets.
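To make the idea of a trainable replacement for max-pooling concrete, the PyTorch sketch below implements a softmax-weighted pooling whose per-window weights are predicted by a learnable 1x1 convolution. The class name `LearnablePool2d` and this particular weighting scheme are illustrative assumptions; the abstract does not specify the exact formulation of LOP.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LearnablePool2d(nn.Module):
    """Hypothetical learnable pooling: a softmax-weighted average over each
    pooling window, with the weights predicted by a trainable 1x1 convolution.
    This is an illustrative stand-in for LOP, not the authors' formulation."""

    def __init__(self, channels, kernel_size=2, stride=2):
        super().__init__()
        self.k = kernel_size
        self.s = stride
        # per-pixel score used to weight values inside each pooling window
        self.score = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x):
        b, c, h, w = x.shape
        scores = self.score(x)                           # (B, C, H, W)
        x_win = F.unfold(x, self.k, stride=self.s)       # (B, C*k*k, L)
        s_win = F.unfold(scores, self.k, stride=self.s)  # (B, C*k*k, L)
        x_win = x_win.view(b, c, self.k * self.k, -1)
        s_win = s_win.view(b, c, self.k * self.k, -1)
        w_win = s_win.softmax(dim=2)                     # weights per window
        out = (x_win * w_win).sum(dim=2)                 # weighted pooling
        oh = (h - self.k) // self.s + 1
        ow = (w - self.k) // self.s + 1
        return out.view(b, c, oh, ow)
```

Unlike max-pooling, which discards all but the largest activation in each window, such a weighted scheme lets the network learn how much of each activation to retain, which is one way a pooling layer could reduce information loss while remaining end-to-end trainable.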
Abstract: Remote sensing image change detection aims to identify the differences between images acquired at different times over the same area. It is widely used in land management, environmental monitoring, disaster assessment, and other fields. Most current change detection methods are based on either a Siamese network structure or an early fusion structure. The Siamese structure focuses on extracting object features at different times but pays little attention to change information, which leads to false alarms and missed detections. The early fusion (EF) structure extracts features after fusing the images from different phases but ignores the significance of the individual temporal object features for detecting change details, making it difficult to accurately discern the edges of changed objects. To address these issues and obtain more accurate results, we propose a novel network, Triplet UNet (T-UNet), built on a three-branch encoder that simultaneously extracts the object features and the change features between the pre- and post-phase images. To effectively interact and fuse the features extracted from the three branches of the triplet encoder, we propose a multi-branch spatial-spectral cross-attention module (MBSSCA). In the decoder stage, we introduce a channel attention mechanism (CAM) and a spatial attention mechanism (SAM) to fully mine and integrate the detailed texture information in the shallow layers and the semantic localization information in the deep layers.
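For readers unfamiliar with channel and spatial attention, the sketch below shows a common CBAM-style realization of the two mechanisms in PyTorch. It is a minimal illustration under the assumption that CAM and SAM follow this standard pattern; the paper's exact module designs may differ.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """CBAM-style channel attention: squeeze spatial dimensions with average
    and max pooling, score channels with a shared MLP. A generic stand-in for
    the CAM described in the abstract."""

    def __init__(self, channels, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1, bias=False),
        )

    def forward(self, x):
        avg = self.mlp(x.mean(dim=(2, 3), keepdim=True))
        mx = self.mlp(x.amax(dim=(2, 3), keepdim=True))
        return x * torch.sigmoid(avg + mx)   # reweight channels

class SpatialAttention(nn.Module):
    """CBAM-style spatial attention: pool over channels, then score each
    spatial location with a 7x7 convolution. A generic stand-in for SAM."""

    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size,
                              padding=kernel_size // 2, bias=False)

    def forward(self, x):
        pooled = torch.cat([x.mean(dim=1, keepdim=True),
                            x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.conv(pooled))  # reweight locations
```

In a UNet-style decoder, channel attention is typically applied to emphasize semantically informative feature maps from the deep layers, while spatial attention highlights the locations carrying fine texture detail in the shallow layers, which matches the roles the abstract assigns to CAM and SAM.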