
Chen Wu

Exchange means change: an unsupervised single-temporal change detection framework based on intra- and inter-image patch exchange

Oct 01, 2023
Hongruixuan Chen, Jian Song, Chen Wu, Bo Du, Naoto Yokoya

Change detection (CD) is a critical task in studying the dynamics of ecosystems and human activities using multi-temporal remote sensing images. While deep learning has shown promising results in CD tasks, it requires a large number of labeled and paired multi-temporal images to achieve high performance. Pairing and annotating large-scale multi-temporal remote sensing images is both expensive and time-consuming. To make deep learning-based CD techniques more practical and cost-effective, we propose an unsupervised single-temporal CD framework based on intra- and inter-image patch exchange (I3PE). The I3PE framework allows deep change detectors to be trained on unpaired and unlabeled single-temporal remote sensing images that are readily available in real-world applications. The I3PE framework comprises four steps: 1) the intra-image patch exchange method, built on an object-based image analysis method and an adaptive clustering algorithm, generates pseudo-bi-temporal image pairs and corresponding change labels from single-temporal images by exchanging patches within an image; 2) the inter-image patch exchange method generates more types of land-cover changes by exchanging patches between images; 3) a simulation pipeline consisting of several image enhancement methods simulates the radiometric differences between pre- and post-event images caused by the different imaging conditions encountered in real situations; 4) self-supervised learning based on pseudo-labels further improves the performance of the change detectors in both unsupervised and semi-supervised cases. Extensive experiments on two large-scale datasets demonstrate that I3PE outperforms representative unsupervised approaches, achieving F1 improvements of 10.65% and 6.99% over the SOTA method. Moreover, I3PE can improve the performance of the ... (see the original article for full abstract)
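
To make the core idea of step 1 concrete, here is a minimal NumPy sketch of intra-image patch exchange: it swaps random grid-aligned patches within a single image to fabricate a pseudo "post-event" image and a change mask. The actual I3PE selects exchange regions via object-based segmentation and adaptive clustering rather than a fixed grid, so the parameters below (`patch`, `n_swaps`) are illustrative assumptions.

```python
import numpy as np

def intra_image_patch_exchange(img, patch=64, n_swaps=8, rng=None):
    """Fabricate a pseudo bi-temporal pair from a single image.

    Swaps `n_swaps` pairs of grid-aligned patches and marks the swapped
    locations as changed. I3PE itself derives exchange regions from
    object-based analysis and adaptive clustering, not a fixed grid.
    """
    rng = rng or np.random.default_rng()
    h, w = img.shape[:2]
    post = img.copy()
    mask = np.zeros((h, w), dtype=np.uint8)
    cells = [(y, x) for y in range(0, h - patch + 1, patch)
                    for x in range(0, w - patch + 1, patch)]
    for _ in range(n_swaps):
        i, j = rng.choice(len(cells), size=2, replace=False)
        (y1, x1), (y2, x2) = cells[i], cells[j]
        a = post[y1:y1 + patch, x1:x1 + patch].copy()
        b = post[y2:y2 + patch, x2:x2 + patch].copy()
        post[y1:y1 + patch, x1:x1 + patch] = b
        post[y2:y2 + patch, x2:x2 + patch] = a
        mask[y1:y1 + patch, x1:x1 + patch] = 1
        mask[y2:y2 + patch, x2:x2 + patch] = 1
    return post, mask

# Usage: one unlabeled image yields a (pre, post, label) training triplet.
img = np.random.randint(0, 255, (256, 256, 3), dtype=np.uint8)
post, change_mask = intra_image_patch_exchange(img)
```

The inter-image variant of step 2 would swap patches between two different images instead, which enriches the simulated land-cover change types.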

SAAN: Similarity-aware attention flow network for change detection with VHR remote sensing images

Aug 28, 2023
Haonan Guo, Xin Su, Chen Wu, Bo Du, Liangpei Zhang

Change detection (CD) is a fundamental and important task for monitoring land surface dynamics in the earth observation field. Existing deep learning-based CD methods typically extract bi-temporal image features using a weight-sharing Siamese encoder network and identify change regions using a decoder network. These CD methods, however, still perform far from satisfactorily, as we observe that 1) deep encoder layers focus on irrelevant background regions and 2) the models' confidence in the change regions is inconsistent across decoder stages. The first problem arises because deep encoder layers cannot effectively learn from imbalanced change categories using the sole output supervision, while the second problem is attributed to the lack of explicit semantic consistency preservation. To address these issues, we design a novel similarity-aware attention flow network (SAAN). SAAN incorporates a similarity-guided attention flow module with deeply supervised similarity optimization to achieve effective change detection. Specifically, we counter the first issue by explicitly guiding deep encoder layers to discover semantic relations from bi-temporal input images using deeply supervised similarity optimization. The extracted features are optimized to be semantically similar in the unchanged regions and dissimilar in the changed regions. The second drawback is alleviated by the proposed similarity-guided attention flow module, which incorporates similarity-guided attention modules and attention flow mechanisms to guide the model to focus on discriminative channels and regions. We evaluated the effectiveness and generalization ability of the proposed method by conducting experiments on a wide range of CD tasks. The experimental results demonstrate that our method achieves excellent performance on several CD tasks, with discriminative features and semantic consistency preserved.
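
The deeply supervised similarity optimization described above can be approximated by a standard contrastive objective on bi-temporal encoder features: pull features together where the ground truth says unchanged, push them beyond a margin where it says changed. The PyTorch sketch below is a generic formulation under that assumption, not SAAN's exact loss; `margin` and the interpolation scheme are illustrative.

```python
import torch
import torch.nn.functional as F

def similarity_supervision_loss(feat_t1, feat_t2, change_mask, margin=1.0):
    """Contrastive-style similarity supervision on encoder features.

    feat_t1, feat_t2: (B, C, H, W) features from the two dates.
    change_mask:      (B, 1, Hm, Wm) binary ground truth (1 = changed).
    """
    mask = F.interpolate(change_mask.float(), size=feat_t1.shape[-2:], mode="nearest")
    dist = torch.norm(feat_t1 - feat_t2, p=2, dim=1, keepdim=True)  # (B, 1, H, W)
    # Unchanged pixels: distance itself is penalized (features pulled together).
    loss_same = ((1 - mask) * dist.pow(2)).sum() / ((1 - mask).sum() + 1e-6)
    # Changed pixels: penalized only while distance stays below the margin.
    loss_diff = (mask * F.relu(margin - dist).pow(2)).sum() / (mask.sum() + 1e-6)
    return loss_same + loss_diff
```

Applied at several decoder depths, a loss of this shape provides the deep supervision signal the abstract refers to.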

* 15 pages, 13 figures 

T-UNet: Triplet UNet for Change Detection in High-Resolution Remote Sensing Images

Aug 04, 2023
Huan Zhong, Chen Wu

Remote sensing image change detection aims to identify the differences between images acquired at different times over the same area. It is widely used in land management, environmental monitoring, disaster assessment and other fields. Currently, most change detection methods are based on the Siamese network structure or the early fusion structure. The Siamese structure focuses on extracting object features at different times but lacks attention to change information, which leads to false alarms and missed detections. The early fusion (EF) structure focuses on extracting features after fusing the images of different phases but ignores the significance of object features at different times for detecting change details, making it difficult to accurately discern the edges of changed objects. To address these issues and obtain more accurate results, we propose a novel network, Triplet UNet (T-UNet), based on a three-branch encoder that simultaneously extracts the object features and the change features between the pre- and post-time-phase images. To effectively interact and fuse the features extracted from the three branches of the triplet encoder, we propose a multi-branch spatial-spectral cross-attention module (MBSSCA). In the decoder stage, we introduce the channel attention mechanism (CAM) and spatial attention mechanism (SAM) to fully mine and integrate the detailed texture information in the shallow layers and the semantic localization information in the deep layers.
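
A minimal sketch of the triplet-encoder idea: two branches encode the pre- and post-phase images while a third encodes their difference, so change evidence is carried explicitly alongside the object features. The MBSSCA fusion module is omitted, and the layer choices below are assumptions for illustration.

```python
import torch
import torch.nn as nn

class TripletEncoderStage(nn.Module):
    """One stage of a three-branch (triplet) encoder.

    Two branches encode the pre-/post-phase images; the third encodes their
    difference so change information is represented explicitly. T-UNet fuses
    the branches with the MBSSCA cross-attention module, omitted here.
    """
    def __init__(self, in_ch, out_ch):
        super().__init__()
        def block():
            return nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 3, padding=1),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(inplace=True),
            )
        self.pre, self.post, self.diff = block(), block(), block()

    def forward(self, x_pre, x_post, x_diff=None):
        x_diff = x_pre - x_post if x_diff is None else x_diff
        return self.pre(x_pre), self.post(x_post), self.diff(x_diff)

# Usage: first-stage features for a 3-channel bi-temporal pair.
stage = TripletEncoderStage(3, 32)
f1, f2, fd = stage(torch.rand(1, 3, 256, 256), torch.rand(1, 3, 256, 256))
```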

* 21 pages, 11 figures, 6 tables 

Exploring Effective Priors and Efficient Models for Weakly-Supervised Change Detection

Jul 27, 2023
Zhenghui Zhao, Lixiang Ru, Chen Wu

Weakly-supervised change detection (WSCD) aims to detect pixel-level changes with only image-level annotations. Owing to its label efficiency, WSCD has been drawing increasing attention recently. However, current WSCD methods often encounter the challenge of change missing and fabricating, i.e., an inconsistency between image-level annotations and pixel-level predictions. Specifically, change missing refers to the situation where the WSCD model fails to predict any changed pixels even though the image-level label indicates change, and vice versa for change fabricating. To address this challenge, in this work, we leverage global-scale and local-scale priors in WSCD and propose two components: a Dilated Prior (DP) decoder and a Label Gated (LG) constraint. The DP decoder decodes samples with the changed image-level label, skips samples with the unchanged label, and replaces them with an all-unchanged pixel-level label. The LG constraint is derived from the correspondence between changed representations and image-level labels, penalizing the model when it mispredicts the change status. Additionally, we develop TransWCD, a simple yet powerful transformer-based model, showcasing the potential of weakly-supervised learning in change detection. By integrating the DP decoder and LG constraint into TransWCD, we form TransWCD-DL. Our proposed TransWCD and TransWCD-DL achieve significant +6.33% and +9.55% F1 score improvements over the state-of-the-art methods on the WHU-CD dataset, respectively. Some performance metrics even exceed several fully-supervised change detection (FSCD) competitors. Code will be available at https://github.com/zhenghuizhao/TransWCD.
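
As a rough illustration of how a label-gated constraint can penalize change missing and fabricating, the sketch below compares the strongest pixel-level change evidence in an image against its image-level label. This is a hedged approximation: the paper derives its LG constraint from changed representations, so the thresholding formulation here is an assumption.

```python
import torch

def label_gated_constraint(pred_logits, image_labels, thresh=0.5):
    """Penalty for predictions that contradict the image-level label.

    pred_logits:  (B, 1, H, W) pixel-wise change logits.
    image_labels: (B,) tensor in {0, 1}; 1 means the pair contains change.
    """
    probs = torch.sigmoid(pred_logits)
    max_prob = probs.flatten(1).max(dim=1).values   # strongest change evidence per image
    labels = image_labels.float()
    # Change missing: labeled changed, but no pixel is confidently changed.
    missing = labels * torch.relu(thresh - max_prob)
    # Change fabricating: labeled unchanged, but some pixel is confidently changed.
    fabricating = (1 - labels) * torch.relu(max_prob - thresh)
    return (missing + fabricating).mean()
```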

* Submitted to IEEE Transactions on Pattern Analysis and Machine Intelligence 

Building-road Collaborative Extraction from Remotely Sensed Images via Cross-Interaction

Jul 23, 2023
Haonan Guo, Xin Su, Chen Wu, Bo Du, Liangpei Zhang

Buildings are the basic carriers of social production and human life; roads are the links that interconnect social networks. Building and road information has important application value in frontier fields such as regional coordinated development, disaster prevention, and autonomous driving. Mapping buildings and roads from very high-resolution (VHR) remote sensing images has become a hot research topic. However, existing methods often ignore the strong spatial correlation between roads and buildings and extract them in isolation. To fully utilize the complementary advantages between buildings and roads, we propose a building-road collaborative extraction method based on multi-task and cross-scale feature interaction, improving the accuracy of both tasks in a complementary way. A multi-task interaction module is proposed to exchange information across tasks while preserving the unique information of each task, which tackles the seesaw phenomenon in multi-task learning. Considering the variation in appearance and structure between buildings and roads, a cross-scale interaction module is designed to automatically learn the optimal receptive field for different tasks. Compared with many existing methods that train each task individually, the proposed collaborative extraction method can utilize the complementary advantages of buildings and roads through the proposed inter-task and inter-scale feature interactions and automatically select the optimal receptive field for different tasks. Experiments on a wide range of urban and rural scenarios show that the proposed algorithm achieves building-road extraction with outstanding performance and efficiency.
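
One plausible reading of the multi-task interaction module is a gated cross-task feature exchange: each branch keeps its own features and receives a learned, gated message from the other branch, sharing complementary cues while limiting the seesaw effect. The PyTorch sketch below follows that reading; the paper's actual module design may differ.

```python
import torch
import torch.nn as nn

class MultiTaskInteraction(nn.Module):
    """Gated cross-task feature exchange between building and road branches.

    An assumed design: 1x1 convolutions produce sigmoid gates from the joint
    features, controlling how much of the other task's features leaks in.
    """
    def __init__(self, ch):
        super().__init__()
        self.gate_b = nn.Sequential(nn.Conv2d(2 * ch, ch, 1), nn.Sigmoid())
        self.gate_r = nn.Sequential(nn.Conv2d(2 * ch, ch, 1), nn.Sigmoid())

    def forward(self, f_building, f_road):
        joint = torch.cat([f_building, f_road], dim=1)
        f_b = f_building + self.gate_b(joint) * f_road    # road cues into building branch
        f_r = f_road + self.gate_r(joint) * f_building    # building cues into road branch
        return f_b, f_r
```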

* 34 pages, 9 figures, submitted to ISPRS Journal of Photogrammetry and Remote Sensing 

Expediting Building Footprint Segmentation from High-resolution Remote Sensing Images via progressive lenient supervision

Jul 23, 2023
Haonan Guo, Bo Du, Chen Wu, Xin Su, Liangpei Zhang

The efficacy of building footprint segmentation from remotely sensed images has been hindered by limited model transfer effectiveness. Many existing building segmentation methods were developed upon the encoder-decoder architecture of U-Net, in which the encoder is fine-tuned from newly developed backbone networks pre-trained on ImageNet. However, the heavy computational burden of existing decoder designs hampers the successful transfer of these modern encoder networks to remote sensing tasks. Even the widely adopted deep supervision strategy fails to mitigate these challenges, owing to its invalid loss in hybrid regions where foreground and background pixels are intermixed. In this paper, we conduct a comprehensive evaluation of existing decoder network designs for building footprint segmentation and propose an efficient framework, denoted BFSeg, to enhance learning efficiency and effectiveness. Specifically, we propose a densely connected coarse-to-fine feature fusion decoder network that facilitates easy and fast feature fusion across scales. Moreover, considering the invalidity of hybrid regions in the down-sampled ground truth during the deep supervision process, we present a lenient deep supervision and distillation strategy that enables the network to learn proper knowledge from deep supervision. Building upon these advancements, we have developed a new family of building segmentation networks that consistently surpass prior works with outstanding performance and efficiency across a wide range of newly developed encoder networks. The code will be released at https://github.com/HaonanGuo/BFSeg-Efficient-Building-Footprint-Segmentation-Framework.
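
The lenient deep supervision idea can be sketched as follows: average-pool the binary ground truth to the supervision scale, treat pixels whose pooled value indicates mixed foreground/background content as invalid, and exclude them from the auxiliary loss. The purity thresholds (0.2/0.8) below are assumptions, and the distillation part of the strategy is not shown.

```python
import torch
import torch.nn.functional as F

def lenient_deep_supervision_loss(aux_logits, gt, scale):
    """Deep supervision that ignores hybrid down-sampled pixels.

    aux_logits: (B, 1, H/scale, W/scale) logits from an intermediate decoder stage.
    gt:         (B, 1, H, W) binary building mask.
    """
    gt_small = F.avg_pool2d(gt.float(), kernel_size=scale)
    # Pixels mixing foreground and background pool to fractional values;
    # keep only near-pure pixels as valid supervision targets.
    valid = ((gt_small < 0.2) | (gt_small > 0.8)).float()
    target = (gt_small > 0.5).float()
    loss = F.binary_cross_entropy_with_logits(aux_logits, target, reduction="none")
    return (loss * valid).sum() / (valid.sum() + 1e-6)
```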

* 13 pages, 8 figures. Submitted to IEEE Transactions on Neural Networks and Learning Systems 

DeepCL: Deep Change Feature Learning on Remote Sensing Images in the Metric Space

Jul 23, 2023
Haonan Guo, Bo Du, Chen Wu, Chengxi Han, Liangpei Zhang

Change detection (CD) is an important yet challenging task in the Earth observation field for monitoring Earth surface dynamics. The advent of deep learning techniques has recently propelled automatic CD into a technological revolution. Nevertheless, deep learning-based CD methods are still plagued by two primary issues: 1) insufficient temporal relationship modeling and 2) pseudo-change misclassification. To address these issues, we complement the strong temporal modeling ability of metric learning with the prominent fitting ability of segmentation and propose a deep change feature learning (DeepCL) framework for robust and explainable CD. First, we design a hard sample-aware contrastive loss that reweights the importance of hard and simple samples. This loss allows for explicit modeling of the temporal correlation between bi-temporal remote sensing images. Furthermore, the modeled temporal relations are utilized as prior knowledge to guide the segmentation process for detecting change regions. The DeepCL framework is thoroughly evaluated both theoretically and experimentally, demonstrating its superior feature discriminability, resilience against pseudo changes, and adaptability to a variety of CD algorithms. Extensive comparative experiments substantiate the quantitative and qualitative superiority of DeepCL over state-of-the-art CD approaches.
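
A hard sample-aware contrastive loss can be sketched by attaching a focal-style weight to the usual contrastive CD objective: unchanged pixels with large feature distance and changed pixels with small distance count as hard and are up-weighted. The weighting scheme below (`gamma`, `margin`) is illustrative, not DeepCL's exact formulation.

```python
import torch
import torch.nn.functional as F

def hard_aware_contrastive(feat_t1, feat_t2, change_mask, margin=2.0, gamma=2.0):
    """Contrastive CD loss with focal-style reweighting of hard pixels.

    feat_t1, feat_t2: (B, C, H, W) bi-temporal features.
    change_mask:      (B, H, W) binary ground truth (1 = changed).
    """
    dist = torch.norm(feat_t1 - feat_t2, p=2, dim=1)  # (B, H, W)
    mask = change_mask.float()
    # Unchanged pixels are hard when their feature distance is large.
    w_un = (dist / margin).clamp(0, 1).pow(gamma)
    loss_un = (1 - mask) * (1 + w_un) * dist.pow(2)
    # Changed pixels are hard when their feature distance is small.
    resid = F.relu(margin - dist)
    w_ch = (resid / margin).pow(gamma)
    loss_ch = mask * (1 + w_ch) * resid.pow(2)
    return (loss_un + loss_ch).mean()
```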

* 12 pages, 7 figures, submitted to IEEE Transactions on Image Processing 

Tiny-PPG: A Lightweight Deep Neural Network for Real-Time Detection of Motion Artifacts in Photoplethysmogram Signals on Edge Devices

May 05, 2023
Chen Wu, Peizheng Cai, Zhiqiang Zhong, Yali Zheng

Photoplethysmogram (PPG) signals are easily contaminated by motion artifacts in real-world settings, despite their widespread use in Internet-of-Things (IoT) based wearable and smart health devices for cardiovascular health monitoring. This study proposes a lightweight deep neural network, called Tiny-PPG, for accurate and real-time PPG artifact segmentation on IoT edge devices. The model was trained and tested on a public dataset, PPG DaLiA, which features complex artifacts of diverse lengths and morphologies recorded during various daily activities of 15 subjects wearing a watch-type device (Empatica E4). The model structure, training method and loss function were specifically designed to balance detection accuracy and speed for real-time PPG artifact detection on resource-constrained embedded devices. To optimize the model size and its capability for multi-scale feature representation, the model employs depthwise separable convolution and atrous spatial pyramid pooling modules, respectively. Additionally, a contrastive loss was utilized to further optimize the feature embeddings. With additional model pruning, Tiny-PPG achieved state-of-the-art detection accuracy of 87.8% with only 19,726 model parameters (0.15 megabytes), and was successfully deployed on an STM32 embedded system for real-time PPG artifact detection. This study thus provides an effective solution for PPG artifact detection in resource-constrained IoT smart health devices.
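
The two building blocks named in the abstract, depthwise separable convolution and atrous spatial pyramid pooling, translate to 1-D PPG signals roughly as below. Channel counts and dilation rates are assumptions; Tiny-PPG's exact configuration is in the paper.

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv1d(nn.Module):
    """Depthwise then pointwise convolution: far fewer parameters than a
    full Conv1d, which is how separable convolutions shrink the model."""
    def __init__(self, in_ch, out_ch, k=3, dilation=1):
        super().__init__()
        pad = dilation * (k - 1) // 2
        self.depthwise = nn.Conv1d(in_ch, in_ch, k, padding=pad,
                                   dilation=dilation, groups=in_ch)
        self.pointwise = nn.Conv1d(in_ch, out_ch, 1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

class ASPP1d(nn.Module):
    """1-D atrous spatial pyramid pooling: parallel dilated branches gather
    multi-scale context; the dilation rates here are assumptions."""
    def __init__(self, ch, rates=(1, 4, 8)):
        super().__init__()
        self.branches = nn.ModuleList(
            DepthwiseSeparableConv1d(ch, ch, dilation=r) for r in rates)
        self.project = nn.Conv1d(ch * len(rates), ch, 1)

    def forward(self, x):  # x: (B, C, L)
        return self.project(torch.cat([b(x) for b in self.branches], dim=1))
```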

GlobalMind: Global Multi-head Interactive Self-attention Network for Hyperspectral Change Detection

Apr 18, 2023
Meiqi Hu, Chen Wu, Liangpei Zhang

High spectral resolution imagery of the Earth's surface enables users to monitor changes over time at a fine-grained scale, playing an increasingly important role in agriculture, defense, and emergency response. However, most current algorithms are still confined to describing local features and fail to incorporate a global perspective, which limits their ability to capture interactions between global features and thus usually results in incomplete change regions. In this paper, we propose a Global Multi-head INteractive self-attention change Detection network (GlobalMind) to explore the implicit correlations between different surface objects and variant land-cover transformations, acquiring a comprehensive understanding of the data and accurate change detection results. First, a simple but effective Global Axial Segmentation (GAS) strategy is designed to expand the self-attention computation along the row space or column space of hyperspectral images, allowing global connection with high efficiency. Second, with GAS, a global spatial multi-head interactive self-attention (Global-M) module is crafted to mine the abundant spatial-spectral features involving potential correlations between ground objects across the entire rich and complex hyperspectral space. Moreover, to acquire accurate and complete cross-temporal changes, we devise a global temporal interactive multi-head self-attention (GlobalD) module that incorporates the relevance and variation of bi-temporal spatial-spectral features, deriving potential changes of the same kind over both local and global ranges in combination with GAS. We perform extensive experiments on five widely used hyperspectral datasets, and our method outperforms state-of-the-art algorithms in both accuracy and efficiency.
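
The Global Axial Segmentation strategy restricts self-attention to rows or columns, so each pixel can still reach the whole image through successive axial passes at far lower cost than full 2-D attention. The sketch below implements plain axial self-attention under that reading; GlobalMind's interactive bi-temporal multi-head design is not reproduced.

```python
import torch
import torch.nn as nn

class AxialSelfAttention(nn.Module):
    """Self-attention restricted to one spatial axis.

    Attention along rows or columns gives global connectivity at roughly
    O(HW(H + W)) cost instead of O((HW)^2) for full 2-D attention. The
    channel count must be divisible by the number of heads.
    """
    def __init__(self, ch, heads=4, axis="row"):
        super().__init__()
        self.axis = axis
        self.attn = nn.MultiheadAttention(ch, heads, batch_first=True)

    def forward(self, x):  # x: (B, C, H, W)
        b, c, h, w = x.shape
        if self.axis == "row":   # attend across W within each row
            seq = x.permute(0, 2, 3, 1).reshape(b * h, w, c)
        else:                    # attend across H within each column
            seq = x.permute(0, 3, 2, 1).reshape(b * w, h, c)
        out, _ = self.attn(seq, seq, seq)
        if self.axis == "row":
            out = out.reshape(b, h, w, c).permute(0, 3, 1, 2)
        else:
            out = out.reshape(b, w, h, c).permute(0, 3, 2, 1)
        return out
```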

* 14 pages, 18 figures 