Xiao Xiang Zhu

Technical University of Munich

Deep Semantic Model Fusion for Ancient Agricultural Terrace Detection

Aug 04, 2023

Yi Wang, Chenying Liu, Arti Tiwari, Micha Silver, Arnon Karnieli, Xiao Xiang Zhu, Conrad M Albrecht

Abstract:Discovering ancient agricultural terraces in desert regions is important for the monitoring of long-term climate changes on the Earth's surface. However, traditional ground surveys are both costly and limited in scale. With the increasing accessibility of aerial and satellite data, machine learning techniques bear large potential for the automatic detection and recognition of archaeological landscapes. In this paper, we propose a deep semantic model fusion method for ancient agricultural terrace detection. The input data includes aerial images and LiDAR-generated terrain features in the Negev desert. Two deep semantic segmentation models, namely DeepLabv3+ and UNet, with EfficientNet backbone, are trained and fused to provide segmentation maps of ancient terraces and walls. The proposed method won the first prize in the International AI Archaeology Challenge. Code is available at https://github.com/wangyi111/international-archaeology-ai-challenge.

* IEEE Big Data 2022 workshop on Digital Twins for Accelerated Discovery of Climate & Sustainability Solutions (ADoCS)
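
The abstract above says that two segmentation models are trained and fused, but it does not state the fusion rule. As a minimal sketch, assuming the fusion is a weighted average of per-pixel probabilities followed by a threshold (an assumption for illustration, not the authors' confirmed method), the idea could look like this. The names `fuse_probability_maps`, `to_mask`, and the toy probability values are hypothetical.

```python
from typing import List

# Per-pixel probability of the "terrace" class, stored as rows x cols.
ProbMap = List[List[float]]


def fuse_probability_maps(maps: List[ProbMap], weights: List[float]) -> ProbMap:
    """Weighted average of per-pixel probabilities from several models.

    Assumption: late fusion by averaging; the paper may use another rule.
    """
    if not maps or len(maps) != len(weights):
        raise ValueError("need one weight per probability map")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    rows, cols = len(maps[0]), len(maps[0][0])
    for m in maps:
        if len(m) != rows or any(len(row) != cols for row in m):
            raise ValueError("all maps must share the same shape")
    return [
        [sum(w * m[r][c] for m, w in zip(maps, weights)) / total
         for c in range(cols)]
        for r in range(rows)
    ]


def to_mask(prob_map: ProbMap, threshold: float = 0.5) -> List[List[int]]:
    """Binarize a fused probability map (1 = terrace, 0 = background)."""
    return [[1 if p >= threshold else 0 for p in row] for row in prob_map]


# Toy 2x2 outputs standing in for DeepLabv3+ and UNet predictions.
deeplab_probs = [[0.9, 0.2], [0.6, 0.1]]
unet_probs = [[0.7, 0.4], [0.2, 0.3]]

fused = fuse_probability_maps([deeplab_probs, unet_probs], [1.0, 1.0])
mask = to_mask(fused)
```

With equal weights, only the top-left pixel (average probability about 0.8) passes the 0.5 threshold. Unequal weights would let the stronger model dominate where the two disagree.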

UCDFormer: Unsupervised Change Detection Using a Transformer-driven Image Translation

Aug 02, 2023

Qingsong Xu, Yilei Shi, Jianhua Guo, Chaojun Ouyang, Xiao Xiang Zhu

Abstract:Change detection (CD) by comparing two bi-temporal images is a crucial task in remote sensing. Because it requires no cumbersome labeled change information, unsupervised CD has attracted extensive attention in the community. However, existing unsupervised CD approaches rarely consider the seasonal and style differences incurred by the illumination and atmospheric conditions in multi-temporal images. To this end, we propose a change detection setting with domain shift for remote sensing images. Furthermore, we present a novel unsupervised CD method using a light-weight transformer, called UCDFormer. Specifically, a transformer-driven image translation composed of a light-weight transformer and a domain-specific affinity weight is first proposed to mitigate domain shift between two images with real-time efficiency. After image translation, we can generate the difference map between the translated before-event image and the original after-event image. Then, a novel reliable pixel extraction module is proposed to select significantly changed/unchanged pixel positions by fusing the pseudo change maps of fuzzy c-means clustering and adaptive threshold. Finally, a binary change map is obtained based on these selected pixel pairs and a binary classifier. Experimental results on different unsupervised CD tasks with seasonal and style changes demonstrate the effectiveness of the proposed UCDFormer. For example, compared with several other related methods, UCDFormer improves performance on the Kappa coefficient by more than 12%. In addition, UCDFormer achieves excellent performance for earthquake-induced landslide detection when considering large-scale applications. The code is available at https://github.com/zhu-xlab/UCDFormer.

* 16 pages, 7 figures, IEEE Transactions on Geoscience and Remote Sensing
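
The abstract above reports its gain in terms of the Kappa coefficient. As a minimal sketch of how Cohen's kappa is commonly computed for a binary change map against a reference map, the function below uses the standard definition (observed agreement corrected for chance agreement). The name `cohen_kappa` and the toy pixel labels are hypothetical, and the paper's exact evaluation protocol is not given in the abstract.

```python
from typing import Sequence


def cohen_kappa(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Cohen's kappa for binary labels (1 = changed, 0 = unchanged)."""
    if len(pred) != len(truth) or len(pred) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(pred)
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    tn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 0)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)

    observed = (tp + tn) / n
    # Chance agreement from the marginal label frequencies of both maps.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    if expected == 1.0:
        # Both maps hold one identical constant label; kappa is undefined,
        # so this sketch returns 1.0 by convention (an assumption).
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Flattened toy change maps of eight pixels each.
predicted = [1, 1, 0, 0, 1, 0, 0, 0]
reference = [1, 0, 0, 0, 1, 1, 0, 0]
kappa = cohen_kappa(predicted, reference)
```

Here six of eight pixels agree (observed agreement 0.75) while chance agreement is 34/64, which gives a kappa of 7/15, about 0.47. Kappa is often preferred over plain accuracy for change detection because unchanged pixels usually dominate the image.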

PolyGNN: Polyhedron-based Graph Neural Network for 3D Building Reconstruction from Point Clouds

Jul 17, 2023

Zhaiyu Chen, Yilei Shi, Liangliang Nan, Zhitong Xiong, Xiao Xiang Zhu

Abstract:We present PolyGNN, a polyhedron-based graph neural network for 3D building reconstruction from point clouds. PolyGNN learns to assemble primitives obtained by polyhedral decomposition via graph node classification, achieving a watertight, compact, and weakly semantic reconstruction. To effectively represent arbitrary-shaped polyhedra in the neural network, we propose three different sampling strategies to select representative points as polyhedron-wise queries, enabling efficient occupancy inference. Furthermore, we incorporate the inter-polyhedron adjacency to enhance the classification of the graph nodes. We also observe that existing city-building models are abstractions of the underlying instances. To address this abstraction gap and provide a fair evaluation of the proposed method, we develop our method on a large-scale synthetic dataset covering 500k+ buildings with well-defined ground truths of polyhedral class labels. We further conduct a transferability analysis across cities and on real-world point clouds. Both qualitative and quantitative results demonstrate the effectiveness of our method, particularly its efficiency for large-scale reconstructions. The source code and data of our work are available at https://github.com/chenzhaiyu/polygnn.

A Deep Active Contour Model for Delineating Glacier Calving Fronts

Jul 07, 2023

Konrad Heidler, Lichao Mou, Erik Loebel, Mirko Scheinert, Sébastien Lefèvre, Xiao Xiang Zhu

Abstract:Choosing how to encode a real-world problem as a machine learning task is an important design decision in machine learning. The task of glacier calving front modeling has often been approached as a semantic segmentation task. Recent studies have shown that combining segmentation with edge detection can improve the accuracy of calving front detectors. Building on this observation, we completely rephrase the task as a contour tracing problem and propose a model for explicit contour detection that does not incorporate any dense predictions as intermediate steps. The proposed approach, called "Charting Outlines by Recurrent Adaptation" (COBRA), combines Convolutional Neural Networks (CNNs) for feature extraction and active contour models for the delineation. By training and evaluating on several large-scale datasets of Greenland's outlet glaciers, we show that this approach indeed outperforms the aforementioned methods based on segmentation and edge-detection. Finally, we demonstrate that explicit contour detection has benefits over pixel-wise methods when quantifying the models' prediction uncertainties. The project page containing the code and animated model predictions can be found at https://khdlr.github.io/COBRA/.

* This work has been accepted by IEEE TGRS for publication. Copyright may be transferred without notice, after which this version may no longer be accessible

Semi-Supervised Learning for hyperspectral images by non parametrically predicting view assignment

Jun 19, 2023

Shivam Pande, Nassim Ait Ali Braham, Yi Wang, Conrad M Albrecht, Biplab Banerjee, Xiao Xiang Zhu

Abstract:Hyperspectral image (HSI) classification is gaining momentum because of the rich spectral information inherent in the images. However, these images suffer from the curse of dimensionality and usually require a large number of samples for tasks such as classification, especially in the supervised setting. Recently, to train deep learning models effectively with minimal labelled samples, unlabeled samples are also being leveraged in self-supervised and semi-supervised settings. In this work, we leverage the idea of semi-supervised learning to assist the discriminative self-supervised pretraining of the models. The proposed method takes different augmented views of the unlabeled samples as input and assigns them the same pseudo-label corresponding to the labelled sample from the downstream task. We train our model on two HSI datasets, namely the Houston dataset (from the 2013 data fusion contest) and the Pavia University dataset, and show that the proposed approach performs better than the self-supervised approach and supervised training.

* The paper was submitted in IGARSS, 2023 conference and is not accepted to appear in the proceedings. The page requirement is 4 pages, including references

DisasterNets: Embedding Machine Learning in Disaster Mapping

Jun 16, 2023

Qingsong Xu, Yilei Shi, Xiao Xiang Zhu

Abstract:Disaster mapping is a critical task that often requires on-site experts and is time-consuming. To address this, a comprehensive framework is presented for fast and accurate recognition of disasters using machine learning, termed DisasterNets. It consists of two stages, space granulation and attribute granulation. The space granulation stage leverages supervised/semi-supervised learning, unsupervised change detection, and domain adaptation with/without source data techniques to handle different disaster mapping scenarios. Furthermore, the disaster database with the corresponding geographic information field properties is built by using the attribute granulation stage. The framework is applied to earthquake-triggered landslide mapping and large-scale flood mapping. The results demonstrate a competitive performance for high-precision, high-efficiency, and cross-scene recognition of disasters. To bridge the gap between disaster mapping and machine learning communities, we will provide an openly accessible tool based on DisasterNets. The framework and tool will be available at https://github.com/HydroPML/DisasterNets.

* 4 pages, IEEE IGARSS 2023

RRSIS: Referring Remote Sensing Image Segmentation

Jun 14, 2023

Zhenghang Yuan, Lichao Mou, Yuansheng Hua, Xiao Xiang Zhu

Abstract:Localizing desired objects from remote sensing images is of great use in practical applications. Referring image segmentation, which aims at segmenting out the objects to which a given expression refers, has been extensively studied in natural images. However, almost no research attention has been given to this task for remote sensing imagery. Considering its potential for real-world applications, in this paper, we introduce referring remote sensing image segmentation (RRSIS) to fill in this gap and make some insightful explorations. Specifically, we create a new dataset, called RefSegRS, for this task, enabling us to evaluate different methods. Afterward, we benchmark referring image segmentation methods of natural images on the RefSegRS dataset and find that these models show limited efficacy in detecting small and scattered objects. To alleviate this issue, we propose a language-guided cross-scale enhancement (LGCE) module that utilizes linguistic features to adaptively enhance multi-scale visual features by integrating both deep and shallow features. The proposed dataset, benchmarking results, and the designed LGCE module provide insights into the design of a better RRSIS model. We will make our dataset and code publicly available.

GEO-Bench: Toward Foundation Models for Earth Monitoring

Jun 06, 2023

Alexandre Lacoste, Nils Lehmann, Pau Rodriguez, Evan David Sherwin, Hannah Kerner, Björn Lütjens, Jeremy Andrew Irvin, David Dao, Hamed Alemohammad, Alexandre Drouin (+7 more)

Abstract:Recent progress in self-supervision has shown that pre-training large neural networks on vast amounts of unsupervised data can lead to substantial increases in generalization to downstream tasks. Such models, recently coined foundation models, have been transformational to the field of natural language processing. Variants have also been proposed for image data, but their applicability to remote sensing tasks is limited. To stimulate the development of foundation models for Earth monitoring, we propose a benchmark comprised of six classification and six segmentation tasks, which were carefully curated and adapted to be both relevant to the field and well-suited for model evaluation. We accompany this benchmark with a robust methodology for evaluating models and reporting aggregated results to enable a reliable assessment of progress. Finally, we report results for 20 baselines to gain information about the performance of existing models. We believe that this benchmark will be a driver of progress across a variety of Earth monitoring tasks.

* arXiv admin note: text overlap with arXiv:2112.00570

RSSOD-Bench: A large-scale benchmark dataset for Salient Object Detection in Optical Remote Sensing Imagery

Jun 04, 2023

Zhitong Xiong, Yanfeng Liu, Qi Wang, Xiao Xiang Zhu

Abstract:We present the RSSOD-Bench dataset for salient object detection (SOD) in optical remote sensing imagery. While SOD has achieved success in natural scene images with deep learning, research in SOD for remote sensing imagery (RSSOD) is still in its early stages. Existing RSSOD datasets have limitations in terms of scale and scene categories, which make them misaligned with real-world applications. To address these shortcomings, we construct the RSSOD-Bench dataset, which contains images from four different cities in the USA. The dataset provides annotations for various salient object categories, such as buildings, lakes, rivers, highways, bridges, aircraft, ships, athletic fields, and more. The salient objects in RSSOD-Bench exhibit large-scale variations, cluttered backgrounds, and different seasons. Unlike existing datasets, RSSOD-Bench offers uniform distribution across scene categories. We benchmark 23 different state-of-the-art approaches from both the computer vision and remote sensing communities. Experimental results demonstrate that more research efforts are required for the RSSOD task.

* IGARSS 2023, 4 pages

Overcoming Language Bias in Remote Sensing Visual Question Answering via Adversarial Training

Jun 01, 2023

Zhenghang Yuan, Lichao Mou, Xiao Xiang Zhu

Abstract:The Visual Question Answering (VQA) system offers a user-friendly interface and enables human-computer interaction. However, VQA models commonly face the challenge of language bias, resulting from the learned superficial correlation between questions and answers. To address this issue, in this study, we present a novel framework to reduce the language bias of the VQA for remote sensing data (RSVQA). Specifically, we add an adversarial branch to the original VQA framework. Based on the adversarial branch, we introduce two regularizers to constrain the training process against language bias. Furthermore, to evaluate the performance in terms of language bias, we propose a new metric that combines standard accuracy with the performance drop when incorporating question and random image information. Experimental results demonstrate the effectiveness of our method. We believe that our method can shed light on future work for reducing language bias on the RSVQA task.
