Gui-Song Xia

Event-based Synthetic Aperture Imaging with a Hybrid Network

Mar 30, 2021

Xiang Zhang, Wei Liao, Lei Yu, Wen Yang, Gui-Song Xia

Abstract: Synthetic aperture imaging (SAI) achieves a see-through effect by blurring out off-focus foreground occlusions and reconstructing the in-focus occluded targets from multi-view images. However, very dense occlusions and extreme lighting conditions can significantly disturb SAI based on conventional frame-based cameras, leading to performance degradation. To address these problems, we propose a novel SAI system based on the event camera, which produces asynchronous events with extremely low latency and high dynamic range. It can thus eliminate the interference of dense occlusions by measuring with almost continuous views, and simultaneously tackle over- and under-exposure problems. To reconstruct the occluded targets, we propose a hybrid encoder-decoder network composed of spiking neural networks (SNNs) and convolutional neural networks (CNNs). In the hybrid network, the spatio-temporal information of the collected events is first encoded by SNN layers, and then transformed to the visual image of the occluded targets by a style-transfer CNN decoder. In experiments, the proposed method shows remarkable performance in dealing with very dense occlusions and extreme lighting conditions, and high-quality visual images can be reconstructed using pure event data.

Deep Graph Matching under Quadratic Constraint

Mar 14, 2021

Quankai Gao, Fudong Wang, Nan Xue, Jin-Gang Yu, Gui-Song Xia

Abstract: Recently, deep learning based methods have demonstrated promising results on the graph matching problem by relying on the descriptive capability of deep features extracted on graph nodes. However, one main limitation of existing deep graph matching (DGM) methods lies in their neglect of explicit graph-structure constraints, which may lead the model to be trapped in local minima during training. In this paper, we propose to explicitly formulate pairwise graph structures as a quadratic constraint incorporated into the DGM framework. The quadratic constraint minimizes the pairwise structural discrepancy between graphs, which can reduce the ambiguities brought by only using the extracted CNN features. Moreover, we present a differentiable implementation of the quadratically constrained optimization such that it is compatible with the unconstrained deep learning optimizer. To give more precise and proper supervision, a well-designed false matching loss against class imbalance is proposed, which can better penalize false negatives and false positives with less overfitting. Exhaustive experiments demonstrate that our method achieves competitive performance on real-world datasets.
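
To make the idea of a "pairwise structural discrepancy" concrete, here is a minimal sketch. It assumes the common Koopmans-Beckmann form, the squared Frobenius norm of A1 - X A2 X^T, where A1 and A2 are adjacency matrices and X is an assignment matrix. This form, the function name, and the toy graphs are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def structural_discrepancy(A1, A2, X):
    """Squared Frobenius norm of A1 - X A2 X^T (assumed form, not the paper's exact loss)."""
    residual = A1 - X @ A2 @ X.T
    return float(np.sum(residual ** 2))

# Toy example: graph 2 is graph 1 with its nodes relabeled by a permutation.
A1 = np.array([[0, 1, 0],
               [1, 0, 1],
               [0, 1, 0]], dtype=float)
perm = np.array([2, 0, 1])   # node i of graph 1 corresponds to node perm[i] of graph 2
P = np.eye(3)[perm]          # permutation matrix with P[i, perm[i]] = 1
A2 = P.T @ A1 @ P            # adjacency matrix of the relabeled graph

d_correct = structural_discrepancy(A1, A2, P)           # true correspondence
d_identity = structural_discrepancy(A1, A2, np.eye(3))  # wrong correspondence
```

The true correspondence yields zero discrepancy while the identity assignment does not, which is the kind of structural signal that node features alone do not provide.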

ReDet: A Rotation-equivariant Detector for Aerial Object Detection

Mar 13, 2021

Jiaming Han, Jian Ding, Nan Xue, Gui-Song Xia

Abstract: Recently, object detection in aerial images has gained much attention in computer vision. Unlike objects in natural images, aerial objects are often distributed with arbitrary orientation. Therefore, the detector requires more parameters to encode the orientation information, which are often highly redundant and inefficient. Moreover, as ordinary CNNs do not explicitly model the orientation variation, large amounts of rotation-augmented data are needed to train an accurate object detector. In this paper, we propose a Rotation-equivariant Detector (ReDet) to address these issues, which explicitly encodes rotation equivariance and rotation invariance. More precisely, we incorporate rotation-equivariant networks into the detector to extract rotation-equivariant features, which can accurately predict the orientation and lead to a huge reduction of model size. Based on the rotation-equivariant features, we also present Rotation-invariant RoI Align (RiRoI Align), which adaptively extracts rotation-invariant features from equivariant features according to the orientation of the RoI. Extensive experiments on several challenging aerial image datasets, DOTA-v1.0, DOTA-v1.5 and HRSC2016, show that our method can achieve state-of-the-art performance on the task of aerial object detection. Compared with previous best results, our ReDet gains 1.2, 3.5 and 2.6 mAP on DOTA-v1.0, DOTA-v1.5 and HRSC2016 respectively while reducing the number of parameters by 60% (313 Mb vs. 121 Mb). The code is available at: https://github.com/csuhan/ReDet.

* Accepted by CVPR2021
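
As a small illustration of what rotation equivariance means, the sketch below checks the property f(rotate(x)) = rotate(f(x)) for a hand-written 4-neighbor Laplacian filter under 90-degree rotations. This is only a toy demonstration of the definition, assuming zero padding and a filter that is itself symmetric under 90-degree rotation. It is not ReDet's network or its RiRoI Align operator.

```python
import numpy as np

def laplacian(x):
    """4-neighbor Laplacian with zero padding; the kernel is symmetric under 90-degree rotation."""
    p = np.pad(x, 1)  # zero-pad one pixel on every side
    return p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:] - 4 * x

rng = np.random.default_rng(0)
x = rng.normal(size=(6, 4))

# Equivariance: filtering the rotated input equals rotating the filtered output.
lhs = laplacian(np.rot90(x))
rhs = np.rot90(laplacian(x))
is_equivariant = bool(np.allclose(lhs, rhs))
```

An ordinary learned kernel has no such symmetry, so the same check would generally fail for it, which is why equivariance has to be built into the architecture or approximated with rotation-augmented data.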

Unsupervised Pretraining for Object Detection by Patch Reidentification

Mar 08, 2021

Jian Ding, Enze Xie, Hang Xu, Chenhan Jiang, Zhenguo Li, Ping Luo, Gui-Song Xia

Abstract: Unsupervised representation learning achieves promising performance in pre-training representations for object detectors. However, previous approaches are mainly designed for image-level classification, leading to suboptimal detection performance. To bridge the performance gap, this work proposes a simple yet effective representation learning method for object detection, named patch re-identification (Re-ID), which can be treated as a contrastive pretext task to learn location-discriminative representations in an unsupervised manner, possessing appealing advantages compared to its counterparts. Firstly, unlike fully-supervised person Re-ID that matches a human identity in different camera views, patch Re-ID treats an important patch as a pseudo identity and contrastively learns its correspondence in two different image views, where the pseudo identity has different translations and transformations, enabling the model to learn discriminative features for object detection. Secondly, patch Re-ID is performed in a Deeply Unsupervised manner to learn multi-level representations, which is appealing for object detection. Thirdly, extensive experiments show that our method significantly outperforms its counterparts on COCO in all settings, such as different training iterations and data percentages. For example, Mask R-CNN initialized with our representation surpasses MoCo v2 and even its fully-supervised counterparts in all setups of training iterations (e.g. 2.1 and 1.1 mAP improvement compared to MoCo v2 in 12k and 90k iterations respectively). Code will be released at https://github.com/dingjiansw101/DUPR.

Object Detection in Aerial Images: A Large-Scale Benchmark and Challenges

Feb 24, 2021

Jian Ding, Nan Xue, Gui-Song Xia, Xiang Bai, Wen Yang, Michael Ying Yang, Serge Belongie, Jiebo Luo, Mihai Datcu, Marcello Pelillo (+1 more)

Abstract: In the past decade, object detection has achieved significant progress in natural images but not in aerial images, due to the massive variations in the scale and orientation of objects caused by the bird's-eye view of aerial images. More importantly, the lack of large-scale benchmarks has become a major obstacle to the development of object detection in aerial images (ODAI). In this paper, we present a large-scale Dataset of Object deTection in Aerial images (DOTA) and comprehensive baselines for ODAI. The proposed DOTA dataset contains 1,793,658 object instances of 18 categories with oriented-bounding-box annotations, collected from 11,268 aerial images. Based on this large-scale and well-annotated dataset, we build baselines covering 10 state-of-the-art algorithms with over 70 configurations, where the speed and accuracy of each model have been evaluated. Furthermore, we provide a uniform code library for ODAI and build a website for testing and evaluating different algorithms. Previous challenges run on DOTA have attracted more than 1300 teams worldwide. We believe that the expanded large-scale DOTA dataset, the extensive baselines, the code library and the challenges can facilitate the design of robust algorithms and reproducible research on the problem of object detection in aerial images.

* 15 pages, 9 figures

Bidirectional Multi-scale Attention Networks for Semantic Segmentation of Oblique UAV Imagery

Feb 05, 2021

Ye Lyu, George Vosselman, Gui-Song Xia, Michael Ying Yang

Abstract: Semantic segmentation for aerial platforms has been one of the fundamental scene understanding tasks for earth observation. Most semantic segmentation research has focused on scenes captured in nadir view, in which objects have relatively smaller scale variation compared with scenes captured in oblique view. The huge scale variation of objects in oblique images limits the performance of deep neural networks (DNNs) that process images in a single-scale fashion. To tackle the scale variation issue, we propose novel bidirectional multi-scale attention networks, which fuse features from multiple scales bidirectionally for more adaptive and effective feature extraction. The experiments are conducted on the UAVid2020 dataset and show the effectiveness of our method. Our model achieves a state-of-the-art (SOTA) result with a mean intersection over union (mIoU) score of 70.80%.
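
For readers unfamiliar with the metric, the following is a minimal sketch of how mean intersection over union (mIoU) is typically computed from per-pixel labels. The function name and the choice to skip classes absent from both ground truth and prediction are assumptions; the benchmark's official evaluation protocol may differ.

```python
import numpy as np

def mean_iou(y_true, y_pred, num_classes):
    """Mean IoU over classes that appear in the ground truth or the prediction."""
    y_true = np.asarray(y_true).ravel()
    y_pred = np.asarray(y_pred).ravel()
    # Confusion matrix: rows are ground-truth classes, columns are predicted classes.
    cm = np.bincount(num_classes * y_true + y_pred,
                     minlength=num_classes ** 2).reshape(num_classes, num_classes)
    intersection = np.diag(cm)
    union = cm.sum(axis=0) + cm.sum(axis=1) - intersection
    valid = union > 0  # skip classes absent from both arrays (an assumption)
    return float(np.mean(intersection[valid] / union[valid]))

# Tiny example with two classes and four pixels.
score = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
```

In the tiny example, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12.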

Unmixing Convolutional Features for Crisp Edge Detection

Nov 19, 2020

Linxi Huan, Xianwei Zheng, Nan Xue, Wei He, Jianya Gong, Gui-Song Xia

Abstract: This paper presents a context-aware tracing strategy (CATS) for crisp edge detection with deep edge detectors, based on the observation that the localization ambiguity of deep edge detectors is mainly caused by the mixing phenomenon of convolutional neural networks: feature mixing in edge classification and side mixing when fusing side predictions. CATS consists of two modules: a novel tracing loss that performs feature unmixing by tracing boundaries for better side edge learning, and a context-aware fusion block that tackles the side mixing by aggregating the complementary merits of learned side edges. Experiments demonstrate that the proposed CATS can be integrated into modern deep edge detectors to improve localization accuracy. With the vanilla VGG16 backbone on the BSDS500 dataset, our CATS improves the F-measure (ODS) of the RCF and BDCN deep edge detectors by 12% and 6% respectively when evaluated without the morphological non-maximal suppression scheme for edge detection.

Asymmetric Siamese Networks for Semantic Change Detection

Oct 12, 2020

Kunping Yang, Gui-Song Xia, Zicheng Liu, Bo Du, Wen Yang, Marcello Pelillo

Abstract: Given two multi-temporal aerial images, semantic change detection aims to locate the land-cover variations and identify their categories with pixel-wise boundaries. The problem has demonstrated promising potential in many earth-vision-related tasks, such as precise urban planning and natural resource management. Existing state-of-the-art algorithms mainly identify the changed pixels through symmetric modules, which suffer from categorical ambiguity caused by changes related to totally different land-cover distributions. In this paper, we present an asymmetric siamese network (ASN) to locate and identify semantic changes through feature pairs obtained from modules of widely different structures, which involve different spatial ranges and quantities of parameters to factor in the discrepancy across different land-cover distributions. To better train and evaluate our model, we create a large-scale well-annotated SEmantic Change detectiON Dataset (SECOND), while an adaptive threshold learning (ATL) module and a separated kappa (SeK) coefficient are proposed to alleviate the influence of label imbalance in model training and evaluation. The experimental results demonstrate that the proposed model can stably outperform the state-of-the-art algorithms with different encoder backbones.

Mixed Noise Removal with Pareto Prior

Aug 27, 2020

Zhou Liu, Lei Yu, Gui-Song Xia, Hong Sun

Abstract: Denoising images contaminated by a mixture of additive white Gaussian noise (AWGN) and impulse noise (IN) is an essential but challenging problem. The presence of impulsive disturbances inevitably affects the noise distribution and thus largely degrades the performance of traditional AWGN denoisers. Existing methods aim to compensate for the effects of IN by introducing a weighting matrix, which, however, lacks a proper prior and is thus hard to estimate accurately. To address this problem, we exploit the Pareto distribution as the prior of the weighting matrix, based on which an accurate and robust weight estimator is proposed for mixed noise removal. In particular, a relatively small portion of pixels are assumed to be contaminated with IN; these should have weights with small values and then be penalized out. This phenomenon can be properly described by the Pareto distribution of type 1. Therefore, armed with the Pareto distribution, we formulate the problem of mixed noise removal in the Bayesian framework, where a nonlocal self-similarity prior is further exploited by adopting nonlocal low-rank approximation. Compared to existing methods, the proposed method can estimate the weighting matrix adaptively, accurately, and robustly for different noise levels, and can thus boost the denoising performance. Experimental results on widely used image datasets demonstrate the superiority of our proposed method over the state of the art.
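
For reference, the standard density of the Pareto distribution of type 1 is given below. The symbols x_m (scale) and alpha (shape) are generic notation; the abstract does not state how the paper parameterizes the weights, so this is only the textbook form.

```latex
p(x \mid x_m, \alpha) = \frac{\alpha \, x_m^{\alpha}}{x^{\alpha + 1}},
\qquad x \ge x_m, \quad x_m > 0, \quad \alpha > 0 .
```

The density is largest at x = x_m and decays polynomially, so most of the probability mass sits near the scale parameter while a heavy tail remains.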

Align Deep Features for Oriented Object Detection

Aug 21, 2020

Jiaming Han, Jian Ding, Jie Li, Gui-Song Xia

Abstract: The past decade has witnessed significant progress in detecting objects in aerial images, which are often distributed with large scale variations and arbitrary orientations. However, most existing methods rely on heuristically defined anchors with different scales, angles and aspect ratios, and usually suffer from severe misalignment between anchor boxes and axis-aligned convolutional features, which leads to the common inconsistency between the classification score and localization accuracy. To address this issue, we propose a Single-shot Alignment Network (S²A-Net) consisting of two modules: a Feature Alignment Module (FAM) and an Oriented Detection Module (ODM). The FAM can generate high-quality anchors with an Anchor Refinement Network and adaptively align the convolutional features according to the anchor boxes with a novel Alignment Convolution. The ODM first adopts active rotating filters to encode the orientation information and then produces orientation-sensitive and orientation-invariant features to alleviate the inconsistency between classification score and localization accuracy. In addition, we further explore an approach to detecting objects in large-size images, which leads to a better trade-off between speed and accuracy. Extensive experiments demonstrate that our method can achieve state-of-the-art performance on two commonly used aerial object datasets (i.e., DOTA and HRSC2016) while keeping high efficiency. The code is available at https://github.com/csuhan/s2anet.
