Liping Hou

G-Rep: Gaussian Representation for Arbitrary-Oriented Object Detection

May 24, 2022
Liping Hou, Ke Lu, Xue Yang, Yuqiu Li, Jian Xue

Arbitrary-oriented object representations include the oriented bounding box (OBB), the quadrilateral bounding box (QBB), and the point set (PointSet). Each representation suffers from problems tied to its characteristics, such as boundary discontinuity, the square-like problem, representation ambiguity, and isolated points, which lead to inaccurate detection. Although many effective strategies have been proposed for the various representations, there is still no unified solution. Recent detection methods based on Gaussian modeling have shown that this dilemma can be broken; however, they remain limited to OBB. To go further, in this paper we propose a unified Gaussian representation, G-Rep, that constructs Gaussian distributions for OBB, QBB, and PointSet alike, achieving a single solution to the various representations and their problems. Specifically, PointSet- or QBB-based objects are converted into Gaussian distributions, whose parameters are optimized using a maximum likelihood estimation algorithm. Three optional Gaussian metrics are then explored to optimize the detector's regression loss, owing to their excellent parameter optimization mechanisms. Furthermore, the same Gaussian metrics are used for sampling, aligning label assignment with the regression loss. Experimental results on several publicly available datasets (DOTA, HRSC2016, UCAS-AOD, and ICDAR2015) demonstrate the excellent performance of the proposed method for arbitrary-oriented object detection. The code has been open-sourced at https://github.com/open-mmlab/mmrotate.
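
As a minimal sketch of the conversion step just described (not the authors' implementation; the helper name and example box are illustrative): the maximum likelihood estimate of a 2-D Gaussian's mean is the sample mean of the points, and of its covariance the sample covariance, so a QBB's four corners, or any PointSet, map directly to a pair (mu, Sigma).

```python
# Hypothetical sketch: fit a 2-D Gaussian to a point set by maximum likelihood.
import numpy as np

def points_to_gaussian(points: np.ndarray):
    """Fit a 2-D Gaussian (mu, Sigma) to an (N, 2) array of points by MLE."""
    mu = points.mean(axis=0)                     # MLE of the mean: sample mean
    centered = points - mu
    sigma = centered.T @ centered / len(points)  # MLE of the covariance (biased estimator)
    return mu, sigma

# Example: the four corners of a quadrilateral bounding box (QBB).
qbb = np.array([[0.0, 0.0], [4.0, 1.0], [5.0, 3.0], [1.0, 2.0]])
mu, sigma = points_to_gaussian(qbb)
print(mu)     # center of the distribution
print(sigma)  # jointly encodes the box's extent and orientation
```

A distance between two such Gaussians (e.g., a Kullback-Leibler or Wasserstein-type metric, as in prior Gaussian-modeling detectors) can then serve as the regression loss, which is what the three optional Gaussian metrics above refer to.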

* 14 pages, 6 figures, 8 tables, the code has been open sourced at https://github.com/open-mmlab/mmrotate 

MMRotate: A Rotated Object Detection Benchmark using PyTorch

Apr 28, 2022
Yue Zhou, Xue Yang, Gefan Zhang, Jiabao Wang, Yanyi Liu, Liping Hou, Xue Jiang, Xingzhao Liu, Junchi Yan, Chengqi Lyu, Wenwei Zhang, Kai Chen

We present an open-source toolbox, named MMRotate, which provides a coherent framework for training, inference, and evaluation of popular deep-learning-based rotated object detection algorithms. MMRotate implements 18 state-of-the-art algorithms and supports the three most frequently used angle definition methods. To facilitate future research on and industrial applications of rotated object detection, we also provide a large number of trained models and detailed benchmarks that give insight into the performance of rotated object detection. MMRotate is publicly released at https://github.com/open-mmlab/mmrotate.
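
As a hedged illustration of the angle definition issue the toolbox addresses (the helper below is hypothetical, not MMRotate's API): the same rotated box carries different (width, height, angle) parameters under OpenCV's definition and under a long-edge definition, so annotations and models are not interchangeable without conversion.

```python
# Hypothetical converter between two common rotated-box angle definitions:
# OpenCV-style (theta in [-90, 0)) and long-edge (w >= h, theta in [-90, 90)).

def oc_to_le90(w: float, h: float, theta: float):
    """Convert (w, h, theta) from an OpenCV-style definition to long-edge form."""
    if w < h:
        w, h = h, w          # make w the long edge...
        theta += 90.0        # ...and rotate the reference edge accordingly
    if theta >= 90.0:        # wrap back into [-90, 90)
        theta -= 180.0
    return w, h, theta

print(oc_to_le90(2.0, 5.0, -30.0))  # -> (5.0, 2.0, 60.0): same box, new parameters
```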

* 5 pages, 2 tables, MMRotate is publicly released at https://github.com/open-mmlab/mmrotate 

Dense Label Encoding for Boundary Discontinuity Free Rotation Detection

Nov 19, 2020
Xue Yang, Liping Hou, Yue Zhou, Wentao Wang, Junchi Yan

Rotation detection serves as a fundamental building block in many visual applications involving aerial images, scene text, faces, etc. Differing from the dominant regression-based approaches to orientation estimation, this paper explores a relatively less-studied methodology based on classification. The hope is to inherently sidestep the boundary discontinuity issue encountered by regression-based detectors. We propose new techniques to push this frontier in two respects: i) a new encoding mechanism: the design of two Densely Coded Labels (DCL) for angle classification, replacing the Sparsely Coded Label (SCL) in existing classification-based detectors, which yields a three-fold training speed increase as empirically observed across benchmarks, together with a notable improvement in detection accuracy; ii) loss re-weighting: we propose Angle Distance and Aspect Ratio Sensitive Weighting (ADARSW), which improves detection accuracy, especially for square-like objects, by making DCL-based detectors sensitive to angular distance and the object's aspect ratio. Extensive experiments and visual analysis on large-scale public aerial-image datasets, i.e., DOTA, UCAS-AOD, and HRSC2016, as well as the scene text datasets ICDAR2015 and MLT, show the effectiveness of our approach. The source code is available at https://github.com/Thinklab-SJTU/DCL_RetinaNet_Tensorflow and is also integrated into our open-source rotation detection benchmark: https://github.com/yangxue0827/RotationDetection.
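
As a minimal sketch of the dense-coding idea (the one-degree bin width, the [0, 180) angle range, and the 8-bit code length are illustrative assumptions, not the paper's exact configuration): with 180 angle bins, a sparse one-hot label needs 180 outputs, whereas a dense binary or gray code needs only ceil(log2(180)) = 8 bits, and the gray-code variant flips a single bit between adjacent bins, so small angular errors stay cheap.

```python
# Hypothetical dense gray-coded angle encoding in the spirit of DCL.
import math

OMEGA = 1.0                                 # angle granularity in degrees (assumed)
NUM_BINS = int(180 / OMEGA)                 # bins covering [0, 180)
NUM_BITS = math.ceil(math.log2(NUM_BINS))   # 8 bits suffice for 180 bins

def angle_to_dcl(theta: float) -> list:
    """Encode an angle in [0, 180) as a dense gray-coded bit vector."""
    idx = int(theta / OMEGA) % NUM_BINS
    gray = idx ^ (idx >> 1)                 # binary-reflected gray code
    return [(gray >> b) & 1 for b in reversed(range(NUM_BITS))]

def dcl_to_angle(bits: list) -> float:
    """Decode the gray-coded bit vector back to a bin-center angle."""
    gray = int("".join(map(str, bits)), 2)
    idx = 0
    while gray:                             # invert the gray code
        idx ^= gray
        gray >>= 1
    return (idx + 0.5) * OMEGA

bits = angle_to_dcl(67.3)
print(bits, dcl_to_angle(bits))             # 8-bit code; decodes to 67.5
```

A classification head can then predict one binary output per bit rather than one score per bin, shrinking the prediction layer, which is where a shorter code length can translate into faster training.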

* 12 pages, 6 figures, 8 tables 