Get our free extension to see links to code for papers anywhere online!

Chrome logo  Add to Chrome

Firefox logo Add to Firefox

"Object Detection": models, code, and papers

RRPN: Radar Region Proposal Network for Object Detection in Autonomous Vehicles

May 15, 2019
Ramin Nabati, Hairong Qi

Region proposal algorithms play an important role in most state-of-the-art two-stage object detection networks by hypothesizing object locations in the image. Nonetheless, region proposal algorithms are known to be the bottleneck in most two-stage object detection networks, increasing the processing time for each image and resulting in slow networks not suitable for real-time applications such as autonomous driving vehicles. In this paper we introduce RRPN, a Radar-based real-time region proposal algorithm for object detection in autonomous driving vehicles. RRPN generates object proposals by mapping Radar detections to the image coordinate system and generating pre-defined anchor boxes for each mapped Radar detection point. These anchor boxes are then transformed and scaled based on the object's distance from the vehicle, to provide more accurate proposals for the detected objects. We evaluate our method on the newly released NuScenes dataset [1] using the Fast R-CNN object detection network [2]. Compared to the Selective Search object proposal algorithm [3], our model operates more than 100x faster while at the same time achieves higher detection precision and recall. Code has been made publicly available at .

* To appear in ICIP 2019 
Access Paper or Ask Questions

Boosting Camouflaged Object Detection with Dual-Task Interactive Transformer

May 21, 2022
Zhengyi Liu, Zhili Zhang, Wei Wu

Camouflaged object detection intends to discover the concealed objects hidden in the surroundings. Existing methods follow the bio-inspired framework, which first locates the object and second refines the boundary. We argue that the discovery of camouflaged objects depends on the recurrent search for the object and the boundary. The recurrent processing makes the human tired and helpless, but it is just the advantage of the transformer with global search ability. Therefore, a dual-task interactive transformer is proposed to detect both accurate position of the camouflaged object and its detailed boundary. The boundary feature is considered as Query to improve the camouflaged object detection, and meanwhile the object feature is considered as Query to improve the boundary detection. The camouflaged object detection and the boundary detection are fully interacted by multi-head self-attention. Besides, to obtain the initial object feature and boundary feature, transformer-based backbones are adopted to extract the foreground and background. The foreground is just object, while foreground minus background is considered as boundary. Here, the boundary feature can be obtained from blurry boundary region of the foreground and background. Supervised by the object, the background and the boundary ground truth, the proposed model achieves state-of-the-art performance in public datasets.

* ICPR2022 
* Accepted by ICPR2022 
Access Paper or Ask Questions

Few-shot Object Detection on Remote Sensing Images

Jun 16, 2020
Jingyu Deng, Xiang Li, Yi Fang

In this paper, we deal with the problem of object detection on remote sensing images. Previous methods have developed numerous deep CNN-based methods for object detection on remote sensing images and the report remarkable achievements in detection performance and efficiency. However, current CNN-based methods mostly require a large number of annotated samples to train deep neural networks and tend to have limited generalization abilities for unseen object categories. In this paper, we introduce a few-shot learning-based method for object detection on remote sensing images where only a few annotated samples are provided for the unseen object categories. More specifically, our model contains three main components: a meta feature extractor that learns to extract feature representations from input images, a reweighting module that learn to adaptively assign different weights for each feature representation from the support images, and a bounding box prediction module that carries out object detection on the reweighted feature maps. We build our few-shot object detection model upon YOLOv3 architecture and develop a multi-scale object detection framework. Experiments on two benchmark datasets demonstrate that with only a few annotated samples our model can still achieve a satisfying detection performance on remote sensing images and the performance of our model is significantly better than the well-established baseline models.

* 12pages, 7 figures 
Access Paper or Ask Questions

Modular network for high accuracy object detection

Jan 24, 2020
Erez Yahalomi

We present a novel modular object detection convolutional neural network that significantly improves the accuracy of computer vision object detection. The network consists of two stages in a hierarchical structure. The first stage is a network that detects general classes. The second stage consists of separate networks to refine the classification and localization of each of the general classes objects. Compared to a state of the art object detection networks the classification error in the modular network is improved by approximately 3-5 times, from 12 percent to 2.5-4.5 percent. The modular network achieved a very high score in object detection of 0.94 mAP. The network is easy to implement, it can be a platform to improve the accuracy of widespread state of the art object detection networks and other kinds of deep learning networks.

* 8 pages 
Access Paper or Ask Questions

Broad Area Search and Detection of Surface-to-Air Missile Sites Using Spatial Fusion of Component Object Detections from Deep Neural Networks

Mar 23, 2020
Alan B. Cannaday II, Curt H. Davis, Grant J. Scott, Blake Ruprecht, Derek T. Anderson

Here we demonstrate how Deep Neural Network (DNN) detections of multiple constitutive or component objects that are part of a larger, more complex, and encompassing feature can be spatially fused to improve the search, detection, and retrieval (ranking) of the larger complex feature. First, scores computed from a spatial clustering algorithm are normalized to a reference space so that they are independent of image resolution and DNN input chip size. Then, multi-scale DNN detections from various component objects are fused to improve the detection and retrieval of DNN detections of a larger complex feature. We demonstrate the utility of this approach for broad area search and detection of Surface-to-Air Missile (SAM) sites that have a very low occurrence rate (only 16 sites) over a ~90,000 km^2 study area in SE China. The results demonstrate that spatial fusion of multi-scale component-object DNN detections can reduce the detection error rate of SAM Sites by $>$85% while still maintaining a 100% recall. The novel spatial fusion approach demonstrated here can be easily extended to a wide variety of other challenging object search and detection problems in large-scale remote sensing image datasets.

* 9 pages, 9 figures, 9 tables, pre-published expansion of IGARSS2019 conference paper "Improved Search and Detection of Surface-to-Air Missile Sites Using Spatial Fusion of Component Object Detections from Deep Neural Networks" 
Access Paper or Ask Questions

Graph Fusion Network for Multi-Oriented Object Detection

May 07, 2022
Shi-Xue Zhang, Xiaobin Zhu, Jie-Bo Hou, Xu-Cheng Yin

In object detection, non-maximum suppression (NMS) methods are extensively adopted to remove horizontal duplicates of detected dense boxes for generating final object instances. However, due to the degraded quality of dense detection boxes and not explicit exploration of the context information, existing NMS methods via simple intersection-over-union (IoU) metrics tend to underperform on multi-oriented and long-size objects detection. Distinguishing with general NMS methods via duplicate removal, we propose a novel graph fusion network, named GFNet, for multi-oriented object detection. Our GFNet is extensible and adaptively fuse dense detection boxes to detect more accurate and holistic multi-oriented object instances. Specifically, we first adopt a locality-aware clustering algorithm to group dense detection boxes into different clusters. We will construct an instance sub-graph for the detection boxes belonging to one cluster. Then, we propose a graph-based fusion network via Graph Convolutional Network (GCN) to learn to reason and fuse the detection boxes for generating final instance boxes. Extensive experiments both on public available multi-oriented text datasets (including MSRA-TD500, ICDAR2015, ICDAR2017-MLT) and multi-oriented object datasets (DOTA) verify the effectiveness and robustness of our method against general NMS methods in multi-oriented object detection.

* Accepted by Applied Intelligence (APIN 2022) 
Access Paper or Ask Questions

A Versatile Multi-View Framework for LiDAR-based 3D Object Detection with Guidance from Panoptic Segmentation

Mar 04, 2022
Hamidreza Fazlali, Yixuan Xu, Yuan Ren, Bingbing Liu

3D object detection using LiDAR data is an indispensable component for autonomous driving systems. Yet, only a few LiDAR-based 3D object detection methods leverage segmentation information to further guide the detection process. In this paper, we propose a novel multi-task framework that jointly performs 3D object detection and panoptic segmentation. In our method, the 3D object detection backbone in Bird's-Eye-View (BEV) plane is augmented by the injection of Range-View (RV) feature maps from the 3D panoptic segmentation backbone. This enables the detection backbone to leverage multi-view information to address the shortcomings of each projection view. Furthermore, foreground semantic information is incorporated to ease the detection task by highlighting the locations of each object class in the feature maps. Finally, a new center density heatmap generated based on the instance-level information further guides the detection backbone by suggesting possible box center locations for objects. Our method works with any BEV-based 3D object detection method, and as shown by extensive experiments on the nuScenes dataset, it provides significant performance gains. Notably, the proposed method based on a single-stage CenterPoint 3D object detection network achieved state-of-the-art performance on nuScenes 3D Detection Benchmark with 67.3 NDS.

* Accepted to CVPR 2022 
Access Paper or Ask Questions

Hyperspectral Image Dataset for Benchmarking on Salient Object Detection

Jul 02, 2018
Nevrez Imamoglu, Yu Oishi, Xiaoqiang Zhang, Guanqun Ding, Yuming Fang, Toru Kouyama, Ryosuke Nakamura

Many works have been done on salient object detection using supervised or unsupervised approaches on colour images. Recently, a few studies demonstrated that efficient salient object detection can also be implemented by using spectral features in visible spectrum of hyperspectral images from natural scenes. However, these models on hyperspectral salient object detection were tested with a very few number of data selected from various online public dataset, which are not specifically created for object detection purposes. Therefore, here, we aim to contribute to the field by releasing a hyperspectral salient object detection dataset with a collection of 60 hyperspectral images with their respective ground-truth binary images and representative rendered colour images (sRGB). We took several aspects in consideration during the data collection such as variation in object size, number of objects, foreground-background contrast, object position on the image, and etc. Then, we prepared ground truth binary images for each hyperspectral data, where salient objects are labelled on the images. Finally, we did performance evaluation using Area Under Curve (AUC) metric on some existing hyperspectral saliency detection models in literature.

* 3 pages, 3 figures. 2 tables, appeared in the Proceedings of the 10th International Conference on Quality of Multimedia Experience (QoMEX 2018) 
Access Paper or Ask Questions

Objectness Scoring and Detection Proposals in Forward-Looking Sonar Images with Convolutional Neural Networks

Sep 08, 2017
Matias Valdenegro-Toro

Forward-looking sonar can capture high resolution images of underwater scenes, but their interpretation is complex. Generic object detection in such images has not been solved, specially in cases of small and unknown objects. In comparison, detection proposal algorithms have produced top performing object detectors in real-world color images. In this work we develop a Convolutional Neural Network that can reliably score objectness of image windows in forward-looking sonar images and by thresholding objectness, we generate detection proposals. In our dataset of marine garbage objects, we obtain 94% recall, generating around 60 proposals per image. The biggest strength of our method is that it can generalize to previously unseen objects. We show this by detecting chain links, walls and a wrench without previous training in such objects. We strongly believe our method can be used for class-independent object detection, with many real-world applications such as chain following and mine detection.

* Proceedings of ANNPR 2016 
* Author version 
Access Paper or Ask Questions

Gliding vertex on the horizontal bounding box for multi-oriented object detection

Nov 21, 2019
Yongchao Xu, Mingtao Fu, Qimeng Wang, Yukang Wang, Kai Chen, Gui-Song Xia, Xiang Bai

Object detection has recently experienced substantial progress. Yet, the widely adopted horizontal bounding box representation is not appropriate for ubiquitous oriented objects such as objects in aerial images and scene texts. In this paper, we propose a simple yet effective framework to detect multi-oriented objects. Instead of directly regressing the four vertices, we glide the vertex of the horizontal bounding box on each corresponding side to accurately describe a multi-oriented object. Specifically, We regress four length ratios characterizing the relative gliding offset on each corresponding side. This may facilitate the offset learning and avoid the confusion issue of sequential label points for oriented objects. To further remedy the confusion issue for nearly horizontal objects, we also introduce an obliquity factor based on area ratio between the object and its horizontal bounding box, guiding the selection of horizontal or oriented detection for each object. We add these five extra target variables to the regression head of fast R-CNN, which requires ignorable extra computation time. Extensive experimental results demonstrate that without bells and whistles, the proposed method achieves superior performances on multiple multi-oriented object detection benchmarks including object detection in aerial images, scene text detection, pedestrian detection in fisheye images.

Access Paper or Ask Questions