
"Object Detection": models, code, and papers

GiraffeDet: A Heavy-Neck Paradigm for Object Detection

Feb 09, 2022
Yiqi Jiang, Zhiyu Tan, Junyan Wang, Xiuyu Sun, Ming Lin, Hao Li

In conventional object detection frameworks, a backbone inherited from image recognition models extracts deep latent features, and a neck module then fuses these latent features to capture information at different scales. As the resolution in object detection is much larger than in image recognition, the computational cost of the backbone often dominates the total inference cost. This heavy-backbone design paradigm is mostly a historical legacy of transferring image recognition models to object detection rather than an end-to-end optimized design for detection. In this work, we show that such a paradigm indeed leads to sub-optimal object detection models. To this end, we propose a novel heavy-neck paradigm, GiraffeDet, a giraffe-like network for efficient object detection. GiraffeDet uses an extremely lightweight backbone and a very deep and large neck module that encourages dense information exchange among different spatial scales as well as different levels of latent semantics simultaneously. This design allows detectors to process high-level semantic information and low-level spatial information at the same priority even in the early stages of the network, making it more effective for detection tasks. Numerical evaluations on multiple popular object detection benchmarks show that GiraffeDet consistently outperforms previous SOTA models across a wide spectrum of resource constraints.
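
As a rough illustration of the heavy-neck idea, the sketch below puts almost all capacity into a deep cross-scale fusion neck on top of a tiny backbone. The module names and fusion rule are hypothetical stand-ins, not the paper's actual architecture.

```python
# Illustrative heavy-neck sketch: a cheap backbone feeding a deep neck that
# repeatedly exchanges information across pyramid levels. All names here are
# hypothetical, not GiraffeDet's actual backbone/neck implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyBackbone(nn.Module):
    """Extremely lightweight backbone producing a 3-level feature pyramid."""
    def __init__(self, c=64):
        super().__init__()
        self.stem = nn.Conv2d(3, c, 3, stride=2, padding=1)
        self.down1 = nn.Conv2d(c, c, 3, stride=2, padding=1)
        self.down2 = nn.Conv2d(c, c, 3, stride=2, padding=1)

    def forward(self, x):
        p3 = torch.relu(self.stem(x))
        p4 = torch.relu(self.down1(p3))
        p5 = torch.relu(self.down2(p4))
        return [p3, p4, p5]

class CrossScaleFusion(nn.Module):
    """One neck stage: each level is fused with its neighbouring scales."""
    def __init__(self, c=64):
        super().__init__()
        self.mix = nn.ModuleList(nn.Conv2d(c, c, 3, padding=1) for _ in range(3))

    def forward(self, feats):
        out = []
        for i, f in enumerate(feats):
            if i > 0:                      # downsample the finer level
                f = f + F.max_pool2d(feats[i - 1], 2)
            if i < len(feats) - 1:         # upsample the coarser level
                f = f + F.interpolate(feats[i + 1], scale_factor=2)
            out.append(torch.relu(self.mix[i](f)))
        return out

class HeavyNeckStub(nn.Module):
    def __init__(self, depth=8, c=64):
        super().__init__()
        self.backbone = TinyBackbone(c)                                       # cheap
        self.neck = nn.ModuleList(CrossScaleFusion(c) for _ in range(depth))  # deep

    def forward(self, x):
        feats = self.backbone(x)
        for stage in self.neck:
            feats = stage(feats)
        return feats  # detection heads would attach here

print([f.shape for f in HeavyNeckStub()(torch.randn(1, 3, 256, 256))])
```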

  

Small Object Detection Based on Modified FSSD and Model Compression

Aug 24, 2021
Qingcai Wang, Hao Zhang, Xianggong Hong, Qinqin Zhou

Small objects have relatively low resolution and inconspicuous visual features that are difficult to extract, so existing object detection methods struggle to detect them effectively, with poor detection speed and stability. This paper therefore proposes a small object detection algorithm based on FSSD; at the same time, to reduce computational cost and storage space, pruning is carried out to achieve model compression. First, the semantic information contained in features at different layers is used to detect objects at different scales, and the feature fusion method is improved to obtain more information beneficial to small objects; second, a batch normalization layer is introduced to accelerate the training of the neural network and make the model sparse; finally, the model is pruned by scaling factor to obtain the corresponding compressed model. The experimental results show that the mean average precision (mAP) of the algorithm reaches 80.4% on PASCAL VOC at 59.5 FPS on a GTX 1080 Ti. After pruning, the compressed model reaches 79.9% mAP and 79.5 FPS in detection speed. On MS COCO, the best small-object detection accuracy (APs) is 12.1%, and the overall detection accuracy is 49.8% AP when IoU is 0.5. The algorithm not only improves the detection accuracy of small objects but also greatly improves detection speed, striking a balance between speed and accuracy.

* IEEE ICSIP, 2021 
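
The scaling-factor pruning step resembles network-slimming-style channel pruning: penalize the batch normalization scale factors during training, then drop channels whose factors fall below a threshold. A minimal sketch, assuming a global quantile threshold (the paper's exact criterion may differ):

```python
# Batch-norm scaling-factor pruning sketch (network-slimming style).
# The L1 weight and prune ratio below are illustrative, not the paper's values.
import torch
import torch.nn as nn

def l1_bn_penalty(model, lam=1e-4):
    """Sparsity regularizer on BN scale factors, added to the training loss."""
    return lam * sum(m.weight.abs().sum()
                     for m in model.modules() if isinstance(m, nn.BatchNorm2d))

def channel_keep_masks(model, prune_ratio=0.3):
    """Globally rank |gamma| across all BN layers and keep the largest ones."""
    gammas = torch.cat([m.weight.detach().abs().flatten()
                        for m in model.modules() if isinstance(m, nn.BatchNorm2d)])
    threshold = torch.quantile(gammas, prune_ratio)
    return {name: m.weight.detach().abs() > threshold
            for name, m in model.named_modules() if isinstance(m, nn.BatchNorm2d)}

model = nn.Sequential(nn.Conv2d(3, 16, 3), nn.BatchNorm2d(16), nn.ReLU())
torch.nn.init.uniform_(model[1].weight)   # pretend training produced varied gammas
loss = l1_bn_penalty(model)               # add this term to the detection loss
masks = channel_keep_masks(model)         # channels to keep when rebuilding
print(loss.item(), {k: int(v.sum()) for k, v in masks.items()})
```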
  

DOTA: A Large-scale Dataset for Object Detection in Aerial Images

Jan 27, 2018
Gui-Song Xia, Xiang Bai, Jian Ding, Zhen Zhu, Serge Belongie, Jiebo Luo, Mihai Datcu, Marcello Pelillo, Liangpei Zhang

Object detection is an important and challenging problem in computer vision. Although the past decade has witnessed major advances in object detection in natural scenes, such successes have been slow to transfer to aerial imagery, not only because of the huge variation in the scale, orientation, and shape of object instances on the earth's surface, but also because of the scarcity of well-annotated datasets of objects in aerial scenes. To advance object detection research in Earth Vision, also known as Earth Observation and Remote Sensing, we introduce a large-scale Dataset for Object deTection in Aerial images (DOTA). To this end, we collect $2806$ aerial images from different sensors and platforms. Each image is about 4000-by-4000 pixels and contains objects exhibiting a wide variety of scales, orientations, and shapes. These DOTA images are then annotated by experts in aerial image interpretation using $15$ common object categories. The fully annotated DOTA images contain $188,282$ instances, each labeled with an arbitrary (8 d.o.f.) quadrilateral. To build a baseline for object detection in Earth Vision, we evaluate state-of-the-art object detection algorithms on DOTA. Experiments demonstrate that DOTA well represents real Earth Vision applications and is quite challenging.
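
Each DOTA annotation stores the four corners of the quadrilateral plus a category name and a difficulty flag. A minimal parser sketch for one annotation line, assuming the whitespace-separated layout used by the public devkit:

```python
# Parse one DOTA-style annotation line:
#   x1 y1 x2 y2 x3 y3 x4 y4 category difficult
# Header handling is simplified; see the official devkit for the full format.
def parse_dota_line(line: str):
    parts = line.split()
    if len(parts) < 10:            # skip 'imagesource:'/'gsd:' header lines
        return None
    coords = list(map(float, parts[:8]))
    quad = list(zip(coords[0::2], coords[1::2]))   # [(x1, y1), ..., (x4, y4)]
    return {"quad": quad, "category": parts[8], "difficult": int(parts[9])}

print(parse_dota_line("100 100 200 100 200 150 100 150 plane 0"))
```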

  

Revisiting 3D Object Detection From an Egocentric Perspective

Dec 14, 2021
Boyang Deng, Charles R. Qi, Mahyar Najibi, Thomas Funkhouser, Yin Zhou, Dragomir Anguelov

3D object detection is a key module for safety-critical robotics applications such as autonomous driving. For these applications, we care most about how the detections affect the ego-agent's behavior and safety (the egocentric perspective). Intuitively, we seek more accurate descriptions of object geometry when it's more likely to interfere with the ego-agent's motion trajectory. However, current detection metrics, based on box Intersection-over-Union (IoU), are object-centric and aren't designed to capture the spatio-temporal relationship between objects and the ego-agent. To address this issue, we propose a new egocentric measure to evaluate 3D object detection, namely Support Distance Error (SDE). Our analysis based on SDE reveals that the egocentric detection quality is bounded by the coarse geometry of the bounding boxes. Given the insight that SDE would benefit from more accurate geometry descriptions, we propose to represent objects as amodal contours, specifically amodal star-shaped polygons, and devise a simple model, StarPoly, to predict such contours. Our experiments on the large-scale Waymo Open Dataset show that SDE better reflects the impact of detection quality on the ego-agent's safety compared to IoU; and the estimated contours from StarPoly consistently improve the egocentric detection quality over recent 3D object detectors.

* Published in NeurIPS 2021 
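
The abstract does not give SDE's exact formula, but the underlying idea of scoring detections by how well they preserve the ego-agent's distance to the object boundary can be illustrated with a simplified closest-point measure. This is a sketch of the concept only, not the paper's definition.

```python
# Simplified egocentric measure in the spirit of SDE: compare the ego-agent's
# closest distance to the predicted vs. ground-truth object contour (BEV).
import math

def point_segment_dist(p, a, b):
    """Euclidean distance from point p to segment ab."""
    (ax, ay), (bx, by), (px, py) = a, b, p
    dx, dy = bx - ax, by - ay
    t = 0.0 if dx == dy == 0 else max(0.0, min(1.0,
        ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def support_distance(ego, polygon):
    """Closest distance from the ego position to the object's boundary."""
    edges = zip(polygon, polygon[1:] + polygon[:1])
    return min(point_segment_dist(ego, a, b) for a, b in edges)

ego = (0.0, 0.0)
gt = [(4, 1), (6, 1), (6, 3), (4, 3)]        # ground-truth contour
pred = [(4.5, 1), (6, 1), (6, 3), (4.5, 3)]  # predicted contour
print(f"error: {abs(support_distance(ego, pred) - support_distance(ego, gt)):.3f}")
```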
  

Model Adaption Object Detection System for Robot

Nov 07, 2019
Jingwen Fu, Licheng Zong, Yinbing Li, Bingqian Yang, Ke Li, Xibei Liu

How to detect objects and guide the robot to approach them is an important task for autonomous robots. The main difficulties are that the robot's view changes substantially as it moves and that only limited training data are available. To tackle these challenges, we propose a novel vision system for the robot, the model adaption object detection system. Instead of using a single object detection neural network to solve the whole problem, we use different object detection neural networks to guide the robot according to the situation it is in, with a meta neural network allocating the appropriate detection network. Furthermore, we use transfer learning and depthwise separable convolutions, so our model is easy to train and can cope with small datasets.
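
A minimal sketch of the allocation idea, with a hypothetical meta network classifying the robot's current situation and dispatching to a matching detector (the situation labels and detector names are placeholders):

```python
# Meta-network routing sketch: a tiny classifier picks which specialized
# object detector should handle the current frame. Everything here is a
# placeholder for the paper's actual networks.
import torch
import torch.nn as nn

class MetaRouter(nn.Module):
    def __init__(self, num_situations=3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 8, 3, stride=4), nn.ReLU(), nn.AdaptiveAvgPool2d(1))
        self.classify = nn.Linear(8, num_situations)

    def forward(self, image):
        f = self.features(image).flatten(1)
        return self.classify(f).argmax(dim=1)   # index of the detector to use

detectors = {0: "far_range_detector", 1: "mid_range_detector", 2: "close_up_detector"}
choice = MetaRouter()(torch.randn(1, 3, 224, 224)).item()
print("dispatching to:", detectors[choice])
```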

  

Multi-scale Volumes for Deep Object Detection and Localization

Jul 26, 2016
Eshed Ohn-Bar, M. M. Trivedi

This study aims to analyze the benefits of improved multi-scale reasoning for object detection and localization with deep convolutional neural networks. To that end, an efficient and general object detection framework is proposed that operates on scale volumes of a deep feature pyramid. In contrast to the proposed approach, most current state-of-the-art object detectors operate on a single scale during training, while testing involves independent evaluation across scales. One benefit of the proposed approach is better capture of multi-scale contextual information, resulting in significant gains in both detection performance and localization quality of objects on the PASCAL VOC dataset and a multi-view highway vehicles dataset. The joint detection and localization scale-specific models are shown to especially benefit detection of challenging object categories that exhibit large scale variation, as well as detection of small objects.

* To appear in Pattern Recognition 2016 
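
One plausible reading of "operating on scale volumes" is to resample pyramid levels to a shared resolution and stack them along a new scale axis, so a model can reason jointly across scales; the sketch below assumes that interpretation.

```python
# Scale-volume sketch: stack resized feature-pyramid levels along a scale
# axis. The target size and shapes are illustrative assumptions.
import torch
import torch.nn.functional as F

def scale_volume(pyramid, size=(32, 32)):
    """pyramid: list of (C, Hi, Wi) tensors -> (num_scales, C, H, W) volume."""
    levels = [F.interpolate(f.unsqueeze(0), size=size, mode="bilinear",
                            align_corners=False).squeeze(0) for f in pyramid]
    return torch.stack(levels, dim=0)

pyramid = [torch.randn(16, 64, 64), torch.randn(16, 32, 32), torch.randn(16, 16, 16)]
print(scale_volume(pyramid).shape)   # torch.Size([3, 16, 32, 32])
```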
  

DFAM-DETR: Deformable feature based attention mechanism DETR on slender object detection

Apr 22, 2022
Wen Feng, Wang Mei, Hu Xiaojie

Object detection is one of the most significant aspects of computer vision, and it has achieved substantial results in a variety of domains. It is worth noting that few studies focus on slender object detection. CNNs are widely employed in object detection, but they perform poorly on slender objects due to their fixed geometric structure and fixed sampling points. In comparison, Deformable DETR has the ability to obtain global-to-specific features. Even though it outperforms CNNs in slender object detection accuracy and efficiency, the results are still not satisfactory. Therefore, we propose a Deformable Feature based Attention Mechanism (DFAM) to increase the slender object detection accuracy and efficiency of Deformable DETR. The DFAM combines the adaptive sampling points of deformable convolution with an attention mechanism that aggregates information from the entire input sequence in the backbone network. The improved detector is named Deformable Feature based Attention Mechanism DETR (DFAM-DETR). Results indicate that DFAM-DETR achieves outstanding detection performance on slender objects.

* 9 pages, 7 figures 
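
A simplified sketch of the adaptive-sampling half of the idea: predict per-pixel offsets and bilinearly resample the feature map at the shifted positions. The paper's DFAM additionally couples such sampling with transformer-style attention; this sketch covers only the deformable part.

```python
# Deformable feature sampling sketch: a conv predicts (dx, dy) offsets and
# grid_sample reads the feature map at the offset locations. Offset scaling
# is an illustrative choice, not the paper's parameterization.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DeformableSampler(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.offset = nn.Conv2d(channels, 2, 3, padding=1)  # (dx, dy) per pixel

    def forward(self, x):
        n, c, h, w = x.shape
        # Base sampling grid in normalized [-1, 1] coordinates.
        ys, xs = torch.meshgrid(torch.linspace(-1, 1, h),
                                torch.linspace(-1, 1, w), indexing="ij")
        base = torch.stack((xs, ys), dim=-1).expand(n, h, w, 2)
        # Predicted offsets, kept small relative to the grid spacing.
        off = self.offset(x).permute(0, 2, 3, 1) * (2.0 / max(h, w))
        return F.grid_sample(x, base + off, mode="bilinear", align_corners=True)

x = torch.randn(1, 16, 32, 32)
print(DeformableSampler(16)(x).shape)   # torch.Size([1, 16, 32, 32])
```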
  

Drone Object Detection Using RGB/IR Fusion

Jan 11, 2022
Lizhi Yang, Ruhang Ma, Avideh Zakhor

Object detection using aerial drone imagery has received a great deal of attention in recent years. While visible light images are adequate for detecting objects in most scenarios, thermal cameras can extend the capabilities of object detection to night-time or occluded objects. As such, RGB and Infrared (IR) fusion methods for object detection are useful and important. One of the biggest challenges in applying deep learning methods to RGB/IR object detection is the lack of available training data for drone IR imagery, especially at night. In this paper, we develop several strategies for creating synthetic IR images using the AIRSim simulation engine and CycleGAN. Furthermore, we utilize an illumination-aware fusion framework to fuse RGB and IR images for object detection on the ground. We characterize and test our methods for both simulated and actual data. Our solution is implemented on an NVIDIA Jetson Xavier running on an actual drone, requiring about 28 milliseconds of processing per RGB/IR image pair.

* Accepted to Electronic Imaging Symposium, Computational Imaging XX Conference, 2022 
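
A hedged sketch of the illumination-aware fusion idea: a small gating network estimates scene brightness from the RGB frame and blends RGB and IR features accordingly. The gate design here is an assumption for illustration, not the paper's exact framework.

```python
# Illumination-aware fusion sketch: weight RGB vs. IR features by a learned
# brightness score. Shapes and the gating network are illustrative.
import torch
import torch.nn as nn

class IlluminationGate(nn.Module):
    def __init__(self):
        super().__init__()
        self.score = nn.Sequential(
            nn.Conv2d(3, 8, 3, stride=4), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 1), nn.Sigmoid())

    def forward(self, rgb_feat, ir_feat, rgb_image):
        w = self.score(rgb_image).view(-1, 1, 1, 1)  # ~1 in daylight, ~0 at night
        return w * rgb_feat + (1 - w) * ir_feat

gate = IlluminationGate()
rgb_feat, ir_feat = torch.randn(1, 16, 32, 32), torch.randn(1, 16, 32, 32)
print(gate(rgb_feat, ir_feat, torch.randn(1, 3, 128, 128)).shape)
```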
  

Efficient Incremental Learning for Mobile Object Detection

Mar 26, 2019
Dawei Li, Serafettin Tasci, Shalini Ghosh, Jingwen Zhu, Junting Zhang, Larry Heck

Object detection models shipped with camera-equipped mobile devices cannot cover the objects of interest for every user. Therefore, the incremental learning capability is a critical feature for a robust and personalized mobile object detection system that many applications would rely on. In this paper, we present an efficient yet practical system, IMOD, to incrementally train an existing object detection model such that it can detect new object classes without losing its capability to detect old classes. The key component of IMOD is a novel incremental learning algorithm that trains end-to-end for one-stage object detection deep models using only training data of new object classes. Specifically, to avoid catastrophic forgetting, the algorithm distills three types of knowledge from the old model to mimic the old model's behavior on object classification, bounding box regression, and feature extraction. In addition, since the training data for the new classes may not be available, a real-time dataset construction pipeline is designed to collect training images on the fly and automatically label the images with both category and bounding box annotations. We have implemented IMOD under both mobile-cloud and mobile-only setups. Experimental results show that the proposed system can learn to detect a new object class in just a few minutes, including both dataset construction and model training. In comparison, traditional fine-tuning based methods may take a few hours to train and in most cases would also need a tedious and costly manual dataset labeling step.
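
The three distillation terms can be sketched as one combined loss that keeps the new model close to the frozen old model's classification logits, box regressions, and features. Tensor shapes and loss weights below are illustrative assumptions, not IMOD's exact formulation.

```python
# Three-way distillation sketch for incremental detection: match the old
# model's classification, box regression, and feature outputs.
import torch
import torch.nn.functional as F

def incremental_distill_loss(new_out, old_out, w_cls=1.0, w_box=1.0, w_feat=1.0):
    """Each *_out is a dict of tensors computed on the same images;
    old_out comes from the frozen old model (no gradients)."""
    cls = F.kl_div(F.log_softmax(new_out["cls_logits"], dim=-1),
                   F.softmax(old_out["cls_logits"], dim=-1),
                   reduction="batchmean")
    box = F.smooth_l1_loss(new_out["boxes"], old_out["boxes"])
    feat = F.mse_loss(new_out["features"], old_out["features"])
    return w_cls * cls + w_box * box + w_feat * feat

old = {"cls_logits": torch.randn(4, 20), "boxes": torch.randn(4, 4),
       "features": torch.randn(4, 256)}
new = {k: v + 0.1 * torch.randn_like(v) for k, v in old.items()}
print(incremental_distill_loss(new, old).item())
```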

  

Slicing Aided Hyper Inference and Fine-tuning for Small Object Detection

Feb 15, 2022
Fatih Cagatay Akyon, Sinan Onur Altinuc, Alptekin Temizel

Detection of small objects and objects far away in the scene is a major challenge in surveillance applications. Such objects are represented by a small number of pixels in the image and lack sufficient detail, making them difficult to detect with conventional detectors. In this work, an open-source framework called Slicing Aided Hyper Inference (SAHI) is proposed that provides a generic slicing aided inference and fine-tuning pipeline for small object detection. The proposed technique is generic in the sense that it can be applied on top of any available object detector without any fine-tuning. Experimental evaluations using object detection baselines on the VisDrone and xView aerial object detection datasets show that the proposed inference method can increase object detection AP by 6.8%, 5.1% and 5.3% for FCOS, VFNet and TOOD detectors, respectively. Moreover, the detection accuracy can be further increased with slicing aided fine-tuning, resulting in a cumulative increase of 12.7%, 13.4% and 14.5% AP in the same order. The proposed technique has been integrated with Detectron2, MMDetection and YOLOv5 models and is publicly available at https://github.com/obss/sahi.git .

* Submitted to ICIP 2022, 5 pages, 4 figures, 2 tables 
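
Since the code is public, a typical usage sketch with the released sahi package follows. The names match the project's README at the time of writing and may differ across versions; the model weights and image path are placeholders.

```python
# Slicing aided inference with the sahi package: run a detector over
# overlapping image slices and merge predictions into full-image coordinates.
from sahi import AutoDetectionModel
from sahi.predict import get_sliced_prediction

detection_model = AutoDetectionModel.from_pretrained(
    model_type="yolov5",
    model_path="yolov5s.pt",        # placeholder weights file
    confidence_threshold=0.4,
    device="cpu",
)

result = get_sliced_prediction(
    "demo.jpg",                     # placeholder image path
    detection_model,
    slice_height=512,
    slice_width=512,
    overlap_height_ratio=0.2,
    overlap_width_ratio=0.2,
)
print(len(result.object_prediction_list), "objects detected")
```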
  