Title: Towards Open Vocabulary Object Detection without Human-provided Bounding Boxes

Nov 18, 2021

Mingfei Gao, Chen Xing, Juan Carlos Niebles, Junnan Li, Ran Xu, Wenhao Liu, Caiming Xiong

Abstract: Despite great progress in object detection, most existing methods are limited to a small set of object categories because of the tremendous human effort needed for instance-level bounding-box annotation. To alleviate this problem, recent open vocabulary and zero-shot detection methods attempt to detect object categories not seen during training. However, these approaches still rely on manually provided bounding-box annotations for a set of base classes. We propose an open vocabulary detection framework that can be trained without manually provided bounding-box annotations. Our method achieves this by leveraging the localization ability of pre-trained vision-language models and generating pseudo bounding-box labels that can be used directly for training object detectors. Experimental results on COCO, PASCAL VOC, Objects365 and LVIS demonstrate the effectiveness of our method. Specifically, our method outperforms state-of-the-art (SOTA) methods trained with human-annotated bounding boxes by 3% AP on COCO novel categories, even though our training source has no manual bounding-box labels. When we use the manual bounding-box labels, as our baselines do, our method surpasses the SOTA by a large margin of 8% AP.
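The abstract does not spell out how pseudo bounding-box labels are derived from a vision-language model. As a rough illustration only, the sketch below assumes that some per-word activation map (a 2D grid of relevance scores for one object word in a caption) is already available, and it turns that map into a box by thresholding relative to the peak value. The function name `pseudo_box_from_activation`, the `rel_threshold` parameter and the toy `heatmap` are hypothetical and are not taken from the paper, whose actual procedure may differ.

```python
from typing import List, Optional, Tuple

# (x_min, y_min, x_max, y_max) in inclusive grid-cell indices
Box = Tuple[int, int, int, int]


def pseudo_box_from_activation(
    activation: List[List[float]], rel_threshold: float = 0.5
) -> Optional[Box]:
    """Hypothetical helper: tight box around all cells whose score is at
    least rel_threshold * (peak score). Returns None if there is no
    positive peak (including an empty map)."""
    peak = max((v for row in activation for v in row), default=0.0)
    if peak <= 0.0:
        return None
    cutoff = rel_threshold * peak
    xs, ys = [], []
    for y, row in enumerate(activation):
        for x, v in enumerate(row):
            if v >= cutoff:
                xs.append(x)
                ys.append(y)
    return (min(xs), min(ys), max(xs), max(ys))


# Toy activation map (assumed input, not real model output).
heatmap = [
    [0.0, 0.1, 0.0, 0.0],
    [0.0, 0.6, 0.9, 0.0],
    [0.0, 0.5, 1.0, 0.1],
    [0.0, 0.0, 0.2, 0.0],
]
print(pseudo_box_from_activation(heatmap))  # (1, 1, 2, 2)
```

A box produced this way, paired with the caption word that generated the map, could serve as a (box, label) training example for a detector without any human-drawn annotation, which is the general idea the abstract describes.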
