Region Proposal Network (RPN) is the cornerstone of two-stage object detectors, it generates a sparse set of object proposals and alleviates the extrem foregroundbackground class imbalance problem during training. However, we find that the potential of the detector has not been fully exploited due to the IoU distribution imbalance and inadequate quantity of the training samples generated by RPN. With the increasing intersection over union (IoU), the exponentially smaller numbers of positive samples would lead to the distribution skewed towards lower IoUs, which hinders the optimization of detector at high IoU levels. In this paper, to break through the limitations of RPN, we propose IoU-Uniform R-CNN, a simple but effective method that directly generates training samples with uniform IoU distribution for the regression branch as well as the IoU prediction branch. Besides, we improve the performance of IoU prediction branch by eliminating the feature offsets of RoIs at inference, which helps the NMS procedure by preserving accurately localized bounding box. Extensive experiments on the PASCAL VOC and MS COCO dataset show the effectiveness of our method, as well as its compatibility and adaptivity to many object detection architectures. The code is made publicly available at https://github.com/zl1994/IoU-Uniform-R-CNN,
Channel pruning is one of the important methods for deep model compression. Most of existing pruning methods mainly focus on classification. Few of them conduct systematic research on object detection. However, object detection is different from classification, which requires not only semantic information but also localization information. In this paper, based on DCP \cite{zhuang2018discrimination} which is state-of-the-art pruning method for classification, we propose a localization-aware auxiliary network to find out the channels with key information for classification and regression so that we can conduct channel pruning directly for object detection, which saves lots of time and computing resources. In order to capture the localization information, we first design the auxiliary network with a contextual ROIAlign layer which can obtain precise localization information of the default boxes by pixel alignment and enlarges the receptive fields of the default boxes when pruning shallow layers. Then, we construct a loss function for object detection task which tends to keep the channels that contain the key information for classification and regression. Extensive experiments demonstrate the effectiveness of our method. On MS COCO, we prune 70\% parameters of the SSD based on ResNet-50 with modest accuracy drop, which outperforms the-state-of-art method.