Abstract: Context bias refers to the association between foreground objects and their background that a model learns during object detection training. Various methods have been proposed to minimize context bias when applying a trained model to an unseen domain, a setting known as domain adaptation for object detection (DAOD). However, a principled approach to understanding why context bias occurs and how to remove it has been missing. In this work, we provide a causal view of context bias, pointing to the pooling operation in convolutional network architectures as a possible source of this bias. We present an alternative, Mask Pooling, which takes foreground masks as an additional input to separate the pooling process into the respective foreground and background regions, and we show that this leads the trained model to detect objects more robustly across different domains. We also provide a benchmark designed as an ultimate test for DAOD, placing foregrounds on completely random backgrounds, to analyze the robustness of the trained models. Through these experiments, we hope to provide a principled approach for minimizing context bias under domain shift.
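As a rough illustration of the Mask Pooling idea described above, the following sketch max-pools foreground and background pixels separately within each window. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `mask_pool`, the non-overlapping k x k windows, the use of max pooling, the two separate output maps, and the fill value of 0 for windows with no pixels of a region are all hypothetical choices made for this example.

```python
import numpy as np

def mask_pool(features, fg_mask, k=2):
    """Max-pool a 2D feature map separately over foreground and background.

    Hypothetical sketch: the abstract does not specify window size, pooling
    type, or how the two regions are combined; all of these are assumptions.
    """
    h, w = features.shape
    assert h % k == 0 and w % k == 0, "sketch assumes dimensions divisible by k"

    def to_windows(a):
        # Rearrange into non-overlapping k x k windows: (h/k, w/k, k*k).
        return (a.reshape(h // k, k, w // k, k)
                 .transpose(0, 2, 1, 3)
                 .reshape(h // k, w // k, k * k))

    win = to_windows(np.asarray(features, dtype=float))
    m = to_windows(np.asarray(fg_mask).astype(bool))

    # Exclude the other region from each max by replacing it with -inf.
    fg = np.where(m, win, -np.inf).max(axis=-1)
    bg = np.where(~m, win, -np.inf).max(axis=-1)

    # Windows with no pixel of a region get 0 (an arbitrary fill choice).
    fg[np.isneginf(fg)] = 0.0
    bg[np.isneginf(bg)] = 0.0
    return fg, bg

features = np.array([[1, 9, 2, 3],
                     [4, 5, 6, 7],
                     [8, 1, 0, 2],
                     [3, 2, 1, 4]])
fg_mask = np.array([[1, 0, 0, 0],
                    [1, 0, 0, 0],
                    [0, 0, 1, 1],
                    [0, 0, 1, 1]])
fg_pooled, bg_pooled = mask_pool(features, fg_mask)
```

In the top-left window of this toy input, ordinary max pooling would return the background value 9, whereas the foreground map here keeps the foreground value 4, so background activations do not leak into the pooled foreground feature.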
Abstract: We examine the problem of estimating the footprint uncertainty of objects imaged by infrastructure-based camera sensing. A closed-form relationship is established between the ground coordinates and the sources of camera error. Using the error propagation equation, the covariance of a given ground coordinate can be computed as a function of the camera errors. The uncertainty of the bounding-box footprint can then be given as a function of all the extreme points of the object footprint. Calculating the uncertainty of a ground point requires the typical error sizes of the error sources. We present a method of estimating these typical error sizes from an experiment that uses a static, high-precision LiDAR as the ground truth. Finally, we present a simulated case study of uncertainty quantification for an infrastructure-based camera in CARLA to give a sense of how the uncertainty changes across a left-turn maneuver.
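The error propagation step mentioned above can be sketched with the standard first-order formula, in which the output covariance is J Σ Jᵀ for a Jacobian J and input covariance Σ. The example below is a minimal sketch under stated assumptions: the one-dimensional camera model `ground_range` (range = height / tan(pitch)), the choice of height and pitch as the only error sources, the numerical Jacobian, and the error sizes are all hypothetical and stand in for the closed-form relationship the abstract refers to.

```python
import numpy as np

def numerical_jacobian(f, x, eps=1e-6):
    """Central-difference Jacobian of f at x (rows: outputs, cols: inputs)."""
    x = np.asarray(x, dtype=float)
    y0 = np.atleast_1d(f(x))
    jac = np.zeros((y0.size, x.size))
    for i in range(x.size):
        dx = np.zeros_like(x)
        dx[i] = eps
        jac[:, i] = (np.atleast_1d(f(x + dx)) - np.atleast_1d(f(x - dx))) / (2 * eps)
    return jac

def propagate_covariance(f, x, cov_x):
    """First-order error propagation: cov_y = J cov_x J^T."""
    jac = numerical_jacobian(f, x)
    return jac @ np.asarray(cov_x, dtype=float) @ jac.T

def ground_range(params):
    """Hypothetical 1D model: ground range of the optical-axis hit point.

    params = (camera height in metres, downward pitch in radians).
    """
    height, pitch = params
    return np.array([height / np.tan(pitch)])

# Assumed (illustrative) error sizes: 0.1 m in height, 0.01 rad in pitch.
cov_in = np.diag([0.1 ** 2, 0.01 ** 2])
cov_out = propagate_covariance(ground_range, [5.0, np.pi / 4], cov_in)
```

For these assumed values the analytic Jacobian is [1, -10], so the propagated range variance is about 0.02 m². The footprint uncertainty in the abstract would apply the same propagation to each extreme point of the object footprint using the actual camera model and the error sizes estimated against the LiDAR ground truth.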