Advanced Driver Assistance Systems (ADAS) have made significant strides, capitalizing on computer vision to enhance perception and decision-making capabilities. Nonetheless, adapting these systems to diverse traffic scenarios remains challenging due to shifts in data distribution stemming from factors such as location, weather, and road infrastructure. To tackle this, we introduce a weakly-supervised label unification pipeline that combines pseudo labels from multiple object detection models trained on heterogeneous datasets. Our pipeline constructs a unified label space by merging labels from these disparate datasets, correcting bias and improving generalization. We fine-tune multiple object detection models on individual datasets, then build a unified dataset of pseudo labels that are validated for precision. Finally, we retrain a single object detection model on the merged label space, yielding a robust model suited to dynamic traffic scenarios. We present a comprehensive evaluation of our approach on datasets from several Asian countries, demonstrating its efficacy under challenging road conditions. Notably, our method yields substantial improvements in object detection performance and produces a model with heightened resistance to domain shifts.
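To make the unification step concrete, the following Python sketch merges per-dataset detections into one unified label space. It is a minimal illustration, not the authors' implementation: the Detection record, class-name mapping, confidence threshold, and cross-model NMS de-duplication are all placeholder assumptions.

```python
# Minimal sketch of pseudo-label unification across dataset-specific
# detectors. The mapping, thresholds, and NMS policy are illustrative
# assumptions, not the paper's exact implementation.

from dataclasses import dataclass

@dataclass
class Detection:
    box: tuple    # (x1, y1, x2, y2) in pixels
    label: str    # class name in the source dataset's taxonomy
    score: float  # detector confidence in [0, 1]

# Hypothetical mapping from dataset-specific class names into one
# unified label space (e.g. regional vehicle types collapsed together).
UNIFIED_LABEL = {
    "auto-rickshaw": "three-wheeler",
    "tuk-tuk": "three-wheeler",
    "car": "car",
    "motorbike": "two-wheeler",
}

def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def unify_pseudo_labels(per_model_dets, score_thresh=0.5, iou_thresh=0.5):
    """Merge detections from several dataset-specific models into one
    pseudo-labelled set over the unified label space, de-duplicating
    overlapping same-class boxes with greedy cross-model NMS."""
    pooled = []
    for dets in per_model_dets:                       # one list per model
        for d in dets:
            name = UNIFIED_LABEL.get(d.label)
            if name is not None and d.score >= score_thresh:
                pooled.append(Detection(d.box, name, d.score))
    pooled.sort(key=lambda d: d.score, reverse=True)  # highest score first
    kept = []
    for d in pooled:
        if all(k.label != d.label or iou(k.box, d.box) < iou_thresh
               for k in kept):
            kept.append(d)
    return kept
```

The retraining stage would then treat the merged detections as ordinary ground truth for a single model over the unified label space.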
Modern approaches to vision-centric environment perception for autonomous navigation make extensive use of self-supervised monocular depth estimation algorithms that output disparity maps. However, when such a disparity map is projected into 3D space, errors in disparity are magnified, producing a depth estimation error that grows quadratically with distance from the camera. Although Light Detection and Ranging (LiDAR) can solve this issue, it is expensive and infeasible for many applications. To address the challenge of accurate ranging with low-cost sensors, we propose OCTraN, a transformer architecture that uses iterative attention to convert 2D image features into 3D occupancy features, and employs convolution and transpose convolution to operate efficiently on spatial information. We also develop a self-supervised training pipeline that generalizes the model to any scene by eliminating the need for LiDAR ground truth, substituting it with pseudo-ground-truth labels obtained from boosted monocular depth estimation.
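As a rough illustration of how LiDAR ground truth can be replaced, the sketch below unprojects a boosted monocular depth map into a binary occupancy grid under a pinhole camera model. The intrinsics, grid extents, and voxel size are placeholder assumptions, not OCTraN's actual configuration.

```python
# Minimal sketch of building pseudo-ground-truth occupancy from a dense
# depth map, assuming a pinhole camera; all parameters are illustrative.

import numpy as np

def depth_to_occupancy(depth, fx, fy, cx, cy,
                       grid_shape=(64, 64, 16),  # (right, forward, up) voxels
                       voxel_size=0.5,           # metres per voxel
                       x_range=(-16.0, 16.0),
                       y_range=(0.0, 32.0),
                       z_range=(-2.0, 6.0)):
    """Unproject a dense depth map into a binary 3D occupancy grid that
    can stand in for LiDAR ground truth during training."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth                              # depth along the camera axis
    x = (u - cx) * z / fx                  # lateral position
    y = (v - cy) * z / fy                  # vertical position (image-down)
    # Convert to a (right, forward, up) world frame.
    pts = np.stack([x, z, -y], axis=-1).reshape(-1, 3)

    occ = np.zeros(grid_shape, dtype=bool)
    lo = np.array([x_range[0], y_range[0], z_range[0]])
    idx = np.floor((pts - lo) / voxel_size).astype(int)
    valid = np.all((idx >= 0) & (idx < np.array(grid_shape)), axis=1)
    idx = idx[valid]
    occ[idx[:, 0], idx[:, 1], idx[:, 2]] = True
    return occ
```

A training loop would supervise the model's predicted occupancy against such grids with a voxel-wise classification loss, removing the dependence on LiDAR entirely.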