
"autonomous cars": models, code, and papers

OmniDet: Surround View Cameras based Multi-task Visual Perception Network for Autonomous Driving

Feb 15, 2021
Varun Ravi Kumar, Senthil Yogamani, Hazem Rashed, Ganesh Sistu, Christian Witt, Isabelle Leang, Stefan Milz, Patrick Mäder

Surround-view fisheye cameras are commonly deployed in automated driving for 360° near-field sensing around the vehicle. This work presents a multi-task visual perception network that operates on unrectified fisheye images to enable the vehicle to sense its surrounding environment. It consists of six primary tasks necessary for an autonomous driving system: depth estimation, visual odometry, semantic segmentation, motion segmentation, object detection, and lens soiling detection. We demonstrate that the jointly trained model performs better than the respective single-task versions. Our multi-task model has a shared encoder, providing a significant computational advantage, and synergized decoders where tasks support each other. We propose a novel camera-geometry-based adaptation mechanism to encode the fisheye distortion model at both training and inference. This was crucial for enabling training on the WoodScape dataset, which comprises data from different parts of the world collected by 12 different cameras mounted on three different cars with different intrinsics and viewpoints. Given that bounding boxes are not a good representation for distorted fisheye images, we also extend object detection to use a polygon with non-uniformly sampled vertices. We additionally evaluate our model on standard automotive datasets, namely KITTI and Cityscapes. We obtain state-of-the-art results on KITTI for the depth estimation and pose estimation tasks, and competitive performance on the other tasks. We perform extensive ablation studies on various architecture choices and task-weighting methodologies. A short accompanying video provides qualitative results.

* Camera ready version accepted for RA-L and ICRA 2021 publication 
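The abstract above mentions ablations over task-weighting methodologies for the jointly trained model. As a hedged illustration (not the paper's exact scheme), one common option is uncertainty-based weighting, where each task loss is scaled by a learned per-task log-variance term; the function name and values below are purely illustrative:

```python
# Illustrative sketch: uncertainty-style weighting of per-task losses,
# one of several task-weighting methodologies a multi-task model may use.
import math

def weighted_multitask_loss(task_losses, log_vars):
    """Combine per-task losses L_t as sum(exp(-s_t) * L_t + s_t),
    where s_t = log(sigma_t^2) is a per-task uncertainty parameter."""
    total = 0.0
    for task, loss in task_losses.items():
        s = log_vars[task]
        total += math.exp(-s) * loss + s
    return total

losses = {"depth": 0.8, "segmentation": 1.2, "detection": 0.5}
log_vars = {"depth": 0.0, "segmentation": 0.0, "detection": 0.0}
print(round(weighted_multitask_loss(losses, log_vars), 2))  # 2.5 when all s_t = 0
```

With all uncertainty terms at zero, the combination reduces to a plain sum; as a task's log-variance grows, its loss is down-weighted while the additive regularizer discourages ignoring the task entirely.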

Beyond Grand Theft Auto V for Training, Testing and Enhancing Deep Learning in Self Driving Cars

Dec 04, 2017
Mark Martinez, Chawin Sitawarin, Kevin Finch, Lennart Meincke, Alex Yablonski, Alain Kornhauser

As an initial assessment, over 480,000 labeled virtual images of normal highway driving were readily generated in Grand Theft Auto V's virtual environment. Using these images, a CNN was trained to detect following distance to cars/objects ahead, lane markings, and driving angle (angular heading relative to lane centerline): all variables necessary for basic autonomous driving. Encouraging results were obtained when tested on over 50,000 labeled virtual images from substantially different GTA-V driving environments. This initial assessment begins to define both the range and scope of the labeled images needed for training, as well as the range and scope of labeled images needed for testing the boundaries and limitations of trained networks. It is the efficacy and flexibility of a "GTA-V"-like virtual environment that is expected to provide an efficient, well-defined foundation for the training and testing of Convolutional Neural Networks for safe driving. Additionally, the paper describes the Princeton Virtual Environment (PVE) for the training, testing, and enhancement of safe-driving AI, which is being developed using the Unity video-game engine. PVE is being developed to recreate rare but critical corner cases that can be used in re-training and enhancing machine learning models and in understanding the limitations of current self-driving models. The Florida Tesla crash is being used as an initial reference.

* 15 pages, 4 figures, under review by TRB 2018 Annual Meeting 

Maintaining driver attentiveness in shared-control autonomous driving

Feb 05, 2021
Radu Calinescu, Naif Alasmari, Mario Gleirscher

We present a work-in-progress approach to improving driver attentiveness in cars equipped with automated driving systems. The approach is based on a control loop that monitors the driver's biometrics (eye movement, heart rate, etc.) and the state of the car; analyses the driver's attentiveness level using a deep neural network; plans driver alerts and changes in the speed of the car using a formally verified controller; and executes this plan using actuators ranging from acoustic and visual to haptic devices. The paper presents (i) the self-adaptive system formed by this monitor-analyse-plan-execute (MAPE) control loop, the car and the monitored driver, and (ii) the use of probabilistic model checking to synthesise the controller for the planning step of the MAPE loop.

* 7 pages, 6 figures 
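The MAPE loop described above can be sketched as four small stages. This is a minimal, hedged illustration only: the attentiveness model and the formally verified controller are stubbed out, and the thresholds, alert types, and speed cap are invented for the example:

```python
# Minimal sketch of a monitor-analyse-plan-execute (MAPE) loop for driver
# attentiveness. All thresholds and alert levels are illustrative assumptions.
def monitor(sensors):
    # Collect driver biometrics and car state.
    return {"eye_closure": sensors["eye_closure"], "speed": sensors["speed"]}

def analyse(obs):
    # Stand-in for the deep-neural-network attentiveness estimate.
    return 1.0 - obs["eye_closure"]          # 1.0 = fully attentive

def plan(attentiveness, speed):
    # Stand-in for the verified controller: choose an alert and a speed target.
    if attentiveness < 0.3:
        return ("haptic_alert", min(speed, 50.0))
    if attentiveness < 0.7:
        return ("acoustic_alert", speed)
    return ("none", speed)

def execute(action, speed):
    return {"alert": action, "target_speed": speed}

def mape_step(sensors):
    obs = monitor(sensors)
    att = analyse(obs)
    alert, target = plan(att, obs["speed"])
    return execute(alert, target)

print(mape_step({"eye_closure": 0.8, "speed": 90.0}))
# {'alert': 'haptic_alert', 'target_speed': 50.0}
```

In the paper's system, `plan` is where probabilistic model checking comes in: the controller governing alerts and speed changes is synthesised with formal guarantees rather than hand-written as above.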

Applying Semantic Segmentation to Autonomous Cars in the Snowy Environment

Jul 25, 2020
Zhaoyu Pan, Takanori Emaru, Ankit Ravankar, Yukinori Kobayashi

This paper mainly focuses on environment perception in snowy conditions, which forms the backbone of autonomous driving technology. For this purpose, semantic segmentation is employed to classify objects while the vehicle is driven autonomously. We train Fully Convolutional Networks (FCN) on our own dataset and present the experimental results. Finally, the outcomes are analyzed to draw a conclusion: the dataset still needs to be optimized, and a better-suited algorithm should be proposed to obtain stronger results.

* 36th Annual Conference of the Robot Society of Japan, Nagoya, 2018 
* 4 pages, 5 Figures 

Proximally Optimal Predictive Control Algorithm for Path Tracking of Self-Driving Cars

Mar 24, 2021
Chinmay Vilas Samak, Tanmay Vilas Samak, Sivanathan Kandhasamy

This work presents the proximally optimal predictive control algorithm, which is essentially a model-based lateral controller for steered autonomous vehicles that selects an optimal steering command within the neighborhood of the previous steering angle based on the predicted vehicle location. The proposed algorithm was formulated with the aim of overcoming the limitations of existing control laws for autonomous steering, namely the PID, Pure-Pursuit and Stanley controllers. In particular, our approach aims to bridge the gap between tracking efficiency and computational cost, thereby ensuring effective path tracking in real time. The effectiveness of our approach was investigated through a series of dynamic simulation experiments on autonomous path tracking, employing an adaptive control law for longitudinal motion control of the vehicle. We measured the latency of the proposed algorithm in order to comment on its real-time factor, and validated our approach by comparing it against the established control laws in terms of both cross-track and heading errors recorded throughout the respective path-tracking simulations.

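The core idea above — search a neighborhood of the previous steering angle and keep the candidate whose predicted vehicle location best matches the path — can be sketched with a kinematic bicycle model. This is a hedged sketch under assumed parameters (rollout model, window size, grid resolution, straight-line reference path), not the paper's exact formulation:

```python
# Sketch: pick the steering command near the previous one that minimises
# predicted cross-track error under a short kinematic-bicycle rollout.
# Model, horizon, and window are illustrative assumptions.
import math

def rollout(x, y, yaw, v, steer, wheelbase, dt, steps):
    # Multi-step kinematic bicycle prediction.
    for _ in range(steps):
        x += v * math.cos(yaw) * dt
        y += v * math.sin(yaw) * dt
        yaw += v / wheelbase * math.tan(steer) * dt
    return x, y, yaw

def pop_steer(pose, v, prev_steer, path_y=0.0,
              window=math.radians(5), candidates=11,
              wheelbase=2.5, dt=0.1, horizon=5):
    """Search [prev_steer - window, prev_steer + window] and return the
    candidate whose predicted position is closest to the line y = path_y."""
    best, best_err = prev_steer, float("inf")
    for i in range(candidates):
        cand = prev_steer - window + 2.0 * window * i / (candidates - 1)
        _, ny, _ = rollout(*pose, v, cand, wheelbase, dt, horizon)
        err = abs(ny - path_y)
        if err < best_err:
            best, best_err = cand, err
    return best

# Vehicle offset 0.5 m left of the lane centreline, heading straight:
steer = pop_steer(pose=(0.0, 0.5, 0.0), v=10.0, prev_steer=0.0)
print(steer < 0.0)  # True: a right (negative) steering correction is chosen
```

Restricting the search to a window around the previous command is what keeps the per-cycle cost low and the steering output smooth, which is the trade-off between tracking efficiency and computational cost the abstract highlights.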

Automatic Estimation of Sphere Centers from Images of Calibrated Cameras

Feb 24, 2020
Levente Hajder, Tekla Tóth, Zoltán Pusztai

Calibration of devices with different modalities is a key problem in robotic vision. Regular spatial objects, such as planes, are frequently used for this task. This paper deals with the automatic detection of ellipses in camera images, as well as the estimation of the 3D position of the spheres corresponding to the detected 2D ellipses. We propose two novel methods to (i) detect an ellipse in camera images and (ii) estimate the spatial location of the corresponding sphere if its size is known. The algorithms are tested both quantitatively and qualitatively. They are applied to calibrating the sensor system of autonomous cars equipped with digital cameras, depth sensors and LiDAR devices.

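As a back-of-envelope companion to the abstract: under a simple pinhole model, a sphere of known radius R whose image has apparent radius ρ pixels at focal length f lies at depth roughly Z ≈ f·R/ρ, with its centre ray through the ellipse centre. This first-order approximation is only a sketch — the paper's method handles the exact perspective geometry, where a sphere images as an ellipse whose centre is offset from the projected sphere centre. All numbers below are illustrative:

```python
# First-order pinhole approximation of a sphere centre from its image:
# depth from apparent size, then back-projection of the ellipse centre.
def approx_sphere_center(u, v, rho, R, f, cx, cy):
    """u, v: ellipse centre (px); rho: apparent radius (px);
    R: sphere radius (m); f: focal length (px); cx, cy: principal point."""
    Z = f * R / rho                  # depth from apparent size
    X = (u - cx) * Z / f             # back-project the ellipse centre
    Y = (v - cy) * Z / f
    return X, Y, Z

X, Y, Z = approx_sphere_center(u=740.0, v=520.0, rho=25.0, R=0.25,
                               f=1000.0, cx=640.0, cy=480.0)
print(round(Z, 1))  # 10.0 (metres, for these made-up numbers)
```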

Model-Based Task Transfer Learning

Mar 16, 2019
Charlott Vallon, Francesco Borrelli

A model-based task transfer learning (MBTTL) method is presented. We consider a constrained nonlinear dynamical system and assume that a dataset of state and input pairs that solve a task T1 is available. Our objective is to find a feasible state-feedback policy for a second task, T2, by using the stored data from T1. Our approach applies to tasks T2 which are composed of the same subtasks as T1, but in a different order. In this paper we formally introduce the definition of a subtask and the MBTTL problem, and provide examples of MBTTL in the fields of autonomous cars and manipulators. Then, a computationally efficient approach to solving the MBTTL problem is presented, along with proofs of feasibility for constrained linear dynamical systems. Simulation results show the effectiveness of the proposed method.

* 9 pages, 3 figures 
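The premise — T2 is the same subtasks as T1 in a different order, so T1's stored trajectory data can seed a policy for T2 — can be illustrated with a toy re-sequencing step. The subtask names and data below are invented for illustration and stand in for the paper's stored state/input trajectories:

```python
# Toy sketch of the MBTTL data-reuse idea: index task T1's stored
# (state, input) samples by subtask, then re-assemble them in T2's
# subtask order as a warm start for computing a feasible T2 policy.
def reorder_task_data(stored, t2_order):
    """stored: {subtask: [(state, input), ...]} collected while solving T1.
    Returns the samples re-sequenced to match T2's subtask order."""
    return [(sub, sample) for sub in t2_order for sample in stored[sub]]

t1_data = {
    "straight":   [((0.0, 0.0), 0.0)],
    "left_turn":  [((1.0, 0.0), 0.4)],
    "right_turn": [((2.0, 1.0), -0.4)],
}
t2 = reorder_task_data(t1_data, ["right_turn", "straight", "left_turn"])
print([sub for sub, _ in t2])  # ['right_turn', 'straight', 'left_turn']
```

The actual method then verifies feasibility of the resulting policy against the system's constraints, which the toy above does not attempt.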

Compound Multi-branch Feature Fusion for Real Image Restoration

Jun 02, 2022
Chi-Mao Fan, Tsung-Jung Liu, Kuan-Hsien Liu

Image restoration is a challenging, ill-posed and long-standing problem. However, most learning-based restoration methods target a single degradation type, which means they lack generalization. In this paper, we propose a multi-branch restoration model inspired by the Human Visual System (specifically, Retinal Ganglion Cells) that can perform multiple restoration tasks in a general framework. The experiments show that the proposed multi-branch architecture, called CMFNet, achieves competitive results on four datasets, covering image dehazing, deraindrop, and deblurring, which are very common applications for autonomous cars. The source code and pretrained models for the three restoration tasks are publicly available.

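The multi-branch-plus-fusion idea can be sketched in miniature. In CMFNet the branches are convolutional sub-networks and the fusion is learned; here each branch is a stand-in function and the fusion a weighted sum, purely to show the data flow:

```python
# Minimal sketch of multi-branch feature fusion: several branches process
# the same input and their outputs are combined by fusion weights.
# Branch behaviours and weights are illustrative stand-ins.
def branch_small(x):   # stand-in for a small-receptive-field branch
    return [v * 0.9 for v in x]

def branch_mid(x):     # stand-in for a mid-scale branch
    return [v * 1.0 for v in x]

def branch_large(x):   # stand-in for a large-scale branch
    return [v * 1.1 for v in x]

def fuse(branch_outputs, weights):
    # Per-element weighted sum of the branch outputs.
    n = len(branch_outputs[0])
    return [sum(w * out[i] for w, out in zip(weights, branch_outputs))
            for i in range(n)]

x = [1.0, 2.0, 3.0]
outs = [branch_small(x), branch_mid(x), branch_large(x)]
restored = fuse(outs, weights=[0.2, 0.5, 0.3])
```

Keeping branches specialised but fusing them at the end is what lets one network cover several degradation types (haze, raindrops, blur) instead of training one model per task.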

A Lane Merge Coordination Model for a V2X Scenario

Oct 20, 2020
Luis Sequeira, Adam Szefer, Jamie Slome, Toktam Mahmoodi

Cooperative driving using connectivity services has been a promising avenue for autonomous vehicles, with low latency and improved reliability provided by the 5th Generation Mobile Network (5G). In this paper, we present an application for lane-merge coordination for connected cars, based on a centralised system. This application delivers trajectory recommendations to the connected vehicles on the road, and comprises a Traffic Orchestrator as its main component. We apply machine learning and data analysis to predict whether a connected vehicle can successfully complete the cooperative manoeuvre of a lane merge. Furthermore, the acceleration and heading parameters necessary for the completion of a safe merge are elaborated. The results demonstrate the performance of several existing algorithms and how their main parameters were selected to avoid over-fitting.

* 2019 European Conference on Networks and Communications (EuCNC) 
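As a hedged illustration of the kinematics behind an acceleration recommendation (not the paper's learned model): to reach the centre of a target gap dx metres ahead within a merge horizon of t seconds, starting at speed v, a constant acceleration of a = 2(dx − v·t)/t² suffices, and its magnitude can be checked against a comfort bound. The numbers and the 3 m/s² limit are assumptions:

```python
# Constant-acceleration kinematics for a lane-merge recommendation:
# from dx = v*t + 0.5*a*t^2, solve for a, then check a comfort bound.
def merge_acceleration(dx, v, t):
    """Acceleration needed to travel dx metres in t seconds from speed v."""
    return 2.0 * (dx - v * t) / (t ** 2)

def merge_feasible(dx, v, t, a_max=3.0):
    # A merge is recommended only if the required acceleration is comfortable.
    return abs(merge_acceleration(dx, v, t)) <= a_max

a = merge_acceleration(dx=120.0, v=20.0, t=5.0)
print(a, merge_feasible(120.0, 20.0, 5.0))  # 1.6 True
```

The paper's Traffic Orchestrator layers machine learning on top of such kinematic constraints to predict whether the full cooperative manoeuvre will succeed, rather than relying on closed-form checks alone.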

On Encoding Temporal Evolution for Real-time Action Prediction

Feb 08, 2018
Fahimeh Rezazadegan, Sareh Shirazi, Mahsa Baktashmotlagh, Larry S. Davis

Anticipating future actions is a key component of intelligence, specifically when it applies to real-time systems, such as robots or autonomous cars. While recent works have addressed prediction of raw RGB pixel values, we focus on anticipating the motion evolution in future video frames. To this end, we construct dynamic images (DIs) by summarising moving pixels through a sequence of future frames. We train convolutional LSTMs to predict the next DIs through an unsupervised learning process, and then recognise the activity associated with the predicted DI. We demonstrate the effectiveness of our approach on 3 benchmark action datasets, showing that despite running on videos with complex activities, our approach is able to anticipate the next human action with high accuracy and obtain better results than the state-of-the-art methods.

* Submitted Version 
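A common way to construct a dynamic image, which may or may not match this paper's exact variant, is approximate rank pooling: each frame t of a T-frame window is weighted by α_t = 2t − T − 1, so later frames contribute positively and earlier ones negatively, collapsing the temporal evolution into one image. Frames below are flat pixel lists for simplicity:

```python
# Sketch: dynamic image (DI) via approximate rank-pooling weights
# alpha_t = 2t - T - 1 for t = 1..T over a window of T frames.
def dynamic_image(frames):
    T = len(frames)
    n = len(frames[0])
    di = [0.0] * n
    for t, frame in enumerate(frames, start=1):
        alpha = 2 * t - T - 1          # early frames negative, late positive
        for i in range(n):
            di[i] += alpha * frame[i]
    return di

# A pixel that brightens over time yields a positive DI response:
frames = [[0.0], [0.5], [1.0]]         # T = 3 -> alphas -2, 0, +2
print(dynamic_image(frames))           # [2.0]
```

Static pixels cancel out under these weights, which is why a DI summarises only the moving content the paper's convolutional LSTMs are trained to predict.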