"autonomous cars": models, code, and papers

Automatic Estimation of Sphere Centers from Images of Calibrated Cameras

Feb 24, 2020
Levente Hajder, Tekla Tóth, Zoltán Pusztai

Calibration of devices with different modalities is a key problem in robotic vision. Regular spatial objects, such as planes, are frequently used for this task. This paper deals with the automatic detection of ellipses in camera images and with estimating the 3D position of the spheres corresponding to the detected 2D ellipses. We propose two novel methods to (i) detect an ellipse in camera images and (ii) estimate the spatial location of the corresponding sphere if its size is known. The algorithms are tested both quantitatively and qualitatively. They are applied to calibrate the sensor system of autonomous cars equipped with digital cameras, depth sensors and LiDAR devices.
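
To illustrate why a known sphere size makes the 3D position recoverable, here is a minimal sketch of the simplest special case only: a sphere centered on the optical axis of a pinhole camera, where the contour is a circle rather than a general ellipse. This is not the authors' method. The function names, the focal length in pixels, and the contour radius in pixels are assumptions introduced for the example. It relies on the fact that a sphere of radius r at distance d is seen under a cone of half-angle alpha with sin(alpha) = r / d.

```python
import math


def on_axis_sphere_distance(radius_m, focal_px, contour_radius_px):
    """Distance from the camera centre to the sphere centre (hypothetical helper).

    Assumes a pinhole camera and a sphere centred on the optical axis, so the
    image contour is a circle of radius contour_radius_px.
    """
    if radius_m <= 0 or focal_px <= 0 or contour_radius_px <= 0:
        raise ValueError("all inputs must be positive")
    # Half-angle of the tangent cone, recovered from the image contour.
    half_angle = math.atan2(contour_radius_px, focal_px)
    # sin(half_angle) = radius / distance
    return radius_m / math.sin(half_angle)


def on_axis_contour_radius(radius_m, focal_px, distance_m):
    """Forward model: image contour radius of an on-axis sphere (hypothetical helper)."""
    if distance_m <= radius_m:
        raise ValueError("the camera centre must lie outside the sphere")
    half_angle = math.asin(radius_m / distance_m)
    return focal_px * math.tan(half_angle)


if __name__ == "__main__":
    rho = on_axis_contour_radius(0.5, 1000.0, 5.0)
    print(rho, on_axis_sphere_distance(0.5, 1000.0, rho))
```

For off-axis spheres the contour becomes an ellipse and the direction to the center must also be estimated, which is the general problem the paper addresses.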


Model-Based Task Transfer Learning

Mar 16, 2019
Charlott Vallon, Francesco Borrelli

A model-based task transfer learning (MBTTL) method is presented. We consider a constrained nonlinear dynamical system and assume that a dataset of state and input pairs that solve a task T1 is available. Our objective is to find a feasible state-feedback policy for a second task, T2, by using stored data from T1. Our approach applies to tasks T2 which are composed of the same subtasks as T1, but in a different order. In this paper we formally define subtasks and the MBTTL problem, and provide examples of MBTTL in the fields of autonomous cars and manipulators. Then, a computationally efficient approach to solve the MBTTL problem is presented along with proofs of feasibility for constrained linear dynamical systems. Simulation results show the effectiveness of the proposed method.

* 9 pages, 3 figures 

Compound Multi-branch Feature Fusion for Real Image Restoration

Jun 02, 2022
Chi-Mao Fan, Tsung-Jung Liu, Kuan-Hsien Liu

Image restoration is a challenging, ill-posed problem and a long-standing issue. However, most learning-based restoration methods target a single degradation type, which means they lack generalization. In this paper, we propose a multi-branch restoration model inspired by the Human Visual System (i.e., Retinal Ganglion Cells) that can perform multiple restoration tasks in a general framework. The experiments show that the proposed multi-branch architecture, called CMFNet, achieves competitive performance on four datasets, including image dehazing, deraindrop, and deblurring, which are very common applications for autonomous cars. The source code and pretrained models of three restoration tasks are available at


A Lane Merge Coordination Model for a V2X Scenario

Oct 20, 2020
Luis Sequeira, Adam Szefer, Jamie Slome, Toktam Mahmoodi

Cooperative driving using connectivity services has been a promising avenue for autonomous vehicles, with the low latency and further reliability support provided by 5th Generation Mobile Network (5G). In this paper, we present an application for lane merge coordination based on a centralised system, for connected cars. This application delivers trajectory recommendations to the connected vehicles on the road. The application comprises a Traffic Orchestrator as the main component. We apply machine learning and data analysis to predict whether a connected vehicle can successfully complete the cooperative manoeuvre of a lane merge. Furthermore, the acceleration and heading parameters that are necessary for the completion of a safe merge are elaborated. The results demonstrate the performance of several existing algorithms and how their main parameters were selected to avoid over-fitting.

* 2019 European Conference on Networks and Communications (EuCNC) 

On Encoding Temporal Evolution for Real-time Action Prediction

Feb 08, 2018
Fahimeh Rezazadegan, Sareh Shirazi, Mahsa Baktashmotlagh, Larry S. Davis

Anticipating future actions is a key component of intelligence, specifically when it applies to real-time systems, such as robots or autonomous cars. While recent works have addressed prediction of raw RGB pixel values, we focus on anticipating the motion evolution in future video frames. To this end, we construct dynamic images (DIs) by summarising moving pixels through a sequence of future frames. We train convolutional LSTMs to predict the next DIs based on an unsupervised learning process, and then recognise the activity associated with the predicted DI. We demonstrate the effectiveness of our approach on 3 benchmark action datasets, showing that despite running on videos with complex activities, our approach is able to anticipate the next human action with high accuracy and obtain better results than the state-of-the-art methods.
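
As a rough illustration of how a sequence of frames can be summarised into a single dynamic image, the sketch below uses a linear temporal weighting (weight 2t - T - 1 for frame t of T) that is sometimes used as an approximation to rank pooling. This is an assumption: the abstract does not state which construction the authors use, and the function names and the flattened-list frame format are hypothetical. Because the weights sum to zero, pixels that do not change across frames cancel out and only moving pixels contribute.

```python
def temporal_weights(num_frames):
    """Linear weights 2t - T - 1 for t = 1..T (assumed weighting; sums to zero)."""
    return [2 * t - num_frames - 1 for t in range(1, num_frames + 1)]


def dynamic_image(frames):
    """Collapse a list of equally sized, flattened frames into one image.

    Each frame is a list of pixel intensities. Later frames receive larger
    weights, so the result encodes the direction of temporal change.
    """
    if not frames:
        raise ValueError("at least one frame is required")
    n_pixels = len(frames[0])
    if any(len(frame) != n_pixels for frame in frames):
        raise ValueError("all frames must have the same number of pixels")
    weights = temporal_weights(len(frames))
    return [
        sum(w * frame[i] for w, frame in zip(weights, frames))
        for i in range(n_pixels)
    ]


if __name__ == "__main__":
    # First pixel brightens over time, second pixel is static.
    print(dynamic_image([[1.0, 1.0], [2.0, 1.0], [3.0, 1.0]]))
```

In this toy input the static pixel maps to zero while the brightening pixel yields a positive response, which is the behaviour one would want from a summary of moving pixels.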

* Submitted Version 

Massif: Interactive Interpretation of Adversarial Attacks on Deep Learning

Feb 16, 2020
Nilaksh Das, Haekyu Park, Zijie J. Wang, Fred Hohman, Robert Firstman, Emily Rogers, Duen Horng Chau

Deep neural networks (DNNs) are increasingly powering high-stakes applications such as autonomous cars and healthcare; however, DNNs are often treated as "black boxes" in such applications. Recent research has also revealed that DNNs are highly vulnerable to adversarial attacks, raising serious concerns over deploying DNNs in the real world. To overcome these deficiencies, we are developing Massif, an interactive tool for deciphering adversarial attacks. Massif identifies and interactively visualizes neurons and their connections inside a DNN that are strongly activated or suppressed by an adversarial attack. Massif provides both a high-level, interpretable overview of the effect of an attack on a DNN, and a low-level, detailed description of the affected neurons. These tightly coupled views in Massif help people better understand which input features are most vulnerable or important for correct predictions.

* To appear in ACM Conference on Human Factors in Computing Systems (CHI) Late-Breaking Works 2020, 7 pages 

End to End Vehicle Lateral Control Using a Single Fisheye Camera

Aug 20, 2018
Marin Toromanoff, Emilie Wirbel, Frédéric Wilhelm, Camilo Vejarano, Xavier Perrotton, Fabien Moutarde

Convolutional neural networks are commonly used to control the steering angle for autonomous cars. Most of the time, multiple long-range cameras are used to generate lateral failure cases. In this paper we present a novel model to generate this data and label augmentation using only one short-range fisheye camera. We present our simulator and how it can be used as a consistent metric for lateral end-to-end control evaluation. Experiments are conducted on a custom dataset corresponding to more than 10000 km and 200 hours of open road driving. Finally we evaluate this model on real-world driving scenarios, open road and a custom test track with challenging obstacle avoidance and sharp turns. In our simulator based on real-world videos, the final model was capable of more than 99% autonomy on urban road.

* 7-page paper accepted at IROS 2018 

Motion Classification and Height Estimation of Pedestrians Using Sparse Radar Data

Mar 03, 2021
Markus Horn, Ole Schumann, Markus Hahn, Jürgen Dickmann, Klaus Dietmayer

A complete overview of the surrounding vehicle environment is important for driver assistance systems and highly autonomous driving. Fusing results of multiple sensor types like camera, radar and lidar is crucial for increasing the robustness. The detection and classification of objects like cars, bicycles or pedestrians has been analyzed in the past for many sensor types. Beyond that, it is also helpful to refine these classes and distinguish for example between different pedestrian types or activities. This task is usually performed on camera data, though recent developments are based on radar spectrograms. However, for most automotive radar systems, it is only possible to obtain radar targets instead of the original spectrograms. This work demonstrates that it is possible to estimate the body height of walking pedestrians using 2D radar targets. Furthermore, different pedestrian motion types are classified.

* 2018 Sensor Data Fusion: Trends, Solutions, Applications (SDF) 
* 6 pages, 6 figures, 1 table 

It Is Not the Journey but the Destination: Endpoint Conditioned Trajectory Prediction

Apr 04, 2020
Karttikeya Mangalam, Harshayu Girase, Shreyas Agarwal, Kuan-Hui Lee, Ehsan Adeli, Jitendra Malik, Adrien Gaidon

Human trajectory forecasting with multiple socially interacting agents is of critical importance for autonomous navigation in human environments, e.g., for self-driving cars and social robots. In this work, we present Predicted Endpoint Conditioned Network (PECNet) for flexible human trajectory prediction. PECNet infers distant trajectory endpoints to assist in long-range multi-modal trajectory prediction. A novel non-local social pooling layer enables PECNet to infer diverse yet socially compliant trajectories. Additionally, we present a simple "truncation-trick" for improving few-shot multi-modal trajectory prediction performance. We show that PECNet improves state-of-the-art performance on the Stanford Drone trajectory prediction benchmark by ~19.5% and on the ETH/UCY benchmark by ~40.8%.

* 14 pages, 6 figures, 3 tables