Get our free extension to see links to code for papers anywhere online!

Chrome logo  Add to Chrome

Firefox logo Add to Firefox

"autonomous cars": models, code, and papers

Meta Learning MPC using Finite-Dimensional Gaussian Process Approximations

Aug 13, 2020
Elena Arcari, Andrea Carron, Melanie N. Zeilinger

Data availability has dramatically increased in recent years, driving model-based control methods to exploit learning techniques for improving the system description, and thus control performance. Two key factors that hinder the practical applicability of learning methods in control are their high computational complexity and limited generalization capabilities to unseen conditions. Meta-learning is a powerful tool that enables efficient learning across a finite set of related tasks, easing adaptation to new unseen tasks. This paper makes use of a meta-learning approach for adaptive model predictive control, by learning a system model that leverages data from previous related tasks, while enabling fast fine-tuning to the current task during closed-loop operation. The dynamics is modeled via Gaussian process regression and, building on the Karhunen-Lo{\`e}ve expansion, can be approximately reformulated as a finite linear combination of kernel eigenfunctions. Using data collected over a set of tasks, the eigenfunction hyperparameters are optimized in a meta-training phase by maximizing a variational bound for the log-marginal likelihood. During meta-testing, the eigenfunctions are fixed, so that only the linear parameters are adapted to the new unseen task in an online adaptive fashion via Bayesian linear regression, providing a simple and efficient inference scheme. Simulation results are provided for autonomous racing with miniature race cars adapting to unseen road conditions.

Access Paper or Ask Questions

Complex-YOLO: Real-time 3D Object Detection on Point Clouds

Sep 24, 2018
Martin Simon, Stefan Milz, Karl Amende, Horst-Michael Gross

Lidar based 3D object detection is inevitable for autonomous driving, because it directly links to environmental understanding and therefore builds the base for prediction and motion planning. The capacity of inferencing highly sparse 3D data in real-time is an ill-posed problem for lots of other application areas besides automated vehicles, e.g. augmented reality, personal robotics or industrial automation. We introduce Complex-YOLO, a state of the art real-time 3D object detection network on point clouds only. In this work, we describe a network that expands YOLOv2, a fast 2D standard object detector for RGB images, by a specific complex regression strategy to estimate multi-class 3D boxes in Cartesian space. Thus, we propose a specific Euler-Region-Proposal Network (E-RPN) to estimate the pose of the object by adding an imaginary and a real fraction to the regression network. This ends up in a closed complex space and avoids singularities, which occur by single angle estimations. The E-RPN supports to generalize well during training. Our experiments on the KITTI benchmark suite show that we outperform current leading methods for 3D object detection specifically in terms of efficiency. We achieve state of the art results for cars, pedestrians and cyclists by being more than five times faster than the fastest competitor. Further, our model is capable of estimating all eight KITTI-classes, including Vans, Trucks or sitting pedestrians simultaneously with high accuracy.

Access Paper or Ask Questions

Unsupervised Machine Learning for Networking: Techniques, Applications and Research Challenges

Sep 19, 2017
Muhammad Usama, Junaid Qadir, Aunn Raza, Hunain Arif, Kok-Lim Alvin Yau, Yehia Elkhatib, Amir Hussain, Ala Al-Fuqaha

While machine learning and artificial intelligence have long been applied in networking research, the bulk of such works has focused on supervised learning. Recently there has been a rising trend of employing unsupervised machine learning using unstructured raw network data to improve network performance and provide services such as traffic engineering, anomaly detection, Internet traffic classification, and quality of service optimization. The interest in applying unsupervised learning techniques in networking emerges from their great success in other fields such as computer vision, natural language processing, speech recognition, and optimal control (e.g., for developing autonomous self-driving cars). Unsupervised learning is interesting since it can unconstrain us from the need of labeled data and manual handcrafted feature engineering thereby facilitating flexible, general, and automated methods of machine learning. The focus of this survey paper is to provide an overview of the applications of unsupervised learning in the domain of networking. We provide a comprehensive survey highlighting the recent advancements in unsupervised learning techniques and describe their applications for various learning tasks in the context of networking. We also provide a discussion on future directions and open research issues, while also identifying potential pitfalls. While a few survey papers focusing on the applications of machine learning in networking have previously been published, a survey of similar scope and breadth is missing in literature. Through this paper, we advance the state of knowledge by carefully synthesizing the insights from these survey papers while also providing contemporary coverage of recent advances.

Access Paper or Ask Questions

SRVIO: Super Robust Visual Inertial Odometry for dynamic environments and challenging Loop-closure conditions

Jan 14, 2022
Ali Samadzadeh, Ahmad Nickabadi

The visual localization or odometry problem is a well-known challenge in the field of autonomous robots and cars. Traditionally, this problem can ba tackled with the help of expensive sensors such as lidars. Nowadays, the leading research is on robust localization using economic sensors, such as cameras and IMUs. The geometric methods based on these sensors are pretty good in normal conditions withstable lighting and no dynamic objects. These methods suffer from significant loss and divergence in such challenging environments. The scientists came to use deep neural networks (DNNs) as the savior to mitigate this problem. The main idea behind using DNNs was to better understand the problem inside the data and overcome complex conditions (such as a dynamic object in front of the camera, extreme lighting conditions, keeping the track at high speeds, etc.) The prior endto-end DNN methods are able to overcome some of the mentioned challenges. However, no general and robust framework for all of these scenarios is available. In this paper, we have combined geometric and DNN based methods to have the pros of geometric SLAM frameworks and overcome the remaining challenges with the DNNs help. To do this, we have modified the Vins-Mono framework (the most robust and accurate framework till now) and we were able to achieve state-of-the-art results on TUM-Dynamic, TUM-VI, ADVIO and EuRoC datasets compared to geometric and end-to-end DNN based SLAMs. Our proposed framework was also able to achieve acceptable results on extreme simulated cases resembling the challenges mentioned earlier easy.

* 11 pages, 7 figures 
Access Paper or Ask Questions

CT-ICP: Real-time Elastic LiDAR Odometry with Loop Closure

Sep 27, 2021
Pierre Dellenbach, Jean-Emmanuel Deschaud, Bastien Jacquet, François Goulette

Multi-beam LiDAR sensors are increasingly used in robotics, particularly for autonomous cars for localization and perception tasks. However, perception is closely linked to the localization task and the robot's ability to build a fine map of its environment. For this, we propose a new real-time LiDAR odometry method called CT-ICP, as well as a complete SLAM with loop closure. The principle of CT-ICP is to use an elastic formulation of the trajectory, with a continuity of poses intra-scan and discontinuity between scans, to be more robust to high frequencies in the movements of the sensor. The registration is based on scan-to-map with a dense point cloud as map structured in sparse voxels to operate in real time. At the same time, a fast method of loop closure detection using elevation images and an optimization of poses by graph allows to obtain a complete SLAM purely on LiDAR. To show the robustness of the method, we tested it on seven datasets: KITTI, KITTI-raw, KITTI-360, KITTI-CARLA, ParisLuco, Newer College, and NCLT in driving and high-frequency motion scenarios. The CT-ICP odometry is implemented in C++ and available online. The loop detection and pose graph optimization is in the framework pyLiDAR-SLAM in Python and also available online. CT-ICP is currently first, among those giving access to a public code, on the KITTI odometry leaderboard, with an average Relative Translation Error (RTE) of 0.59% and an average time per scan of 60ms on a CPU with a single thread.

* 6 pages 
Access Paper or Ask Questions

RestoreX-AI: A Contrastive Approach towards Guiding Image Restoration via Explainable AI Systems

Apr 03, 2022
Aboli Marathe, Pushkar Jain, Rahee Walambe, Ketan Kotecha

Modern applications such as self-driving cars and drones rely heavily upon robust object detection techniques. However, weather corruptions can hinder the object detectability and pose a serious threat to their navigation and reliability. Thus, there is a need for efficient denoising, deraining, and restoration techniques. Generative adversarial networks and transformers have been widely adopted for image restoration. However, the training of these methods is often unstable and time-consuming. Furthermore, when used for object detection (OD), the output images generated by these methods may provide unsatisfactory results despite image clarity. In this work, we propose a contrastive approach towards mitigating this problem, by evaluating images generated by restoration models during and post training. This approach leverages OD scores combined with attention maps for predicting the usefulness of restored images for the OD task. We conduct experiments using two novel use-cases of conditional GANs and two transformer methods that probe the robustness of the proposed approach on multi-weather corruptions in the OD task. Our approach achieves an averaged 178 percent increase in mAP between the input and restored images under adverse weather conditions like dust tornadoes and snowfall. We report unique cases where greater denoising does not improve OD performance and conversely where noisy generated images demonstrate good results. We conclude the need for explainability frameworks to bridge the gap between human and machine perception, especially in the context of robust object detection for autonomous vehicles.

* Under review 
Access Paper or Ask Questions

ABMOF: A Novel Optical Flow Algorithm for Dynamic Vision Sensors

May 10, 2018
Min Liu, Tobi Delbruck

Dynamic Vision Sensors (DVS), which output asynchronous log intensity change events, have potential applications in high-speed robotics, autonomous cars and drones. The precise event timing, sparse output, and wide dynamic range of the events are well suited for optical flow, but conventional optical flow (OF) algorithms are not well matched to the event stream data. This paper proposes an event-driven OF algorithm called adaptive block-matching optical flow (ABMOF). ABMOF uses time slices of accumulated DVS events. The time slices are adaptively rotated based on the input events and OF results. Compared with other methods such as gradient-based OF, ABMOF can efficiently be implemented in compact logic circuits. Results show that ABMOF achieves comparable accuracy to conventional standards such as Lucas-Kanade (LK). The main contributions of our paper are new adaptive time-slice rotation methods that ensure the generated slices have sufficient features for matching,including a feedback mechanism that controls the generated slices to have average slice displacement within the block search range. An LK method using our adapted slices is also implemented. The ABMOF accuracy is compared with this LK method on natural scene data including sparse and dense texture, high dynamic range, and fast motion exceeding 30,000 pixels per second.The paper dataset and source code are available from

* 11 pages, 10 figures, Video of result: 
Access Paper or Ask Questions

All-Weather Object Recognition Using Radar and Infrared Sensing

Oct 30, 2020
Marcel Sheeny

Autonomous cars are an emergent technology which has the capacity to change human lives. The current sensor systems which are most capable of perception are based on optical sensors. For example, deep neural networks show outstanding results in recognising objects when used to process data from cameras and Light Detection And Ranging (LiDAR) sensors. However these sensors perform poorly under adverse weather conditions such as rain, fog, and snow due to the sensor wavelengths. This thesis explores new sensing developments based on long wave polarised infrared (IR) imagery and imaging radar to recognise objects. First, we developed a methodology based on Stokes parameters using polarised infrared data to recognise vehicles using deep neural networks. Second, we explored the potential of using only the power spectrum captured by low-THz radar sensors to perform object recognition in a controlled scenario. This latter work is based on a data-driven approach together with the development of a data augmentation method based on attenuation, range and speckle noise. Last, we created a new large-scale dataset in the "wild" with many different weather scenarios (sunny, overcast, night, fog, rain and snow) showing radar robustness to detect vehicles in adverse weather. High resolution radar and polarised IR imagery, combined with a deep learning approach, are shown as a potential alternative to current automotive sensing systems based on visible spectrum optical technology as they are more robust in severe weather and adverse light conditions.

* 201 pages, PhD Thesis, Heriot-Watt University, October 2020. Includes results from arXiv:1804.02576, arXiv:1912.03157, and arXiv:2010.09076 
Access Paper or Ask Questions

Depth-aware Glass Surface Detection with Cross-modal Context Mining

Jun 22, 2022
Jiaying Lin, Yuen Hei Yeung, Rynson W. H. Lau

Glass surfaces are becoming increasingly ubiquitous as modern buildings tend to use a lot of glass panels. This however poses substantial challenges on the operations of autonomous systems such as robots, self-driving cars and drones, as the glass panels can become transparent obstacles to the navigation.Existing works attempt to exploit various cues, including glass boundary context or reflections, as a prior. However, they are all based on input RGB images.We observe that the transmission of 3D depth sensor light through glass surfaces often produces blank regions in the depth maps, which can offer additional insights to complement the RGB image features for glass surface detection. In this paper, we propose a novel framework for glass surface detection by incorporating RGB-D information, with two novel modules: (1) a cross-modal context mining (CCM) module to adaptively learn individual and mutual context features from RGB and depth information, and (2) a depth-missing aware attention (DAA) module to explicitly exploit spatial locations where missing depths occur to help detect the presence of glass surfaces. In addition, we propose a large-scale RGB-D glass surface detection dataset, called \textit{RGB-D GSD}, for RGB-D glass surface detection. Our dataset comprises 3,009 real-world RGB-D glass surface images with precise annotations. Extensive experimental results show that our proposed model outperforms state-of-the-art methods.

Access Paper or Ask Questions

Fully Convolutional One-Stage 3D Object Detection on LiDAR Range Images

May 27, 2022
Zhi Tian, Xiangxiang Chu, Xiaoming Wang, Xiaolin Wei, Chunhua Shen

We present a simple yet effective fully convolutional one-stage 3D object detector for LiDAR point clouds of autonomous driving scenes, termed FCOS-LiDAR. Unlike the dominant methods that use the bird-eye view (BEV), our proposed detector detects objects from the range view (RV, a.k.a. range image) of the LiDAR points. Due to the range view's compactness and compatibility with the LiDAR sensors' sampling process on self-driving cars, the range view-based object detector can be realized by solely exploiting the vanilla 2D convolutions, departing from the BEV-based methods which often involve complicated voxelization operations and sparse convolutions. For the first time, we show that an RV-based 3D detector with standard 2D convolutions alone can achieve comparable performance to state-of-the-art BEV-based detectors while being significantly faster and simpler. More importantly, almost all previous range view-based detectors only focus on single-frame point clouds, since it is challenging to fuse multi-frame point clouds into a single range view. In this work, we tackle this challenging issue with a novel range view projection mechanism, and for the first time demonstrate the benefits of fusing multi-frame point clouds for a range-view based detector. Extensive experiments on nuScenes show the superiority of our proposed method and we believe that our work can be strong evidence that an RV-based 3D detector can compare favourably with the current mainstream BEV-based detectors.

* 12 pages 
Access Paper or Ask Questions