John McDonald

Fast and Efficient Scene Categorization for Autonomous Driving using VAEs

Oct 26, 2022
Saravanabalagi Ramachandran, Jonathan Horgan, Ganesh Sistu, John McDonald

Scene categorization is a useful precursor task that provides prior knowledge for many advanced computer vision tasks, with a broad range of applications in content-based image indexing and retrieval systems. Despite the success of data-driven approaches in computer vision tasks such as object detection and semantic segmentation, their application to learning high-level features for scene recognition has not achieved the same level of success. We propose to generate a fast and efficient intermediate, interpretable, generalized global descriptor that captures coarse features from the image, and to use a classification head to map the descriptors to three scene categories: Rural, Urban and Suburban. We train a Variational Autoencoder (VAE) in an unsupervised manner, map images to a constrained multi-dimensional latent space, and use the latent vectors as compact embeddings that serve as global descriptors for images. The experimental results show that the VAE latent vectors capture coarse information from the image, supporting their usage as global descriptors. The proposed global descriptor is very compact with an embedding length of 128, is significantly faster to compute, and is robust to seasonal and illumination changes, while capturing sufficient scene information for scene categorization.
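
As a rough illustration of the pipeline described above (not the authors' implementation), the following PyTorch-style sketch shows a convolutional VAE encoder producing a 128-dimensional latent vector that doubles as the global descriptor, with a small classification head mapping it to the three scene categories. The layer sizes, names, and input resolution are assumptions made purely for illustration.

```python
# Illustrative sketch (not the authors' code): a convolutional VAE encoder
# yields a 128-D latent vector used as a global image descriptor, and a
# lightweight head maps it to the three scene categories.
import torch
import torch.nn as nn

LATENT_DIM = 128   # embedding length reported in the abstract
NUM_CLASSES = 3    # Rural, Urban, Suburban

class VAEEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.fc_mu = nn.Linear(128, LATENT_DIM)
        self.fc_logvar = nn.Linear(128, LATENT_DIM)

    def forward(self, x):
        h = self.conv(x)
        mu, logvar = self.fc_mu(h), self.fc_logvar(h)
        # Reparameterisation trick samples z during training;
        # at inference the mean mu is typically used as the descriptor.
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        return z, mu, logvar

encoder = VAEEncoder()
classifier = nn.Linear(LATENT_DIM, NUM_CLASSES)   # classification head

image = torch.randn(1, 3, 128, 128)      # dummy RGB input (resolution assumed)
_, descriptor, _ = encoder(image)        # 128-D global descriptor
scene_logits = classifier(descriptor)    # scores for Rural/Urban/Suburban
```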

* Published in the 24th Irish Machine Vision and Image Processing Conference (IMVIP), 2022, pages 9-16 

GP-net: Grasp Proposal for Mobile Manipulators

Sep 21, 2022
Anna Konrad, John McDonald, Rudi Villing

We present the Grasp Proposal Network (GP-net), a Convolutional Neural Network model that can generate 6-DOF grasps for mobile manipulators. To train GP-net, we synthetically generate a dataset containing depth images and ground-truth grasp information for more than 1400 objects. In real-world experiments, we use the EGAD! grasping benchmark to evaluate GP-net against two commonly used algorithms, the Volumetric Grasping Network (VGN) and the Grasp Pose Detection package (GPD), on a PAL TIAGo mobile manipulator. GP-net achieves a grasp success rate of 82.2%, compared to 57.8% for VGN and 63.3% for GPD. In contrast to state-of-the-art methods in robotic grasping, GP-net can be used out-of-the-box for grasping objects with mobile manipulators without limiting the workspace, requiring table segmentation or needing a high-end GPU. To encourage the usage of GP-net, we provide a ROS package along with our code and pre-trained models at https://aucoroboticsmu.github.io/GP-net/.


Woodscape Fisheye Object Detection for Autonomous Driving -- CVPR 2022 OmniCV Workshop Challenge

Jun 26, 2022
Saravanabalagi Ramachandran, Ganesh Sistu, Varun Ravi Kumar, John McDonald, Senthil Yogamani

Object detection is a comprehensively studied problem in autonomous driving. However, it has been relatively less explored for fisheye cameras, where the strong radial distortion breaks the translation-invariance inductive bias of Convolutional Neural Networks. To address this, we present the WoodScape fisheye object detection challenge for autonomous driving, held as part of the CVPR 2022 Workshop on Omnidirectional Computer Vision (OmniCV). This is one of the first competitions focused on fisheye camera object detection. We encouraged the participants to design models which work natively on fisheye images without rectification. We used CodaLab to host the competition based on the publicly available WoodScape fisheye dataset. In this paper, we provide a detailed analysis of the competition, which attracted 120 global teams and a total of 1492 submissions. We briefly discuss the details of the winning methods and analyze their qualitative and quantitative results.

* Workshop on Omnidirectional Computer Vision (OmniCV) at Conference on Computer Vision and Pattern Recognition (CVPR) 2022 

ViT-BEVSeg: A Hierarchical Transformer Network for Monocular Birds-Eye-View Segmentation

May 31, 2022
Pramit Dutta, Ganesh Sistu, Senthil Yogamani, Edgar Galván, John McDonald

Generating a detailed near-field perceptual model of the environment is an important and challenging problem in both self-driving vehicles and autonomous mobile robotics. A Bird's-Eye-View (BEV) map, providing a panoptic representation, is a commonly used approach that offers a simplified 2D representation of the vehicle's surroundings with accurate semantic-level segmentation for many downstream tasks. Current state-of-the-art approaches to generating BEV maps employ a Convolutional Neural Network (CNN) backbone to create feature maps that are passed through a spatial transformer to project the derived features onto the BEV coordinate frame. In this paper, we evaluate the use of vision transformers (ViT) as a backbone architecture to generate BEV maps. Our network architecture, ViT-BEVSeg, employs standard vision transformers to generate a multi-scale representation of the input image. The resulting representation is then provided as input to a spatial transformer decoder module, which outputs segmentation maps in the BEV grid. We evaluate our approach on the nuScenes dataset, demonstrating a considerable performance improvement relative to state-of-the-art approaches.
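
To make the described data flow concrete, the sketch below shows, in simplified PyTorch, a ViT-style backbone whose intermediate token outputs form a multi-scale representation, followed by an attention-based decoder that maps them onto a BEV grid. This is a hedged approximation of the pipeline, not the published ViT-BEVSeg architecture; all module names, depths, dimensions, and class counts are illustrative assumptions.

```python
# Illustrative data-flow sketch (not the published ViT-BEVSeg model):
# ViT-style backbone -> multi-scale tokens -> BEV decoder -> segmentation grid.
import torch
import torch.nn as nn

class TinyViTBackbone(nn.Module):
    """Patch embedding plus transformer blocks; outputs tapped at several
    depths provide a multi-scale token representation of the image."""
    def __init__(self, dim=192, depth=8, taps=(3, 5, 7)):
        super().__init__()
        self.patch_embed = nn.Conv2d(3, dim, kernel_size=16, stride=16)
        self.blocks = nn.ModuleList(
            nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
            for _ in range(depth)
        )
        self.taps = set(taps)

    def forward(self, x):
        tokens = self.patch_embed(x).flatten(2).transpose(1, 2)  # (B, N, dim)
        feats = []
        for i, blk in enumerate(self.blocks):
            tokens = blk(tokens)
            if i in self.taps:
                feats.append(tokens)
        return feats

class BEVDecoder(nn.Module):
    """Attends from learned BEV-cell queries to image tokens, then
    predicts per-cell class scores on the BEV grid."""
    def __init__(self, dim=192, bev_size=50, num_classes=14):
        super().__init__()
        self.query = nn.Parameter(torch.randn(bev_size * bev_size, dim))
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.head = nn.Linear(dim, num_classes)
        self.bev_size = bev_size

    def forward(self, feats):
        tokens = torch.cat(feats, dim=1)                    # fuse the scales
        q = self.query.unsqueeze(0).expand(tokens.size(0), -1, -1)
        bev, _ = self.attn(q, tokens, tokens)               # image -> BEV transfer
        logits = self.head(bev)                             # (B, H*W, classes)
        return logits.transpose(1, 2).reshape(
            tokens.size(0), -1, self.bev_size, self.bev_size)

img = torch.randn(1, 3, 224, 224)
backbone, decoder = TinyViTBackbone(), BEVDecoder()
bev_map = decoder(backbone(img))   # (1, num_classes, 50, 50) BEV segmentation
```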

* Accepted for 2022 IEEE World Congress on Computational Intelligence (Track: IJCNN) 

2.5D Vehicle Odometry Estimation

Nov 16, 2021
Ciaran Eising, Leroy-Francisco Pereira, Jonathan Horgan, Anbuchezhiyan Selvaraju, John McDonald, Paul Moran

It is well understood that ADAS applications require a good estimate of the pose of the vehicle. This paper proposes a metaphorically named 2.5D odometry, whereby planar odometry derived from the yaw-rate sensor and four wheel-speed sensors is augmented by a linear model of the suspension. While the core of the planar odometry is a yaw-rate model that is already understood in the literature, we augment it by fitting a quadratic to the incoming signals, enabling interpolation, extrapolation, and a finer integration of the vehicle position. We show, through experimental results with a DGPS/IMU reference, that this model provides highly accurate odometry estimates compared with existing methods. Utilising sensors that return the change in height of vehicle reference points under changing suspension configurations, we define a planar model of the vehicle suspension, thus augmenting the odometry model. We present an experimental framework and evaluation criteria by which the goodness of the odometry is evaluated and compared with existing methods. This odometry model has been designed to support well-known low-speed surround-view camera systems. Thus, we present application results that show a performance boost for viewing and computer vision applications using the proposed odometry.
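
For readers unfamiliar with the yaw-rate model at the core of the planar odometry, the following minimal sketch integrates wheel-speed and yaw-rate signals into a 2D pose by dead reckoning. It deliberately omits the paper's quadratic signal fitting and suspension augmentation, and the function and parameter names are illustrative assumptions rather than the authors' implementation.

```python
# Minimal planar dead-reckoning sketch (not the paper's full 2.5D model).
import math

def integrate_planar_odometry(samples, x=0.0, y=0.0, yaw=0.0):
    """samples: iterable of (dt_seconds, speed_m_per_s, yaw_rate_rad_per_s),
    e.g. speed averaged from the four wheel-speed sensors."""
    poses = [(x, y, yaw)]
    for dt, v, yaw_rate in samples:
        # Mid-point integration of the heading over the interval.
        yaw_mid = yaw + 0.5 * yaw_rate * dt
        x += v * dt * math.cos(yaw_mid)
        y += v * dt * math.sin(yaw_mid)
        yaw += yaw_rate * dt
        poses.append((x, y, yaw))
    return poses

# Example: 1 s of driving at 5 m/s while turning at 0.1 rad/s, sampled at 100 Hz.
trajectory = integrate_planar_odometry([(0.01, 5.0, 0.1)] * 100)
print(trajectory[-1])   # final (x, y, yaw) estimate
```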

* IET Intelligent Transport Systems, 2020  
* 13 pages, 16 figures, 2 tables 

Woodscape Fisheye Semantic Segmentation for Autonomous Driving -- CVPR 2021 OmniCV Workshop Challenge

Jul 17, 2021
Saravanabalagi Ramachandran, Ganesh Sistu, John McDonald, Senthil Yogamani

We present the WoodScape fisheye semantic segmentation challenge for autonomous driving, which was held as part of the CVPR 2021 Workshop on Omnidirectional Computer Vision (OmniCV). This challenge is one of the first opportunities for the research community to evaluate semantic segmentation techniques targeted at fisheye camera perception. Due to the strong radial distortion, standard models do not generalize well to fisheye images, and hence the deformations in the visual appearance of objects and entities need to be encoded implicitly or as explicit knowledge. The challenge served as a medium to investigate the complexities of perception on fisheye images and new methodologies to handle them. It was hosted on CodaLab and used the recently released WoodScape dataset comprising 10k samples. In this paper, we provide a summary of the competition, which attracted 71 global teams and a total of 395 submissions. The top teams recorded significantly improved mean IoU and accuracy scores over the baseline PSPNet with a ResNet-50 backbone. We summarize the methods of the winning algorithms and analyze the failure cases. We conclude by providing directions for future research.

* Workshop on Omnidirectional Computer Vision (OmniCV) at Conference on Computer Vision and Pattern Recognition (CVPR) 2021. Presentation video is available at https://youtu.be/xa7Fl2mD4CA?t=12253 

OdoViz: A 3D Odometry Visualization and Processing Tool

Jul 15, 2021
Saravanabalagi Ramachandran, John McDonald

OdoViz is a reactive web-based tool for 3D visualization and processing of autonomous vehicle datasets designed to support common tasks in visual place recognition research. The system includes functionality for loading, inspecting, visualizing, and processing GPS/INS poses, point clouds and camera images. It supports a number of commonly used driving datasets and can be adapted to load custom datasets with minimal effort. OdoViz's design consists of a slim server to serve the datasets coupled with a rich client frontend. This design supports multiple deployment configurations including single user stand-alone installations, research group installations serving datasets internally across a lab, or publicly accessible web-frontends for providing online interfaces for exploring and interacting with datasets. The tool allows viewing complete vehicle trajectories traversed at multiple different time periods simultaneously, facilitating tasks such as sub-sampling, comparing and finding pose correspondences both across and within sequences. This significantly reduces the effort required in creating subsets of data from existing datasets for machine learning tasks. Further to the above, the system also supports adding custom extensions and plugins to extend the capabilities of the software for other potential data management, visualization and processing tasks. The platform has been open-sourced to promote its use and encourage further contributions from the research community.
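
As an illustration of one such data-preparation task (and not OdoViz's actual API), the sketch below thins a dense trajectory so that retained poses are separated by at least a minimum distance, the kind of sub-sampling commonly used to build place-recognition subsets. The function name and parameters are hypothetical.

```python
# Generic distance-based pose sub-sampling sketch; not OdoViz's own API.
import math

def subsample_poses(poses, min_gap=5.0):
    """poses: list of (x, y) positions in metres, in traversal order.
    Keeps a pose only if it is at least `min_gap` metres from the last kept one."""
    if not poses:
        return []
    kept = [poses[0]]
    for p in poses[1:]:
        last = kept[-1]
        if math.hypot(p[0] - last[0], p[1] - last[1]) >= min_gap:
            kept.append(p)
    return kept

dense = [(0.5 * i, 0.0) for i in range(100)]   # poses every 0.5 m along x
sparse = subsample_poses(dense, min_gap=5.0)   # roughly one pose every 5 m
print(len(dense), "->", len(sparse))
```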

* Accepted, ITSC 2021 

A 2.5D Vehicle Odometry Estimation for Vision Applications

May 06, 2021
Paul Moran, Leroy-Francisco Pereira, Anbuchezhiyan Selvaraju, Tejash Prakash, Pantelis Ermilios, John McDonald, Jonathan Horgan, Ciarán Eising

This paper proposes a method to estimate the pose of a sensor mounted on a vehicle as the vehicle moves through the world, an important topic for autonomous driving systems. Based on a set of commonly deployed vehicular odometric sensors, with outputs available on automotive communication buses (e.g. CAN or FlexRay), we describe a set of steps to combine a planar odometry based on wheel sensors with a suspension model based on linear suspension sensors. The aim is to determine a more accurate estimate of the camera pose. We outline its usage for applications in both visualisation and computer vision.

* Proceedings of the 2020 Irish Machine Vision and Image Processing Conference  

Vision-based Driver Assistance Systems: Survey, Taxonomy and Advances

Apr 26, 2021
Jonathan Horgan, Ciarán Hughes, John McDonald, Senthil Yogamani

Vision-based driver assistance systems constitute one of the rapidly growing research areas of ITS, due to various factors such as the increased level of safety requirements in the automotive industry, greater computational power in embedded systems, and the desire to move closer to autonomous driving. It is a cross-disciplinary area encompassing specialised fields like computer vision, machine learning, robotic navigation, embedded systems, automotive electronics and safety-critical software. In this paper, we survey vision-based advanced driver assistance systems with a consistent terminology and propose a taxonomy. We also propose an abstract model in an attempt to formalize a top-down view of application development to scale towards autonomous driving systems.

* 2015 IEEE 18th International Conference on Intelligent Transportation Systems  

Computer vision in automated parking systems: Design, implementation and challenges

Apr 26, 2021
Markus Heimberger, Jonathan Horgan, Ciaran Hughes, John McDonald, Senthil Yogamani

Automated driving is an active area of research in both industry and academia. Automated parking, which is automated driving in the restricted scenario of parking with low-speed manoeuvring, is a key enabling product for fully autonomous driving systems. It is also an important milestone towards a higher-end system built from the previous generation of driver assistance systems comprising collision warning, pedestrian detection, etc. In this paper, we discuss the design and implementation of an automated parking system from the perspective of computer vision algorithms. Designing a low-cost system with functional safety is challenging, and handling all the corner cases leads to a large gap between the prototype and the end product. We demonstrate how camera systems are crucial for addressing a range of automated parking use cases and also for adding robustness to systems based on active distance-measuring sensors, such as ultrasonics and radar. The key vision modules which realize the parking use cases are 3D reconstruction, parking slot marking recognition, freespace and vehicle/pedestrian detection. We detail the important parking use cases and demonstrate how to combine the vision modules to form a robust parking system. To the best of the authors' knowledge, this is the first detailed discussion of a systemic view of a commercial automated parking system.

* Image and Vision Computing, Volume 68, December 2017, Pages 88-101  