Shreyas S. Shivakumar

Any Way You Look At It: Semantic Crossview Localization and Mapping with LiDAR

Mar 16, 2022
Ian D. Miller, Anthony Cowley, Ravi Konkimalla, Shreyas S. Shivakumar, Ty Nguyen, Trey Smith, Camillo Jose Taylor, Vijay Kumar

Currently, GPS is by far the most popular global localization method. However, it is not always reliable or accurate in all environments. SLAM methods enable local state estimation but provide no means of registering the local map to a global one, which can be important for inter-robot collaboration or human interaction. In this work, we present a real-time method for utilizing semantics to globally localize a robot using only egocentric 3D semantically labelled LiDAR and IMU as well as top-down RGB images obtained from satellites or aerial robots. Additionally, as it runs, our method builds a globally registered, semantic map of the environment. We validate our method on KITTI as well as our own challenging datasets, and show better than 10 meter accuracy, a high degree of robustness, and the ability to estimate the scale of a top-down map on the fly if it is initially unknown.

* Published in IEEE Robotics and Automation Letters, vol. 6, no. 2, pp. 2397-2404, April 2021, and presented at the 2021 IEEE International Conference on Robotics and Automation. See https://www.youtube.com/watch?v=_qwAoYK9iGU for the accompanying video
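
As a rough illustration of the crossview matching idea described above (a hedged sketch, not the authors' code), the snippet below scores a candidate pose by projecting a semantically labelled scan top-down and checking class agreement with an aerial semantic map. The class encoding, map resolution, and the surrounding pose search (e.g. a particle filter) are all assumptions.

```python
# Minimal sketch: score a candidate pose against a top-down semantic map.
import numpy as np

def score_pose(points_xy, labels, semantic_map, pose, resolution=0.5):
    """Fraction of scan points whose class matches the aerial map at a candidate pose."""
    x, y, theta = pose
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s], [s, c]])
    world_xy = points_xy @ R.T + np.array([x, y])      # scan frame -> map frame (metric)
    ij = np.floor(world_xy / resolution).astype(int)   # metric -> grid cells
    h, w = semantic_map.shape
    valid = (ij[:, 0] >= 0) & (ij[:, 0] < w) & (ij[:, 1] >= 0) & (ij[:, 1] < h)
    if not valid.any():
        return 0.0
    hits = semantic_map[ij[valid, 1], ij[valid, 0]] == labels[valid]
    return hits.mean()

# A particle filter over (x, y, theta) could weight each particle by score_pose(...);
# adding scale as an extra state variable is one way to estimate an unknown map scale.
```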

DSOL: A Fast Direct Sparse Odometry Scheme

Mar 15, 2022
Chao Qu, Shreyas S. Shivakumar, Ian D. Miller, Camillo J. Taylor

In this paper, we describe Direct Sparse Odometry Lite (DSOL), an improved version of Direct Sparse Odometry (DSO). We propose several algorithmic and implementation enhancements which speed up computation by a significant factor (on average 5x) even on resource constrained platforms. The increase in speed allows us to process images at higher frame rates, which in turn provides better results on rapid motions. Our open-source implementation is available at https://github.com/versatran01/dsol.
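
For readers unfamiliar with direct methods, the sketch below shows the kind of photometric residual DSO-style systems minimize. It is an illustrative approximation, not the DSOL implementation; the names, inverse-depth parametrization, and single-pixel (rather than patch) residual are simplifying assumptions.

```python
# Minimal sketch of a direct photometric residual between a keyframe and the current frame.
import numpy as np

def photometric_residuals(I_ref, I_cur, pts_uv, inv_depths, K, T_cur_ref):
    """Residual = reference intensity minus intensity at the reprojected pixel."""
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    res = []
    for (u, v), idepth in zip(pts_uv, inv_depths):
        # Back-project with inverse depth, transform into the current frame, reproject.
        p_ref = np.array([(u - cx) / fx, (v - cy) / fy, 1.0]) / idepth
        p_cur = T_cur_ref[:3, :3] @ p_ref + T_cur_ref[:3, 3]
        if p_cur[2] <= 0:
            continue
        u2 = fx * p_cur[0] / p_cur[2] + cx
        v2 = fy * p_cur[1] / p_cur[2] + cy
        if 0 <= int(v2) < I_cur.shape[0] and 0 <= int(u2) < I_cur.shape[1]:
            res.append(I_ref[int(v), int(u)] - I_cur[int(v2), int(u2)])
    return np.array(res)   # a solver would minimize the (robustified) sum of squares
```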

LLOL: Low-Latency Odometry for Spinning Lidars

Oct 04, 2021
Chao Qu, Shreyas S. Shivakumar, Wenxin Liu, Camillo J. Taylor

In this paper, we present a low-latency odometry system designed for spinning lidars. Many existing lidar odometry methods wait for an entire sweep from the lidar before processing the data. This introduces a large delay between the first laser firing and its pose estimate. To reduce this latency, we treat the spinning lidar as a streaming sensor and process packets as they arrive. This effectively distributes expensive operations across time, resulting in a very fast and lightweight system with much higher throughput and lower latency. Our open-source implementation is available at https://github.com/versatran01/llol.
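
The low-latency claim rests on packet-level processing; the sketch below illustrates that producer/consumer structure in a simplified, hedged form. The queue, packet fields, and callbacks are assumptions, not the LLOL API.

```python
# Minimal sketch: process lidar packets as they arrive instead of buffering a full sweep,
# so the expensive work is spread over the rotation.
import queue
import threading

packet_queue = queue.Queue()

def driver_loop(packet_stream):
    """Runs in the driver thread: push each small azimuth slice as soon as it arrives."""
    for packet in packet_stream:
        packet_queue.put(packet)

def odometry_loop(extract_features, register_and_update_pose):
    """Runs in the odometry thread: handle one slice at a time for low latency."""
    while True:
        packet = packet_queue.get()
        features = extract_features(packet)     # only this slice, not a whole sweep
        register_and_update_pose(features)      # incremental pose update

# Wiring sketch: threading.Thread(target=driver_loop, args=(stream,)).start()
```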

Mine Tunnel Exploration using Multiple Quadrupedal Robots

Sep 20, 2019
Ian D. Miller, Fernando Cladera, Anthony Cowley, Shreyas S. Shivakumar, Elijah S. Lee, Laura Jarin-Lipschitz, Akhilesh Bhat, Neil Rodrigues, Alex Zhou, Avraham Cohen, Adarsh Kulkarni, James Laney, Camillo Jose Taylor, Vijay Kumar

Robotic exploration of underground environments is a particularly challenging problem due to communication, endurance, and traversability constraints, which necessitate high degrees of autonomy and agility. These challenges are further compounded by the need to minimize human intervention for practical applications. While legged robots have the ability to traverse extremely challenging terrain, they also introduce additional challenges for planning, estimation, and control. In this work, we describe a fully autonomous system for multi-robot mine exploration and mapping using legged quadrupeds, as well as a distributed database mesh networking system for reporting data. In addition, we show results from the DARPA Subterranean Challenge (SubT) Tunnel Circuit demonstrating localization of artifacts after traversals of hundreds of meters. To our knowledge, these experiments represent the first fully autonomous exploration of an unknown GNSS-denied environment undertaken by legged robots.

* Accompanying video: https://www.youtube.com/watch?v=jGXuOCHKC8E 

PST900: RGB-Thermal Calibration, Dataset and Segmentation Network

Sep 20, 2019
Shreyas S. Shivakumar, Neil Rodrigues, Alex Zhou, Ian D. Miller, Vijay Kumar, Camillo J. Taylor

In this work we propose long wave infrared (LWIR) imagery as a viable supporting modality for semantic segmentation using learning-based techniques. We first address the problem of RGB-thermal camera calibration by proposing a passive calibration target and procedure that is both portable and easy to use. Second, we present PST900, a dataset of 894 synchronized and calibrated RGB and thermal image pairs with per-pixel human annotations across four distinct classes from the DARPA Subterranean Challenge. Lastly, we propose a CNN architecture for fast semantic segmentation that combines both RGB and thermal imagery in a way that leverages RGB imagery independently. We compare our method against state-of-the-art approaches and show that it outperforms them on our dataset.

* 6 pages 
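
As a hedged illustration of the fusion idea (not the PST900 architecture itself), the snippet below builds a small two-stream segmentation network in which the RGB encoder can operate on its own and thermal features are concatenated before decoding. Channel widths and the class count are assumptions.

```python
# Minimal two-stream RGB + thermal segmentation sketch with concatenation fusion.
import torch
import torch.nn as nn

class RGBThermalSeg(nn.Module):
    def __init__(self, num_classes=5):  # e.g. 4 object classes + background (assumed)
        super().__init__()
        self.rgb_enc = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU())
        self.thermal_enc = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU())
        self.decoder = nn.Sequential(
            nn.Conv2d(64 + 32, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, num_classes, 1))

    def forward(self, rgb, thermal):
        # Encode each modality separately, fuse by concatenation, decode to logits.
        fused = torch.cat([self.rgb_enc(rgb), self.thermal_enc(thermal)], dim=1)
        logits = self.decoder(fused)
        # Upsample back to the input resolution.
        return nn.functional.interpolate(
            logits, size=rgb.shape[-2:], mode="bilinear", align_corners=False)

# Usage: out = RGBThermalSeg()(torch.rand(1, 3, 480, 640), torch.rand(1, 1, 480, 640))
```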

DFineNet: Ego-Motion Estimation and Depth Refinement from Sparse, Noisy Depth Input with RGB Guidance

Apr 10, 2019
Yilun Zhang, Ty Nguyen, Ian D. Miller, Shreyas S. Shivakumar, Steven Chen, Camillo J. Taylor, Vijay Kumar

Depth estimation is an important capability for autonomous vehicles to understand and reconstruct 3D environments as well as to avoid obstacles during operation. Accurate depth sensors such as LiDARs are often heavy, expensive, and can only provide sparse depth, while lighter depth sensors such as stereo cameras are noisier in comparison. We propose an end-to-end learning algorithm that is capable of using sparse, noisy input depth for refinement and depth completion. Our model also produces the camera pose as a byproduct, making it a great solution for autonomous systems. We evaluate our approach on both indoor and outdoor datasets. Empirical results show that our method performs well on the KITTI dataset (Geiger et al., 2012) when compared to other competing methods, while having superior performance in dealing with sparse, noisy input depth on the TUM dataset (Sturm et al., 2012).
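
A hedged sketch of the kind of joint objective such a network could use (not the DFineNet loss): supervise the refined depth only where sparse measurements exist, plus a photometric term driven by the predicted ego-motion. The weighting and masking are assumptions.

```python
# Minimal sketch of a combined sparse-depth + photometric training objective.
import torch

def depth_refinement_loss(pred_depth, sparse_depth, warped_rgb, target_rgb, w_photo=0.5):
    valid = sparse_depth > 0                        # sparse input defines the supervision mask
    depth_term = torch.abs(pred_depth - sparse_depth)[valid].mean()
    photo_term = torch.abs(warped_rgb - target_rgb).mean()  # from the ego-motion reprojection
    return depth_term + w_photo * photo_term
```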

Monocular Camera Based Fruit Counting and Mapping with Semantic Data Association

Mar 18, 2019
Xu Liu, Steven W. Chen, Chenhao Liu, Shreyas S. Shivakumar, Jnaneshwar Das, Camillo J. Taylor, James Underwood, Vijay Kumar

We present a cheap, lightweight, and fast fruit counting pipeline that uses a single monocular camera. Relying only on a monocular camera, our pipeline achieves counting performance on a mango dataset comparable to a state-of-the-art fruit counting system that utilizes an expensive sensor suite including LiDAR and GPS/INS. Our monocular camera pipeline begins with a fruit detection component that uses a deep neural network. It then uses semantic structure from motion (SfM) to convert these detections into fruit counts by estimating landmark locations of the fruit in 3D and using these landmarks to identify double counting scenarios. There are many benefits of developing a low-cost and lightweight fruit counting system, including applicability to agriculture in developing countries, where monetary constraints or unstructured environments necessitate cheaper hardware solutions.

* Accepted in IEEE Robotics and Automation Letters (RA-L), 8 pages 
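
To illustrate how 3D landmarks help avoid double counting (a hedged sketch, not the authors' pipeline), the snippet below greedily merges nearby fruit landmarks so that re-detections across frames are counted once. The merge radius and landmark format are assumptions.

```python
# Minimal sketch: count each fruit landmark once regardless of how many frames detect it.
import numpy as np

def count_unique_fruits(landmarks_3d, merge_radius=0.05):
    """Greedy merge of nearby 3D landmarks so re-detections are not double counted."""
    counted = []
    for p in landmarks_3d:
        if all(np.linalg.norm(p - q) > merge_radius for q in counted):
            counted.append(p)
    return len(counted)
```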

DFuseNet: Deep Fusion of RGB and Sparse Depth Information for Image Guided Dense Depth Completion

Feb 02, 2019
Shreyas S. Shivakumar, Ty Nguyen, Steven W. Chen, Camillo J. Taylor

In this paper we propose a convolutional neural network that is designed to upsample a series of sparse range measurements based on the contextual cues gleaned from a high resolution intensity image. Our approach draws inspiration from related work on super-resolution and in-painting. We propose a novel architecture that seeks to pull contextual cues separately from the intensity image and the depth features and then fuse them later in the network. We argue that this approach effectively exploits the relationship between the two modalities and produces accurate results while respecting salient image structures. We present experimental results to demonstrate that our approach is comparable with state-of-the-art methods and generalizes well across multiple datasets.

* 11 pages 
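
As a hedged, illustrative detail (not the DFuseNet code), the snippet below prepares the kind of sparse-depth input a depth branch typically consumes: an image-sized depth map that is zero away from the projected range samples, plus a validity mask. The two-channel layout is an assumption.

```python
# Minimal sketch: scatter sparse range samples into a dense 2-channel network input.
import numpy as np

def sparse_depth_channel(height, width, uv, depths):
    """uv: (N, 2) pixel coordinates; depths: (N,) metric depths."""
    depth_map = np.zeros((height, width), dtype=np.float32)
    mask = np.zeros((height, width), dtype=np.float32)
    u = np.clip(uv[:, 0].astype(int), 0, width - 1)
    v = np.clip(uv[:, 1].astype(int), 0, height - 1)
    depth_map[v, u] = depths
    mask[v, u] = 1.0
    return np.stack([depth_map, mask], axis=0)   # fed to the depth encoder branch
```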

Real Time Dense Depth Estimation by Fusing Stereo with Sparse Depth Measurements

Sep 20, 2018
Shreyas S. Shivakumar, Kartik Mohta, Bernd Pfrommer, Vijay Kumar, Camillo J. Taylor

We present an approach to depth estimation that fuses information from a stereo pair with sparse range measurements derived from a LIDAR sensor or a range camera. The goal of this work is to exploit the complementary strengths of the two sensor modalities, the accurate but sparse range measurements and the ambiguous but dense stereo information. These two sources are effectively and efficiently fused by combining ideas from anisotropic diffusion and semi-global matching. We evaluate our approach on the KITTI 2015 and Middlebury 2014 datasets, using randomly sampled ground truth range measurements as our sparse depth input. We achieve significant performance improvements with a small fraction of range measurements on both datasets. We also provide qualitative results from our platform using the PMDTec Monstar sensor. Our entire pipeline runs on an NVIDIA TX-2 platform at 5 Hz on 1280x1024 stereo images with 128 disparity levels.

* 7 pages, 5 figures, 2 tables 
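
The abstract mentions anisotropic diffusion as one ingredient; the sketch below shows a simple image-guided diffusion of sparse depth in that spirit (a hedged illustration, not the fused SGM pipeline). The conductance function, step size, and iteration count are assumptions.

```python
# Minimal sketch: propagate sparse depth while diffusing less across image edges.
import numpy as np

def diffuse_sparse_depth(depth, mask, image, iters=50, beta=10.0):
    """depth/mask: sparse measurements and their validity; image: grayscale guide in [0, 1]."""
    d = depth.copy()
    for _ in range(iters):
        for dy, dx in [(-1, 0), (1, 0), (0, -1), (0, 1)]:
            shifted_d = np.roll(d, (dy, dx), axis=(0, 1))
            shifted_i = np.roll(image, (dy, dx), axis=(0, 1))
            w = np.exp(-beta * np.abs(image - shifted_i))   # low conductance across edges
            d = d + 0.2 * w * (shifted_d - d)
        d[mask] = depth[mask]                               # keep measured values fixed
    return d
```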

The Open Vision Computer: An Integrated Sensing and Compute System for Mobile Robots

Sep 20, 2018
Morgan Quigley, Kartik Mohta, Shreyas S. Shivakumar, Michael Watterson, Yash Mulgaonkar, Mikael Arguedas, Ke Sun, Sikang Liu, Bernd Pfrommer, Vijay Kumar, Camillo J. Taylor

In this paper we describe the Open Vision Computer (OVC), which was designed to support high-speed, vision-guided autonomous drone flight. In particular, our aim was to develop a system that would be suitable for relatively small-scale flying platforms where size, weight, power consumption, and computational performance were all important considerations. This manuscript describes the primary features of our OVC system and explains how they are used to support fully autonomous indoor and outdoor exploration and navigation operations on our Falcon 250 quadrotor platform.

* 7 pages, 13 figures, conference 