Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Sebastian Scherer

Carnegie Mellon Team Tartan: Mission-level Robustness with Rapidly Deployed Autonomous Aerial Vehicles in the MBZIRC 2020

Jul 03, 2021

Anish Bhattacharya, Akshit Gandhi, Lukas Merkle, Rohan Tiwari, Karun Warrior, Stanley Winata, Andrew Saba, Kevin Zhang, Oliver Kroemer, Sebastian Scherer

Figure 1 for Carnegie Mellon Team Tartan: Mission-level Robustness with Rapidly Deployed Autonomous Aerial Vehicles in the MBZIRC 2020

Figure 2 for Carnegie Mellon Team Tartan: Mission-level Robustness with Rapidly Deployed Autonomous Aerial Vehicles in the MBZIRC 2020

Figure 3 for Carnegie Mellon Team Tartan: Mission-level Robustness with Rapidly Deployed Autonomous Aerial Vehicles in the MBZIRC 2020

Figure 4 for Carnegie Mellon Team Tartan: Mission-level Robustness with Rapidly Deployed Autonomous Aerial Vehicles in the MBZIRC 2020

Abstract:For robotics systems to be used in high risk, real-world situations, they have to be quickly deployable and robust to environmental changes, under-performing hardware, and mission subtask failures. Robots are often designed to consider a single sequence of mission events, with complex algorithms lowering individual subtask failure rates under some critical constraints. Our approach is to leverage common techniques in vision and control and encode robustness into mission structure through outcome monitoring and recovery strategies, aided by a system infrastructure that allows for quick mission deployments under tight time constraints and no central communication. We also detail lessons in rapid field robotics development and testing. Systems were developed and evaluated through real-robot experiments at an outdoor test site in Pittsburgh, Pennsylvania, USA, as well as in the 2020 Mohamed Bin Zayed International Robotics Challenge. All competition trials were completed in fully autonomous mode without RTK-GPS. Our system led to 4th place in Challenge 2 and 7th place in the Grand Challenge, and achievements like popping five balloons (Challenge 1), successfully picking and placing a block (Challenge 2), and dispensing the most water autonomously with a UAV of all teams onto an outdoor, real fire (Challenge 3).

* 28 pages, 26 figures. To appear in Field Robotics, Special Issues on MBZIRC 2020

Via

Access Paper or Ask Questions

3D Segmentation Learning from Sparse Annotations and Hierarchical Descriptors

Jun 06, 2021

Peng Yin, Lingyun Xu, Jianmin Ji, Sebastian Scherer, Howie Choset

Figure 1 for 3D Segmentation Learning from Sparse Annotations and Hierarchical Descriptors

Figure 2 for 3D Segmentation Learning from Sparse Annotations and Hierarchical Descriptors

Figure 3 for 3D Segmentation Learning from Sparse Annotations and Hierarchical Descriptors

Figure 4 for 3D Segmentation Learning from Sparse Annotations and Hierarchical Descriptors

Abstract:One of the main obstacles to 3D semantic segmentation is the significant amount of endeavor required to generate expensive point-wise annotations for fully supervised training. To alleviate manual efforts, we propose GIDSeg, a novel approach that can simultaneously learn segmentation from sparse annotations via reasoning global-regional structures and individual-vicinal properties. GIDSeg depicts global- and individual- relation via a dynamic edge convolution network coupled with a kernelized identity descriptor. The ensemble effects are obtained by endowing a fine-grained receptive field to a low-resolution voxelized map. In our GIDSeg, an adversarial learning module is also designed to further enhance the conditional constraint of identity descriptors within the joint feature distribution. Despite the apparent simplicity, our proposed approach achieves superior performance over state-of-the-art for inferencing 3D dense segmentation with only sparse annotations. Particularly, with $5\%$ annotations of raw data, GIDSeg outperforms other 3D segmentation methods.

* 8 pages, 7 figures, Accepted in IEEE Robotics and Automation Letters, 2021

Via

Access Paper or Ask Questions

i3dLoc: Image-to-range Cross-domain Localization Robust to Inconsistent Environmental Conditions

Jun 06, 2021

Peng Yin, Lingyun Xu, Ji Zhang, Howie Choset, Sebastian Scherer

Figure 1 for i3dLoc: Image-to-range Cross-domain Localization Robust to Inconsistent Environmental Conditions

Figure 2 for i3dLoc: Image-to-range Cross-domain Localization Robust to Inconsistent Environmental Conditions

Figure 3 for i3dLoc: Image-to-range Cross-domain Localization Robust to Inconsistent Environmental Conditions

Figure 4 for i3dLoc: Image-to-range Cross-domain Localization Robust to Inconsistent Environmental Conditions

Abstract:We present a method for localizing a single camera with respect to a point cloud map in indoor and outdoor scenes. The problem is challenging because correspondences of local invariant features are inconsistent across the domains between image and 3D. The problem is even more challenging as the method must handle various environmental conditions such as illumination, weather, and seasonal changes. Our method can match equirectangular images to the 3D range projections by extracting cross-domain symmetric place descriptors. Our key insight is to retain condition-invariant 3D geometry features from limited data samples while eliminating the condition-related features by a designed Generative Adversarial Network. Based on such features, we further design a spherical convolution network to learn viewpoint-invariant symmetric place descriptors. We evaluate our method on extensive self-collected datasets, which involve \textit{Long-term} (variant appearance conditions), \textit{Large-scale} (up to $2km$ structure/unstructured environment), and \textit{Multistory} (four-floor confined space). Our method surpasses other current state-of-the-arts by achieving around $3$ times higher place retrievals to inconsistent environments, and above $3$ times accuracy on online localization. To highlight our method's generalization capabilities, we also evaluate the recognition across different datasets. With a single trained model, i3dLoc can demonstrate reliable visual localization in random conditions.

* Robotics: Science and Systems 2021
* 8 Pages, 8 Figures, Accepted Robotics: Science and Systems 2021 paper

Via

Access Paper or Ask Questions

CVaR-based Flight Energy Risk Assessment for Multirotor UAVs using a Deep Energy Model

May 31, 2021

Arnav Choudhry, Brady Moon, Jay Patrikar, Constantine Samaras, Sebastian Scherer

Figure 1 for CVaR-based Flight Energy Risk Assessment for Multirotor UAVs using a Deep Energy Model

Figure 2 for CVaR-based Flight Energy Risk Assessment for Multirotor UAVs using a Deep Energy Model

Figure 3 for CVaR-based Flight Energy Risk Assessment for Multirotor UAVs using a Deep Energy Model

Figure 4 for CVaR-based Flight Energy Risk Assessment for Multirotor UAVs using a Deep Energy Model

Abstract:Energy management is a critical aspect of risk assessment for Uncrewed Aerial Vehicle (UAV) flights, as a depleted battery during a flight brings almost guaranteed vehicle damage and a high risk of human injuries or property damage. Predicting the amount of energy a flight will consume is challenging as routing, weather, obstacles, and other factors affect the overall consumption. We develop a deep energy model for a UAV that uses Temporal Convolutional Networks to capture the time varying features while incorporating static contextual information. Our energy model is trained on a real world dataset and does not require segregating flights into regimes. We illustrate an improvement in power predictions by $29\%$ on test flights when compared to a state-of-the-art analytical method. Using the energy model, we can predict the energy usage for a given trajectory and evaluate the risk of running out of battery during flight. We propose using Conditional Value-at-Risk (CVaR) as a metric for quantifying this risk. We show that CVaR captures the risk associated with worst-case energy consumption on a nominal path by transforming the output distribution of Monte Carlo forward simulations into a risk space. Computing the CVaR on the risk-space distribution provides a metric that can evaluate the overall risk of a flight before take-off. Our energy model and risk evaluation method can improve flight safety and evaluate the coverage area from a proposed takeoff location. The video and codebase are available at https://youtu.be/PHXGigqilOA and https://git.io/cvar-risk .

* 7 pages, 8 figures, Submitted ICRA 2021

Via

Access Paper or Ask Questions

VDB-EDT: An Efficient Euclidean Distance Transform Algorithm Based on VDB Data Structure

May 10, 2021

Delong Zhu, Chaoqun Wang, Wenshan Wang, Rohit Garg, Sebastian Scherer, Max Q. -H. Meng

Figure 1 for VDB-EDT: An Efficient Euclidean Distance Transform Algorithm Based on VDB Data Structure

Figure 2 for VDB-EDT: An Efficient Euclidean Distance Transform Algorithm Based on VDB Data Structure

Figure 3 for VDB-EDT: An Efficient Euclidean Distance Transform Algorithm Based on VDB Data Structure

Figure 4 for VDB-EDT: An Efficient Euclidean Distance Transform Algorithm Based on VDB Data Structure

Abstract:This paper presents a fundamental algorithm, called VDB-EDT, for Euclidean distance transform (EDT) based on the VDB data structure. The algorithm executes on grid maps and generates the corresponding distance field for recording distance information against obstacles, which forms the basis of numerous motion planning algorithms. The contributions of this work mainly lie in three folds. Firstly, we propose a novel algorithm that can facilitate distance transform procedures by optimizing the scheduling priorities of transform functions, which significantly improves the running speed of conventional EDT algorithms. Secondly, we for the first time introduce the memory-efficient VDB data structure, a customed B+ tree, to represent the distance field hierarchically. Benefiting from the special index and caching mechanism, VDB shows a fast (average \textit{O}(1)) random access speed, and thus is very suitable for the frequent neighbor-searching operations in EDT. Moreover, regarding the small scale of existing datasets, we release a large-scale dataset captured from subterranean environments to benchmark EDT algorithms. Extensive experiments on the released dataset and publicly available datasets show that VDB-EDT can reduce memory consumption by about 30%-85%, depending on the sparsity of the environment, while maintaining a competitive running speed with the fastest array-based implementation. The experiments also show that VDB-EDT can significantly outperform the state-of-the-art EDT algorithm in both runtime and memory efficiency, which strongly demonstrates the advantages of our proposed method. The released dataset and source code are available on https://github.com/zhudelong/VDB-EDT.

Via

Access Paper or Ask Questions

Super Odometry: IMU-centric LiDAR-Visual-Inertial Estimator for Challenging Environments

Apr 30, 2021

Shibo Zhao, Hengrui Zhang, Peng Wang, Lucas Nogueira, Sebastian Scherer

Figure 1 for Super Odometry: IMU-centric LiDAR-Visual-Inertial Estimator for Challenging Environments

Figure 2 for Super Odometry: IMU-centric LiDAR-Visual-Inertial Estimator for Challenging Environments

Figure 3 for Super Odometry: IMU-centric LiDAR-Visual-Inertial Estimator for Challenging Environments

Figure 4 for Super Odometry: IMU-centric LiDAR-Visual-Inertial Estimator for Challenging Environments

Abstract:We propose Super Odometry, a high-precision multi-modal sensor fusion framework, providing a simple but effective way to fuse multiple sensors such as LiDAR, camera, and IMU sensors and achieve robust state estimation in perceptually-degraded environments. Different from traditional sensor-fusion methods, Super Odometry employs an IMU-centric data processing pipeline, which combines the advantages of loosely coupled methods with tightly coupled methods and recovers motion in a coarse-to-fine manner. The proposed framework is composed of three parts: IMU odometry, visual-inertial odometry, and laser-inertial odometry. The visual-inertial odometry and laser-inertial odometry provide the pose prior to constrain the IMU bias and receive the motion prediction from IMU odometry. To ensure high performance in real-time, we apply a dynamic octree that only consumes 10 % of the running time compared with a static KD-tree. The proposed system was deployed on drones and ground robots, as part of Team Explorer's effort to the DARPA Subterranean Challenge where the team won $1^{st}$ and $2^{nd}$ place in the Tunnel and Urban Circuits, respectively.

Via

Access Paper or Ask Questions

Visual Servoing Approach for Autonomous UAV Landing on a Moving Vehicle

Apr 02, 2021

Azarakhsh Keipour, Guilherme A. S. Pereira, Rogerio Bonatti, Rohit Garg, Puru Rastogi, Geetesh Dubey, Sebastian Scherer

Figure 1 for Visual Servoing Approach for Autonomous UAV Landing on a Moving Vehicle

Figure 2 for Visual Servoing Approach for Autonomous UAV Landing on a Moving Vehicle

Figure 3 for Visual Servoing Approach for Autonomous UAV Landing on a Moving Vehicle

Figure 4 for Visual Servoing Approach for Autonomous UAV Landing on a Moving Vehicle

Abstract:We present a method to autonomously land an Unmanned Aerial Vehicle on a moving vehicle with a circular (or elliptical) pattern on the top. A visual servoing controller approaches the ground vehicle using velocity commands calculated directly in image space. The control laws generate velocity commands in all three dimensions, eliminating the need for a separate height controller. The method has shown the ability to approach and land on the moving deck in simulation, indoor and outdoor environments, and compared to the other available methods, it has provided the fastest landing approach. It does not rely on additional external setup, such as RTK, motion capture system, ground station, offboard processing, or communication with the vehicle, and it requires only a minimal set of hardware and localization sensors. The videos and source codes can be accessed from http://theairlab.org/landing-on-vehicle.

* 24 pages

Via

Access Paper or Ask Questions

Graph-Based Topological Exploration Planning in Large-Scale 3D Environments

Mar 31, 2021

Fan Yang, Dung-Han Lee, John Keller, Sebastian Scherer

Figure 1 for Graph-Based Topological Exploration Planning in Large-Scale 3D Environments

Figure 2 for Graph-Based Topological Exploration Planning in Large-Scale 3D Environments

Figure 3 for Graph-Based Topological Exploration Planning in Large-Scale 3D Environments

Figure 4 for Graph-Based Topological Exploration Planning in Large-Scale 3D Environments

Abstract:Currently, state-of-the-art exploration methods maintain high-resolution map representations in order to optimize exploration goals in each step that maximizes information gain. However, during exploring, those "optimal" selections could quickly become obsolete due to the influx of new information, especially in large-scale environments, and result in high-frequency re-planning that hinders the overall exploration efficiency. In this paper, we propose a graph-based topological planning framework, building a sparse topological map in three-dimensional (3D) space to guide exploration steps with high-level intents so as to render consistent exploration maneuvers. Specifically, this work presents a novel method to estimate 3D space's geometry with convex polyhedrons. Then, the geometry information is utilized to group space into distinctive regions. And those regions are added as nodes into the topological map, directing the exploration process. We compared our method with the state-of-the-art in simulated environments. The proposed method achieves higher space coverage and outperforms exploration efficiency by more than 40% during experiments. Finally, a field experiment was conducted to further evaluate the applicability of our method to empower efficient and robust exploration in real-world environments.

* Preprint version for ICRA2021 final submission

Via

Access Paper or Ask Questions

In-flight positional and energy use data set of a DJI Matrice 100 quadcopter for small package delivery

Mar 24, 2021

Thiago A. Rodrigues, Jay Patrikar, Arnav Choudhry, Jacob Feldgoise, Vaibhav Arcot, Aradhana Gahlaut, Sophia Lau, Brady Moon, Bastian Wagner, H. Scott Matthews(+2 more)

Figure 1 for In-flight positional and energy use data set of a DJI Matrice 100 quadcopter for small package delivery

Figure 2 for In-flight positional and energy use data set of a DJI Matrice 100 quadcopter for small package delivery

Figure 3 for In-flight positional and energy use data set of a DJI Matrice 100 quadcopter for small package delivery

Figure 4 for In-flight positional and energy use data set of a DJI Matrice 100 quadcopter for small package delivery

Abstract:We autonomously direct a small quadcopter package delivery Uncrewed Aerial Vehicle (UAV) or "drone" to take off, fly a specified route, and land for a total of 209 flights while varying a set of operational parameters. The vehicle was equipped with onboard sensors, including GPS, IMU, voltage and current sensors, and an ultrasonic anemometer, to collect high-resolution data on the inertial states, wind speed, and power consumption. Operational parameters, such as commanded ground speed, payload, and cruise altitude, are varied for each flight. This large data set has a total flight time of 10 hours and 45 minutes and was collected from April to October of 2019 covering a total distance of approximately 65 kilometers. The data collected were validated by comparing flights with similar operational parameters. We believe these data will be of great interest to the research and industrial communities, who can use the data to improve UAV designs, safety, and energy efficiency, as well as advance the physical understanding of in-flight operations for package delivery drones.

* 13 pages, 11 figures, submitted to Scientific Data

Via

Access Paper or Ask Questions

ORStereo: Occlusion-Aware Recurrent Stereo Matching for 4K-Resolution Images

Mar 13, 2021

Yaoyu Hu, Wenshan Wang, Huai Yu, Weikun Zhen, Sebastian Scherer

Figure 1 for ORStereo: Occlusion-Aware Recurrent Stereo Matching for 4K-Resolution Images

Figure 2 for ORStereo: Occlusion-Aware Recurrent Stereo Matching for 4K-Resolution Images

Figure 3 for ORStereo: Occlusion-Aware Recurrent Stereo Matching for 4K-Resolution Images

Figure 4 for ORStereo: Occlusion-Aware Recurrent Stereo Matching for 4K-Resolution Images

Abstract:Stereo reconstruction models trained on small images do not generalize well to high-resolution data. Training a model on high-resolution image size faces difficulties of data availability and is often infeasible due to limited computing resources. In this work, we present the Occlusion-aware Recurrent binocular Stereo matching (ORStereo), which deals with these issues by only training on available low disparity range stereo images. ORStereo generalizes to unseen high-resolution images with large disparity ranges by formulating the task as residual updates and refinements of an initial prediction. ORStereo is trained on images with disparity ranges limited to 256 pixels, yet it can operate 4K-resolution input with over 1000 disparities using limited GPU memory. We test the model's capability on both synthetic and real-world high-resolution images. Experimental results demonstrate that ORStereo achieves comparable performance on 4K-resolution images compared to state-of-the-art methods trained on large disparity ranges. Compared to other methods that are only trained on low-resolution images, our method is 70% more accurate on 4K-resolution images.

* Submitted to International Conference on Intelligent Robots and Systems (IROS) 2021

Via

Access Paper or Ask Questions