Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Sebastian Scherer

SubT-MRS: A Subterranean, Multi-Robot, Multi-Spectral and Multi-Degraded Dataset for Robust SLAM

Aug 02, 2023

Shibo Zhao, Tianhao Wu, YuanJun Gao, Damanpreet Singh, Rushan Jiang, Haoxiang Sun, Jay Karhade, Ian Higgins, Chuck Whittaker, Lucas Nogueira(+10 more)

Figure 1 for SubT-MRS: A Subterranean, Multi-Robot, Multi-Spectral and Multi-Degraded Dataset for Robust SLAM

Figure 2 for SubT-MRS: A Subterranean, Multi-Robot, Multi-Spectral and Multi-Degraded Dataset for Robust SLAM

Figure 3 for SubT-MRS: A Subterranean, Multi-Robot, Multi-Spectral and Multi-Degraded Dataset for Robust SLAM

Figure 4 for SubT-MRS: A Subterranean, Multi-Robot, Multi-Spectral and Multi-Degraded Dataset for Robust SLAM

Abstract:In recent years, significant progress has been made in the field of simultaneous localization and mapping (SLAM) research. However, current state-of-the-art solutions still struggle with limited accuracy and robustness in real-world applications. One major reason is the lack of datasets that fully capture the conditions faced by robots in the wild. To address this problem, we present SubT-MRS, an extremely challenging real-world dataset designed to push the limits of SLAM and perception algorithms. SubT-MRS is a multi-modal, multi-robot dataset collected mainly from subterranean environments having multi-degraded conditions including structureless corridors, varying lighting conditions, and perceptual obscurants such as smoke and dust. Furthermore, the dataset packages information from a diverse range of time-synchronized sensors, including LiDAR, visual cameras, thermal cameras, and IMUs captured using varied vehicular motions like aerial, legged, and wheeled, to support research in sensor fusion, which is essential for achieving accurate and robust robotic perception in complex environments. To evaluate the accuracy of SLAM systems, we also provide a dense 3D model with sub-centimeter-level accuracy, as well as accurate 6DoF ground truth. Our benchmarking approach includes several state-of-the-art methods to demonstrate the challenges our datasets introduce, particularly in the case of multi-degraded environments.

Via

Access Paper or Ask Questions

AnyLoc: Towards Universal Visual Place Recognition

Aug 01, 2023

Nikhil Keetha, Avneesh Mishra, Jay Karhade, Krishna Murthy Jatavallabhula, Sebastian Scherer, Madhava Krishna, Sourav Garg

Abstract:Visual Place Recognition (VPR) is vital for robot localization. To date, the most performant VPR approaches are environment- and task-specific: while they exhibit strong performance in structured environments (predominantly urban driving), their performance degrades severely in unstructured environments, rendering most approaches brittle to robust real-world deployment. In this work, we develop a universal solution to VPR -- a technique that works across a broad range of structured and unstructured environments (urban, outdoors, indoors, aerial, underwater, and subterranean environments) without any re-training or fine-tuning. We demonstrate that general-purpose feature representations derived from off-the-shelf self-supervised models with no VPR-specific training are the right substrate upon which to build such a universal VPR solution. Combining these derived features with unsupervised feature aggregation enables our suite of methods, AnyLoc, to achieve up to 4X significantly higher performance than existing approaches. We further obtain a 6% improvement in performance by characterizing the semantic properties of these features, uncovering unique domains which encapsulate datasets from similar environments. Our detailed experiments and analysis lay a foundation for building VPR solutions that may be deployed anywhere, anytime, and across anyview. We encourage the readers to explore our project page and interactive demos: https://anyloc.github.io/.

Via

Access Paper or Ask Questions

Multi-Robot Multi-Room Exploration with Geometric Cue Extraction and Spherical Decomposition

Jul 27, 2023

Seungchan Kim, Micah Corah, John Keller, Graeme Best, Sebastian Scherer

Figure 1 for Multi-Robot Multi-Room Exploration with Geometric Cue Extraction and Spherical Decomposition

Figure 2 for Multi-Robot Multi-Room Exploration with Geometric Cue Extraction and Spherical Decomposition

Figure 3 for Multi-Robot Multi-Room Exploration with Geometric Cue Extraction and Spherical Decomposition

Figure 4 for Multi-Robot Multi-Room Exploration with Geometric Cue Extraction and Spherical Decomposition

Abstract:This work proposes an autonomous multi-robot exploration pipeline that coordinates the behaviors of robots in an indoor environment composed of multiple rooms. Contrary to simple frontier-based exploration approaches, we aim to enable robots to methodically explore and observe an unknown set of rooms in a structured building, keeping track of which rooms are already explored and sharing this information among robots to coordinate their behaviors in a distributed manner. To this end, we propose (1) a geometric cue extraction method that processes 3D map point cloud data and detects the locations of potential cues such as doors and rooms, (2) a spherical decomposition for open spaces used for target assignment. Using these two components, our pipeline effectively assigns tasks among robots, and enables a methodical exploration of rooms. We evaluate the performance of our pipeline using a team of up to 3 aerial robots, and show that our method outperforms the baseline by 36.6% in simulation and 26.4% in real-world experiments.

Via

Access Paper or Ask Questions

Pegasus Simulator: An Isaac Sim Framework for Multiple Aerial Vehicles Simulation

Jul 11, 2023

Marcelo Jacinto, João Pinto, Jay Patrikar, John Keller, Rita Cunha, Sebastian Scherer, António Pascoal

Abstract:Developing and testing novel control and motion planning algorithms for aerial vehicles can be a challenging task, with the robotics community relying more than ever on 3D simulation technologies to evaluate the performance of new algorithms in a variety of conditions and environments. In this work, we introduce the Pegasus Simulator, a modular framework implemented as an NVIDIA Isaac Sim extension that enables real-time simulation of multiple multirotor vehicles in photo-realistic environments, while providing out-of-the-box integration with the widely adopted PX4-Autopilot and ROS2 through its modular implementation and intuitive graphical user interface. To demonstrate some of its capabilities, a nonlinear controller was implemented and simulation results for two drones performing aggressive flight maneuvers are presented. Code and documentation for this framework are also provided as supplementary material.

Via

Access Paper or Ask Questions

Image-based Visual Servo Control for Aerial Manipulation Using a Fully-Actuated UAV

Jun 28, 2023

Guanqi He, Yash Jangir, Junyi Geng, Mohammadreza Mousaei, Dongwei Bai, Sebastian Scherer

Figure 1 for Image-based Visual Servo Control for Aerial Manipulation Using a Fully-Actuated UAV

Figure 2 for Image-based Visual Servo Control for Aerial Manipulation Using a Fully-Actuated UAV

Figure 3 for Image-based Visual Servo Control for Aerial Manipulation Using a Fully-Actuated UAV

Figure 4 for Image-based Visual Servo Control for Aerial Manipulation Using a Fully-Actuated UAV

Abstract:Using Unmanned Aerial Vehicles (UAVs) to perform high-altitude manipulation tasks beyond just passive visual application can reduce the time, cost, and risk of human workers. Prior research on aerial manipulation has relied on either ground truth state estimate or GPS/total station with some Simultaneous Localization and Mapping (SLAM) algorithms, which may not be practical for many applications close to infrastructure with degraded GPS signal or featureless environments. Visual servo can avoid the need to estimate robot pose. Existing works on visual servo for aerial manipulation either address solely end-effector position control or rely on precise velocity measurement and pre-defined visual visual marker with known pattern. Furthermore, most of previous work used under-actuated UAVs, resulting in complicated mechanical and hence control design for the end-effector. This paper develops an image-based visual servo control strategy for bridge maintenance using a fully-actuated UAV. The main components are (1) a visual line detection and tracking system, (2) a hybrid impedance force and motion control system. Our approach does not rely on either robot pose/velocity estimation from an external localization system or pre-defined visual markers. The complexity of the mechanical system and controller architecture is also minimized due to the fully-actuated nature. Experiments show that the system can effectively execute motion tracking and force holding using only the visual guidance for the bridge painting. To the best of our knowledge, this is one of the first studies on aerial manipulation using visual servo that is capable of achieving both motion and force control without the need of external pose/velocity information or pre-defined visual guidance.

* Accepted by 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

Via

Access Paper or Ask Questions

Learning-on-the-Drive: Self-supervised Adaptation of Visual Offroad Traversability Models

Jun 27, 2023

Eric Chen, Cherie Ho, Mukhtar Maulimov, Chen Wang, Sebastian Scherer

Figure 1 for Learning-on-the-Drive: Self-supervised Adaptation of Visual Offroad Traversability Models

Figure 2 for Learning-on-the-Drive: Self-supervised Adaptation of Visual Offroad Traversability Models

Figure 3 for Learning-on-the-Drive: Self-supervised Adaptation of Visual Offroad Traversability Models

Figure 4 for Learning-on-the-Drive: Self-supervised Adaptation of Visual Offroad Traversability Models

Abstract:Autonomous off-road driving requires understanding traversability, which refers to the suitability of a given terrain to drive over. When offroad vehicles travel at high speed ($>10m/s$), they need to reason at long-range ($50m$-$100m$) for safe and deliberate navigation. Moreover, vehicles often operate in new environments and under different weather conditions. LiDAR provides accurate estimates robust to visual appearances, however, it is often too noisy beyond 30m for fine-grained estimates due to sparse measurements. Conversely, visual-based models give dense predictions at further distances but perform poorly at all ranges when out of training distribution. To address these challenges, we present ALTER, an offroad perception module that adapts-on-the-drive to combine the best of both sensors. Our visual model continuously learns from new near-range LiDAR measurements. This self-supervised approach enables accurate long-range traversability prediction in novel environments without hand-labeling. Results on two distinct real-world offroad environments show up to 52.5% improvement in traversability estimation over LiDAR-only estimates and 38.1% improvement over non-adaptive visual baseline.

* 8 pages

Via

Access Paper or Ask Questions

Time-Optimal Path Planning in a Constant Wind for Uncrewed Aerial Vehicles using Dubins Set Classification

Jun 20, 2023

Sagar Sachdev, Brady Moon, Junbin Yuan, Sebastian Scherer

Abstract:Time-optimal path planning in high winds for a turning rate constrained UAV is a challenging problem to solve and is important for deployment and field operations. Previous works have used trochoidal path segments, which consist of straight and maximum-rate turn segments, as optimal extremal paths in uniform wind conditions. Current methods iterate over all candidate trochoidal trajectory types and choose the time-optimal one; however, this exhaustive search can be computationally slow. In this paper we present a method to decrease the computation time. We achieve this via a geometric approach to reduce the candidate trochoidal trajectory types by framing the problem in the air-relative frame and bounding the solution within a subset of candidate trajectories. This method reduces overall computation by 37.4% compared to pre-existing methods in Bang-Straight-Bang trajectories, freeing up computation for other onboard processes and can lead to significant total computational reductions when solving many trochoidal paths. When used within the framework of a global path planner, faster state expansions help find solutions faster or compute higher-quality paths. We also release our open-source codebase as a C++ package.

* 8 pages, 7 figures, Submitted to RA-L

Via

Access Paper or Ask Questions

VoxDet: Voxel Learning for Novel Instance Detection

Jun 04, 2023

Bowen Li, Jiashun Wang, Yaoyu Hu, Chen Wang, Sebastian Scherer

Abstract:Detecting unseen instances based on multi-view templates is a challenging problem due to its open-world nature. Traditional methodologies, which primarily rely on 2D representations and matching techniques, are often inadequate in handling pose variations and occlusions. To solve this, we introduce VoxDet, a pioneer 3D geometry-aware framework that fully utilizes the strong 3D voxel representation and reliable voxel matching mechanism. VoxDet first ingeniously proposes template voxel aggregation (TVA) module, effectively transforming multi-view 2D images into 3D voxel features. By leveraging associated camera poses, these features are aggregated into a compact 3D template voxel. In novel instance detection, this voxel representation demonstrates heightened resilience to occlusion and pose variations. We also discover that a 3D reconstruction objective helps to pre-train the 2D-3D mapping in TVA. Second, to quickly align with the template voxel, VoxDet incorporates a Query Voxel Matching (QVM) module. The 2D queries are first converted into their voxel representation with the learned 2D-3D mapping. We find that since the 3D voxel representations encode the geometry, we can first estimate the relative rotation and then compare the aligned voxels, leading to improved accuracy and efficiency. Exhaustive experiments are conducted on the demanding LineMod-Occlusion, YCB-video, and the newly built RoboTools benchmarks, where VoxDet outperforms various 2D baselines remarkably with 20% higher recall and faster speed. To the best of our knowledge, VoxDet is the first to incorporate implicit 3D knowledge for 2D detection tasks.

* 17 pages, 10 figures

Via

Access Paper or Ask Questions

A Simulator for Fully-Actuated UAVs

May 12, 2023

Azarakhsh Keipour, Mohammadreza Mousaei, Sebastian Scherer

Abstract:This workshop paper presents the challenges we encountered when simulating fully-actuated Unmanned Aerial Vehicles (UAVs) for our research and the solutions we developed to overcome the challenges. We describe the ARCAD simulator that has helped us rapidly implement and test different controllers ranging from Hybrid Force-Position Controllers to advanced Model Predictive Path Integrals and has allowed us to analyze the design and behavior of different fully-actuated UAVs. We used the simulator to enable real-world deployments of our fully-actuated UAV fleet for different applications. The simulator is further extended to support the physical interaction of UAVs with their environment and allow more UAV designs, such as hybrid VTOLs. The code for the simulator can be accessed from https://github.com/keipour/aircraft-simulator-matlab.

* The 2023 International Conference on Robotics and Automation (ICRA), Workshop on The Role of Robotics Simulators for Unmanned Aerial Vehicles

Via

Access Paper or Ask Questions

Learned Tree Search for Long-Horizon Social Robot Navigation in Shared Airspace

Apr 04, 2023

Ingrid Navarro, Jay Patrikar, Joao P. A. Dantas, Rohan Baijal, Ian Higgins, Sebastian Scherer, Jean Oh

Abstract:The fast-growing demand for fully autonomous aerial operations in shared spaces necessitates developing trustworthy agents that can safely and seamlessly navigate in crowded, dynamic spaces. In this work, we propose Social Robot Tree Search (SoRTS), an algorithm for the safe navigation of mobile robots in social domains. SoRTS aims to augment existing socially-aware trajectory prediction policies with a Monte Carlo Tree Search planner for improved downstream navigation of mobile robots. To evaluate the performance of our method, we choose the use case of social navigation for general aviation. To aid this evaluation, within this work, we also introduce X-PlaneROS, a high-fidelity aerial simulator, to enable more research in full-scale aerial autonomy. By conducting a user study based on the assessments of 26 FAA certified pilots, we show that SoRTS performs comparably to a competent human pilot, significantly outperforming our baseline algorithm. We further complement these results with self-play experiments in scenarios with increasing complexity.

* 8 Pages, 3 Figs, 4 Tables

Via

Access Paper or Ask Questions