Abstract:Recent advances in robotics are pushing real-world autonomy, enabling robots to perform long-term and large-scale missions. A crucial component of successful missions is the incorporation of loop closures through place recognition, which effectively mitigates accumulated pose-estimation drift. Despite computational advances, optimizing performance for real-time deployment remains challenging, especially on resource-constrained mobile robots and in multi-robot systems: conventional keyframe sampling practices in place recognition often retain redundant information or overlook relevant data, as they rely on fixed sampling intervals or operate directly in 3D space instead of the feature space. To address these concerns, we introduce the concept of sample space in place recognition and demonstrate how different sampling techniques affect the query process and overall performance. We then present a novel keyframe sampling approach for LiDAR-based place recognition that minimizes redundancy and preserves information in the hyper-dimensional descriptor space. The approach is applicable to both learning-based and handcrafted descriptors, and experimental validation across multiple datasets and descriptor frameworks demonstrates its effectiveness, showing that it can jointly minimize redundancy and preserve essential information in real time. The proposed approach maintains robust performance across datasets without requiring parameter tuning, contributing to more efficient and reliable place recognition for a wide range of robotic applications.
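To make the descriptor-space sampling idea above concrete, here is a minimal greedy sketch: a frame is retained only if its descriptor is sufficiently dissimilar from every descriptor already kept, so redundancy is measured in feature space rather than by fixed spatial or temporal intervals. This is an illustration, not the paper's exact criterion; the cosine-similarity test and the `sim_threshold` value are assumptions.

```python
import numpy as np

def sample_keyframes(descriptors, sim_threshold=0.9):
    """Greedy redundancy-minimizing keyframe sampling (illustrative sketch).

    Keeps a frame only if its descriptor's cosine similarity to every
    retained descriptor is below `sim_threshold`, i.e. sampling happens
    in the descriptor space, not at fixed intervals.
    """
    kept_idx, kept = [], []
    for i, d in enumerate(descriptors):
        d = d / (np.linalg.norm(d) + 1e-12)  # normalize for cosine similarity
        if not kept or max(np.dot(k, d) for k in kept) < sim_threshold:
            kept.append(d)
            kept_idx.append(i)
    return kept_idx

# Example: 1000 frames with hypothetical 256-D place-recognition descriptors.
frames = np.random.randn(1000, 256)
print(len(sample_keyframes(frames, sim_threshold=0.9)))
```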
Abstract:This article studies commonsense object affordance for enabling close-to-human task planning and task optimization of embodied robotic agents in urban environments. The focus is on reasoning about how to effectively identify an object's inherent utility during task execution, which in this work is enabled through the analysis of contextual relations in the sparse information of 3D scene graphs. The proposed framework develops a Computation of Expectation based on Correlation Information (CECI) model to learn probability distributions using a Graph Convolutional Network, allowing the extraction of commonsense affordance for individual members of a semantic class. The overall framework was experimentally validated in a real-world indoor environment, showcasing the ability of the method to match human common sense. For a video showcasing the experimental demonstration, please refer to the following link: https://youtu.be/BDCMVx2GiQE
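As a rough sketch of how learned per-object probability distributions could be turned into affordance decisions, the snippet below ranks individual members of semantic classes by the probability mass a trained model assigns to a target utility. The utility bins, object names, and histogram values are all hypothetical placeholders, not outputs of the actual CECI model.

```python
import numpy as np

# Hypothetical output of a trained CECI-style model: for each object node,
# a probability distribution over discrete utility (affordance) categories.
UTILITY_BINS = ["sit", "store", "illuminate", "dispose"]

def affordance_score(histogram, target_utility):
    """Probability mass the model assigns to the target utility."""
    return histogram[UTILITY_BINS.index(target_utility)]

objects = {
    "chair_3":   np.array([0.70, 0.10, 0.05, 0.15]),
    "cabinet_1": np.array([0.05, 0.80, 0.05, 0.10]),
    "lamp_2":    np.array([0.02, 0.03, 0.90, 0.05]),
}

# Rank individual members of semantic classes by how well they afford "store".
ranked = sorted(objects, key=lambda o: affordance_score(objects[o], "store"),
                reverse=True)
print(ranked)  # ['cabinet_1', ...]
```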
Abstract:Object detection and global localization play a crucial role in robotics, spanning a great spectrum of applications from autonomous cars to multi-layered 3D Scene Graphs for semantic scene understanding. This article proposes BOX3D, a novel multi-modal and lightweight scheme for localizing objects of interest by fusing information from an RGB camera and a 3D LiDAR. BOX3D is structured around a three-layered architecture, building up from the local perception of the incoming sequential sensor data to a global perception refinement that accounts for outliers and the overall consistency of each object's observations. More specifically, the first layer handles the low-level fusion of camera and LiDAR data for initial 3D bounding box extraction. The second layer transforms the 3D bounding boxes of each LiDAR scan to the world coordinate frame and applies a spatial pairing and merging mechanism to maintain the uniqueness of objects observed from different viewpoints. Finally, the third layer iteratively supervises the consistency of the results on the global map, using a point-to-voxel comparison to identify all points in the global map that belong to the object. Benchmarking results of the proposed architecture are showcased in multiple experimental trials on a public state-of-the-art large-scale dataset of urban environments.
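A minimal sketch of the second layer's spatial pairing-and-merging idea is given below: a new world-frame box is fused with an existing global box when their 3D overlap is high, otherwise it is registered as a new object. Axis-aligned boxes, union-extent merging, and the IoU threshold are simplifying assumptions, not BOX3D's exact mechanism.

```python
import numpy as np

def iou_3d(a, b):
    """3D IoU of two axis-aligned boxes, each given as (min_xyz, max_xyz)."""
    lo = np.maximum(a[0], b[0])
    hi = np.minimum(a[1], b[1])
    inter = np.prod(np.clip(hi - lo, 0, None))
    vol = lambda box: np.prod(box[1] - box[0])
    return inter / (vol(a) + vol(b) - inter + 1e-12)

def pair_and_merge(global_boxes, new_box, iou_thresh=0.3):
    """Merge `new_box` into an overlapping global box, else register it as new."""
    for i, g in enumerate(global_boxes):
        if iou_3d(g, new_box) > iou_thresh:
            # Merge by taking the union extent of both observations.
            global_boxes[i] = (np.minimum(g[0], new_box[0]),
                               np.maximum(g[1], new_box[1]))
            return global_boxes
    global_boxes.append(new_box)
    return global_boxes

# Two overlapping observations of the same object collapse into one box.
boxes = []
boxes = pair_and_merge(boxes, (np.array([0., 0., 0.]), np.array([1., 1., 1.])))
boxes = pair_and_merge(boxes, (np.array([0.2, 0.2, 0.]), np.array([1.2, 1.2, 1.])))
print(len(boxes))  # 1
```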
Abstract:In this article, we propose the novel concept of Belief Scene Graphs, which are utility-driven extensions of partial 3D scene graphs that enable efficient high-level task planning with partial information. We propose a graph-based learning methodology for the computation of belief (also referred to as expectation) on any given 3D scene graph, which is then used to strategically add new nodes (referred to as blind nodes) that are relevant to a robotic mission. We propose the method of Computation of Expectation based on Correlation Information (CECI) to reasonably approximate the real belief/expectation by learning histograms from available training data. A novel Graph Convolutional Network (GCN) model is developed to learn CECI from a repository of 3D scene graphs. As no database of 3D scene graphs exists for training the novel CECI model, we present a methodology for generating a 3D scene graph dataset based on semantically annotated real-life 3D spaces. The generated dataset is then used to train the proposed CECI model and for extensive validation of the proposed method. We establish the novel concept of Belief Scene Graphs (BSG) as a core component for integrating expectations into abstract representations. This new concept is an evolution of the classical 3D scene graph and aims to enable high-level reasoning for task planning and optimization of a variety of robotic missions. The efficacy of the overall framework has been evaluated in an object search scenario and has also been tested in a real-life experiment to emulate human common sense regarding unseen objects.
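To ground the GCN component, here is a minimal two-layer graph convolution that maps scene-graph node features to per-node histograms, using the standard symmetrically normalized propagation rule H' = A_hat H W. The layer sizes, feature dimensions, and random example graph are assumptions for illustration; this is not the paper's actual CECI architecture.

```python
import torch
import torch.nn as nn

class TinyGCN(nn.Module):
    """Two-layer graph convolution mapping node features to histograms."""
    def __init__(self, in_dim, hidden_dim, n_bins):
        super().__init__()
        self.w1 = nn.Linear(in_dim, hidden_dim)
        self.w2 = nn.Linear(hidden_dim, n_bins)

    def forward(self, x, adj):
        a_hat = adj + torch.eye(adj.size(0))             # add self-loops
        d = a_hat.sum(dim=1).pow(-0.5)
        a_hat = d.unsqueeze(1) * a_hat * d.unsqueeze(0)  # D^-1/2 (A+I) D^-1/2
        h = torch.relu(a_hat @ self.w1(x))
        return torch.softmax(a_hat @ self.w2(h), dim=-1)  # per-node histograms

# Example: 5 scene-graph nodes, 16-D semantic features, 8 histogram bins.
x = torch.randn(5, 16)
adj = torch.bernoulli(torch.full((5, 5), 0.3))
adj = torch.triu(adj, 1); adj = adj + adj.T              # symmetric 0/1 adjacency
print(TinyGCN(16, 32, 8)(x, adj).shape)                  # torch.Size([5, 8])
```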
Abstract:In this article, we propose a novel navigation framework that leverages a two-layered graph representation of the environment for efficient large-scale exploration, while integrating a novel uncertainty-awareness scheme to handle dynamic scene changes in previously explored areas. The framework is structured around a goal-oriented graph representation that consists of i) a local sub-graph layer and ii) a global graph layer. The local sub-graphs encode local volumetric-gain locations as frontiers based on direct point cloud visibility, allowing fast graph building and path planning. Additionally, the global graph is built efficiently, using node-edge information exchange only on overlapping regions of sequential sub-graphs. Unlike state-of-the-art graph-based exploration methods, the proposed approach efficiently re-uses sub-graphs built in previous iterations to construct the global navigation layer. Another merit of the proposed scheme is its ability to handle scene changes (e.g., blocked pathways) by adaptively updating the obstructed part of the global graph from traversable to non-traversable. This operation involves sampling the space around a path segment in the global graph layer, while removing the respective edges from the connected nodes of the global graph in case of obstructions. As a result, the exploration behavior directs the robot to follow an alternative route in the global re-positioning phase through pathway updates in the global graph. Finally, we showcase the performance of the method both in simulation and in a real-world deployment with a legged robot carrying a camera and a LiDAR sensor.
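The traversability update can be sketched very simply: once a pathway is detected as blocked, the corresponding edge is dropped from the global graph, and any subsequent shortest-path query routes around it. The toy graph and node names below are illustrative; the actual framework operates on its own goal-oriented graph representation.

```python
import networkx as nx

def mark_obstructed(global_graph, u, v):
    """Flip an edge from traversable to non-traversable by removing it."""
    if global_graph.has_edge(u, v):
        global_graph.remove_edge(u, v)

G = nx.Graph()
G.add_edges_from([("a", "b"), ("b", "c"), ("a", "d"), ("d", "c")])
print(nx.shortest_path(G, "a", "c"))   # one route, e.g. ['a', 'b', 'c']
mark_obstructed(G, "b", "c")           # a blocked pathway is detected
print(nx.shortest_path(G, "a", "c"))   # re-route: ['a', 'd', 'c']
```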
Abstract:In this article, we propose a novel LiDAR and event-camera fusion modality for subterranean (SubT) environments, enabling fast and precise object and human detection in a wide variety of adverse lighting conditions, such as low or no light, high-contrast zones, and the presence of blinding light sources. In the proposed approach, information from the event camera and the LiDAR is fused to localize a human or an object of interest in the robot's local frame. The local detection is then transformed into the inertial frame and used to set references for a Nonlinear Model Predictive Controller (NMPC) for reactive tracking of humans or objects in SubT environments. The proposed fusion uses intensity filtering and K-means clustering on the LiDAR point cloud, and frequency filtering and connectivity clustering on the events induced in the event camera by the returning LiDAR beams. The centroids of the clusters in the event-camera and LiDAR streams are then paired to localize reflective markers present on safety vests and signs in SubT environments. The efficacy of the proposed scheme has been experimentally validated in a real SubT environment (a mine) with a Pioneer 3AT mobile robot. The experimental results show real-time performance for human detection, and the NMPC-based controller allows for reactive tracking of a human or object of interest, even in complete darkness.
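A rough sketch of the clustering-and-pairing step is given below: cluster each filtered stream, then greedily pair LiDAR centroids with their nearest event-stream centroids under a gating distance. It assumes both centroid sets are already expressed in a common (e.g. robot-local) frame; the cluster count `k` and `gate` value are placeholders, and the actual pipeline's intensity/frequency filtering is abstracted away.

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_centroids(points, k):
    """Centroids of k clusters (stand-in for the filtered LiDAR/event streams)."""
    return KMeans(n_clusters=k, n_init=10).fit(points).cluster_centers_

def pair_centroids(lidar_c, event_c, gate=0.5):
    """Greedy nearest-neighbor pairing of LiDAR and event-camera centroids,
    assuming both sets live in a common frame."""
    pairs = []
    for lc in lidar_c:
        d = np.linalg.norm(event_c - lc, axis=1)
        j = int(np.argmin(d))
        if d[j] < gate:                 # reject pairings beyond the gate
            pairs.append((lc, event_c[j]))
    return pairs

# Example with synthetic 3D points standing in for two reflective markers.
pts = np.vstack([np.random.randn(50, 3) * 0.05 + c
                 for c in ([1, 0, 0], [0, 2, 0])])
lidar_c = cluster_centroids(pts, k=2)
event_c = lidar_c + np.random.randn(2, 3) * 0.02  # noisy event-side estimates
print(len(pair_centroids(lidar_c, event_c)))       # 2
```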
Abstract:In this paper, we study the multi-robot task assignment and path-finding problem (MRTAPF), where a number of agents are required to visit all given goal locations while avoiding collisions with each other. We propose SA-reCBS, a novel two-layer algorithm that cascades simulated annealing and conflict-based search to solve this problem. Compared to other approaches in the field of MRTAPF, the advantage of SA-reCBS is that, without requiring goals to be pre-bundled into as many groups as there are robots, it enables a subset of the agents to visit all goals along collision-free paths. We test the algorithm in various simulation instances and compare it with state-of-the-art algorithms. The results show that SA-reCBS performs better, with a higher success rate, lower computational time, and better objective values.
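The two-layer cascade can be illustrated by the outer loop alone: simulated annealing perturbs which robot serves each goal and scores every candidate assignment with a low-level collision-free path solver, abstracted here as a `cbs_cost` callback. The Metropolis acceptance rule and cooling schedule below are the textbook versions and stand in for, rather than reproduce, SA-reCBS.

```python
import math
import random

def anneal_assignment(goals, n_robots, cbs_cost, iters=500, t0=1.0, cooling=0.995):
    """Outer SA loop: perturb goal-to-robot assignments, score each with CBS."""
    assign = {g: random.randrange(n_robots) for g in goals}
    best = cur = cbs_cost(assign)
    best_assign, t = dict(assign), t0
    for _ in range(iters):
        g = random.choice(goals)            # move one goal to another robot
        old = assign[g]
        assign[g] = random.randrange(n_robots)
        cand = cbs_cost(assign)
        if cand < cur or random.random() < math.exp((cur - cand) / t):
            cur = cand                      # accept (Metropolis rule)
            if cand < best:
                best, best_assign = cand, dict(assign)
        else:
            assign[g] = old                 # reject: undo the move
        t *= cooling
    return best_assign, best

# Toy stand-in for CBS: penalize unbalanced assignments (no real paths solved).
goals = list(range(8))
cost = lambda a: max(list(a.values()).count(r) for r in range(3))
print(anneal_assignment(goals, 3, cost))
```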
Abstract:Autonomous navigation of robots in harsh, GPS-denied subterranean (SubT) environments with poor or no natural illumination is a challenging task that fosters the development of algorithms for pose estimation and mapping. Inspired by the need for real-life deployment of autonomous robots in such environments, this article presents an experimental comparative study of 3D SLAM algorithms. The study focuses on state-of-the-art LiDAR SLAM algorithms with open-source implementations that are either i) LiDAR-only, like BLAM, LOAM, A-LOAM, ISC-LOAM and hdl_graph_slam, or ii) LiDAR-inertial, like LeGO-LOAM, Cartographer, LIO-mapping and LIO-SAM. The evaluation of the methods is performed on a dataset collected by a Boston Dynamics Spot robot equipped with a Velodyne Puck Lite 3D LiDAR and a VectorNav VN-100 IMU during a mission in an underground tunnel. In the evaluation process, the poses and 3D tunnel reconstructions from the SLAM algorithms are compared against each other to identify the methods with the most solid performance in terms of pose accuracy and map quality.
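One common way to quantify pose accuracy when comparing SLAM trajectories is the absolute trajectory error (ATE), the RMSE of translational differences between time-associated poses. The sketch below assumes the two trajectories are already time-associated and expressed in a common frame; the abstract does not state that this is the metric the study used, so treat it as an illustrative convention.

```python
import numpy as np

def ate_rmse(est_xyz, ref_xyz):
    """Absolute trajectory error: RMSE of translational differences between
    time-associated estimated and reference positions (pre-aligned frames)."""
    err = est_xyz - ref_xyz
    return float(np.sqrt(np.mean(np.sum(err**2, axis=1))))

# Example: compare two trajectories sampled at matching timestamps.
ref = np.cumsum(np.random.randn(100, 3) * 0.1, axis=0)
est = ref + np.random.randn(100, 3) * 0.05
print(f"ATE RMSE: {ate_rmse(est, ref):.3f} m")
```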
Abstract:Mapping and exploration of Martian terrain with an aerial vehicle has become an emerging research direction since the successful flight demonstration of the Mars helicopter Ingenuity. Although the autonomy and navigation capabilities of the state-of-the-art Mars helicopter have proven efficient in open environments, the next areas of interest for exploration on Mars are caves and ancient lava-tube-like environments, especially in the never-ending search for life on other planets. This article presents an autonomous exploration mission based on a modified frontier approach, along with risk-aware planning and an integrated collision-avoidance scheme, with a special focus on the energy aspects of a custom-designed Mars Coaxial Quadrotor (MCQ) in a simulated Martian lava tube. One of the biggest novelties of the article stems from addressing the exploration capability of rapidly exploring local areas while intelligently re-positioning the MCQ globally when reaching dead ends, in order to use the battery-based consumed energy efficiently while increasing the explored volume. The proposed three-layer cost-based global re-position point selection assists in rapidly redirecting the MCQ to previously partially seen areas that could lead to more unexplored parts of the lava tube. The fully simulated Martian mission presented in this article takes into consideration the fidelity of the physics of Mars conditions in terms of the planet's thin atmosphere, low surface pressure and low gravity, and proves the efficiency of the proposed scheme in exploring an area that is particularly challenging due to its subterranean-like environment. The proposed exploration-planning framework is also validated in simulation by comparing it against a graph-based exploration planner.
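A cost-based re-position point selection of this kind can be sketched as a weighted trade-off between travel distance, expected exploration gain at a partially seen candidate, and the energy to reach it. The three terms and their weights below are illustrative assumptions; the article's actual three-layer cost formulation is not reproduced here.

```python
import numpy as np

def select_reposition_point(candidates, robot_pos,
                            w_dist=1.0, w_gain=2.0, w_energy=0.5):
    """Pick the cheapest candidate re-position point.

    Each candidate is (position, expected_gain, energy_to_reach); travel
    distance and energy use are penalized, expected unexplored volume
    that a candidate may open up is rewarded.
    """
    best, best_cost = None, np.inf
    for pos, gain, energy in candidates:
        cost = (w_dist * np.linalg.norm(np.asarray(pos) - robot_pos)
                - w_gain * gain
                + w_energy * energy)
        if cost < best_cost:
            best, best_cost = pos, cost
    return best

cands = [((10.0, 2.0, 1.0), 80.0, 5.0), ((4.0, -1.0, 1.0), 30.0, 2.0)]
print(select_reposition_point(cands, robot_pos=np.zeros(3)))
```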
Abstract:Exploration and mapping of unknown environments is a fundamental task in applications for autonomous robots. In this article, we present a complete framework for deploying MAVs in autonomous exploration missions in unknown subterranean areas. The main motive of exploration algorithms is to determine the next best frontier for the robot such that new ground can be covered in a fast, safe, yet efficient manner. The proposed framework uses a novel frontier selection method that also contributes to the safe navigation of autonomous robots in obstructed areas such as subterranean caves, mines, and urban areas. The framework presented in this work bifurcates the exploration problem into local and global exploration. The proposed exploration framework is also adaptable to the computational resources available onboard the robot, meaning that a trade-off can be made between the speed of exploration and the quality of the map. This capability allows the proposed framework to be deployed in subterranean exploration and mapping as well as in fast search-and-rescue scenarios. The overall system is considered a low-complexity, baseline solution for navigation and object localization in tunnel-like environments. The performance of the proposed framework is evaluated in detailed simulation studies, with comparisons against a high-level exploration-planning framework developed for the DARPA Sub-T challenge, as presented in this article.
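For readers unfamiliar with frontier-based exploration, the minimal sketch below shows the core primitive on a 2D occupancy grid: frontier cells are free cells bordering unknown space, and the simplest selection rule drives the robot to the nearest one. The grid encoding and the nearest-frontier rule are textbook simplifications, not the framework's own frontier selection method.

```python
import numpy as np

FREE, OCC, UNKNOWN = 0, 1, -1

def find_frontiers(grid):
    """Frontier cells: free cells with at least one unknown 4-neighbor."""
    frontiers = []
    rows, cols = grid.shape
    for r in range(rows):
        for c in range(cols):
            if grid[r, c] != FREE:
                continue
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and grid[nr, nc] == UNKNOWN:
                    frontiers.append((r, c))
                    break
    return frontiers

def nearest_frontier(frontiers, robot_rc):
    """Simplest selection rule: the frontier closest to the robot."""
    return min(frontiers,
               key=lambda f: (f[0] - robot_rc[0])**2 + (f[1] - robot_rc[1])**2)

grid = np.full((6, 6), UNKNOWN)
grid[2:4, 2:4] = FREE          # a small explored free patch
grid[3, 4] = OCC               # one known obstacle
print(nearest_frontier(find_frontiers(grid), robot_rc=(2, 2)))
```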