In robot localisation and mapping, outliers are unavoidable when loop-closure measurements are taken into account. A single false-positive loop-closure can have a very negative impact on SLAM problems causing an inferior trajectory to be produced or even for the optimisation to fail entirely. To address this issue, popular existing approaches define a hard switch for each loop-closure constraint. This paper presents AEROS, a novel approach to adaptively solve a robust least-squares minimisation problem by adding just a single extra latent parameter. It can be used in the back-end component of the SLAM problem to enable generalised robust cost minimisation by simultaneously estimating the continuous latent parameter along with the set of sensor poses in a single joint optimisation. This leads to a very closely curve fitting on the distribution of the residuals, thereby reducing the effect of outliers. Additionally, we formulate the robust optimisation problem using standard Gaussian factors so that it can be solved by direct application of popular incremental estimation approaches such as iSAM. Experimental results on publicly available synthetic datasets and real LiDAR-SLAM datasets collected from the 2D and 3D LiDAR systems show the competitiveness of our approach with the state-of-the-art techniques and its superiority on real world scenarios.
Retrieval-based place recognition is an efficient and effective solution for enabling re-localization within a pre-built map or global data association for Simultaneous Localization and Mapping (SLAM). The accuracy of such an approach is heavily dependent on the quality of the extracted scene-level representation. While end-to-end solutions, which learn a global descriptor from input point clouds, have demonstrated promising results, such approaches are limited in their ability to enforce desirable properties at the local feature level. In this paper, we demonstrate that the inclusion of an additional training signal (local consistency loss) can guide the network to learning local features which are consistent across revisits, hence leading to more repeatable global descriptors resulting in an overall improvement in place recognition performance. We formulate our approach in an end-to-end trainable architecture called LoGG3D-Net. Experiments on two large-scale public benchmarks (KITTI and MulRan) show that our method achieves mean $F1_{max}$ scores of $0.939$ and $0.968$ on KITTI and MulRan, respectively while operating in near real-time.
While mobile LiDAR sensors are increasingly used to scan in ecology and forestry applications, reconstruction and characterisation are typically carried out offline (to the best of our knowledge). Motivated by this, we present an online LiDAR system which can run on a handheld device to segment and track individual trees and identify them in a fixed coordinate system. Segments relating to each tree are accumulated over time, and tree models are completed as more scans are captured from different perspectives. Using this reconstruction we then fit a cylinder model to each tree trunk by solving a least-squares optimisation over the points to estimate the Diameter at Breast Height (DBH) of the trees. Experimental results demonstrate that our system can estimate DBH to within $\sim$7 cm accuracy for 90% of individual trees in a forest (Wytham Woods, Oxford)
This paper presents novel strategies for spawning and fusing submaps within an elastic dense 3D reconstruction system. The proposed system uses spatial understanding of the scanned environment to control memory usage growth by fusing overlapping submaps in different ways. This allows the number of submaps and memory consumption to scale with the size of the environment rather than the duration of exploration. By analysing spatial overlap, our system segments distinct spaces, such as rooms and stairwells on the fly during exploration. Additionally, we present a new mathematical formulation of relative uncertainty between poses to improve the global consistency of the reconstruction. Performance is demonstrated using a multi-floor multi-room indoor experiment, a large-scale outdoor experiment and a simulated dataset. Relative to our baseline, the presented approach demonstrates improved scalability and accuracy.
In dynamic and cramped industrial environments, achieving reliable Visual Teach and Repeat (VT&R) with a single-camera is challenging. In this work, we develop a robust method for non-synchronized multi-camera VT&R. Our contribution are expected Camera Performance Models (CPM) which evaluate the camera streams from the teach step to determine the most informative one for localization during the repeat step. By actively selecting the most suitable camera for localization, we are able to successfully complete missions when one of the cameras is occluded, faces into feature poor locations or if the environment has changed. Furthermore, we explore the specific challenges of achieving VT&R on a dynamic quadruped robot, ANYmal. The camera does not follow a linear path (due to the walking gait and holonomicity) such that precise path-following cannot be achieved. Our experiments feature forward and backward facing stereo cameras showing VT&R performance in cluttered indoor and outdoor scenarios. We compared the trajectories the robot executed during the repeat steps demonstrating typical tracking precision of less than 10cm on average. With a view towards omni-directional localization, we show how the approach generalizes to four cameras in simulation. Video: https://youtu.be/iAY0lyjAnqY
We present an efficient, elastic 3D LiDAR reconstruction framework which can reconstruct up to maximum LiDAR ranges (60 m) at multiple frames per second, thus enabling robot exploration in large-scale environments. Our approach only requires a CPU. We focus on three main challenges of large-scale reconstruction: integration of long-range LiDAR scans at high frequency, the capacity to deform the reconstruction after loop closures are detected, and scalability for long-duration exploration. Our system extends upon a state-of-the-art efficient RGB-D volumetric reconstruction technique, called supereight, to support LiDAR scans and a newly developed submapping technique to allow for dynamic correction of the 3D reconstruction. We then introduce a novel pose graph sparsification and submap fusion feature to make our system more scalable for large environments. We evaluate the performance using a published dataset captured by a handheld mapping device scanning a set of buildings, and with a mobile robot exploring an underground room network. Experimental results demonstrate that our system can reconstruct at 3 Hz with 60 m sensor range and ~5 cm resolution, while state-of-the-art approaches can only reconstruct to 25 cm resolution or 20 m range at the same frequency.
Fingerprinting is a popular indoor localization technique since it can utilize existing infrastructures (e.g., access points). However, its site survey process is a labor-intensive and time-consuming task, which limits the application of such systems in practice. In this paper, motivated by the availability of advanced sensing capabilities in smartphones, we propose a fast and reliable fingerprint collection method to reduce the time and labor required for site survey. The proposed method uses a landmark graph-based method to automatically associate the collected fingerprints, which does not require active user participation. We will show that besides fast fingerprint data collection, the proposed method results in accurate location estimate compared to the state-of-the-art methods. Experimental results show that the proposed method is an order of magnitude faster than the manual fingerprint collection method, and using the radio map generated by our method achieves a much better accuracy compared to the existing methods.
In this paper we present a large dataset with a variety of mobile mapping sensors collected using a handheld device carried at typical walking speeds for nearly 2.2 km through New College, Oxford. The dataset includes data from two commercially available devices - a stereoscopic-inertial camera and a multi-beam 3D LiDAR, which also provides inertial measurements. Additionally, we used a tripod-mounted survey grade LiDAR scanner to capture a detailed millimeter-accurate 3D map of the test location (containing $\sim$290 million points). Using the map we inferred centimeter-accurate 6 Degree of Freedom (DoF) ground truth for the position of the device for each LiDAR scan to enable better evaluation of LiDAR and vision localisation, mapping and reconstruction systems. This ground truth is the particular novel contribution of this dataset and we believe that it will enable systematic evaluation which many similar datasets have lacked. The dataset combines both built environments, open spaces and vegetated areas so as to test localization and mapping systems such as vision-based navigation, visual and LiDAR SLAM, 3D LIDAR reconstruction and appearance-based place recognition. The dataset is available at: ori.ox.ac.uk/datasets/newer-college-dataset
In this paper, we develop an online active mapping system to enable a quadruped robot to autonomously survey large physical structures. We describe the perception, planning and control modules needed to scan and reconstruct an object of interest, without requiring a prior model. The system builds a voxel representation of the object, and iteratively determines the Next-Best-View (NBV) to extend the representation, according to both the reconstruction itself and to avoid collisions with the environment. By computing the expected information gain of a set of candidate scan locations sampled on the as-sensed terrain map, as well as the cost of reaching these candidates, the robot decides the NBV for further exploration. The robot plans an optimal path towards the NBV, avoiding obstacles and un-traversable terrain. Experimental results on both simulated and real-world environments show the capability and efficiency of our system. Finally we present a full system demonstration on the real robot, the ANYbotics ANYmal, autonomously reconstructing a building facade and an industrial structure.
In this paper, we present a factor-graph LiDAR-SLAM system which incorporates a state-of-the-art deeply learned feature-based loop closure detector to enable a legged robot to localize and map in industrial environments. These facilities can be badly lit and comprised of indistinct metallic structures, thus our system uses only LiDAR sensing and was developed to run on the quadruped robot's navigation PC. Point clouds are accumulated using an inertial-kinematic state estimator before being aligned using ICP registration. To close loops we use a loop proposal mechanism which matches individual segments between clouds. We trained a descriptor offline to match these segments. The efficiency of our method comes from carefully designing the network architecture to minimize the number of parameters such that this deep learning method can be deployed in real-time using only the CPU of a legged robot, a major contribution of this work. The set of odometry and loop closure factors are updated using pose graph optimization. Finally we present an efficient risk alignment prediction method which verifies the reliability of the registrations. Experimental results at an industrial facility demonstrated the robustness and flexibility of our system, including autonomous following paths derived from the SLAM map.