Sören Schwertfeger

Robust Lifelong Indoor LiDAR Localization using the Area Graph

Aug 10, 2023
Fujing Xie, Sören Schwertfeger

Lifelong indoor localization in a given map is the basis for the navigation of autonomous mobile robots. In this letter, we address the problem of robust localization in cluttered indoor environments, such as office spaces and corridors, using 3D LiDAR point clouds in a given Area Graph: a hierarchical, topometric, semantic map representation that uses polygons to demarcate areas such as rooms, corridors, or buildings. This representation is very compact, can represent different floors of a building through its hierarchy, and provides semantic information that helps with localization, such as the poses of doors and glass. Commonly used map representations, such as occupancy grid maps or point clouds, lack these features and require frequent updates in response to environmental changes (e.g. moved furniture), whereas our approach matches against lifelong architectural features such as walls and doors. To achieve this, we filter clutter from the 3D input point cloud and then employ further scoring and weighting functions for localization. Given a broad initial guess from WiFi localization, our experiments show that our global localization and our weighted point-to-line ICP pose tracking perform very well, even compared to localization and SLAM algorithms that use the current, feature-rich cluttered map for localization.
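
A minimal sketch of one Gauss-Newton iteration of weighted point-to-line ICP against wall segments, assuming 2D scan points after clutter filtering; the weighting scheme and all names here are illustrative, not the authors' implementation:

    import numpy as np

    def closest_segment(p, segments):
        # Return the foot point and unit normal of the nearest wall segment.
        best_q, best_n, best_d = None, None, np.inf
        for a, b in segments:                      # a, b: segment endpoints (2,)
            ab = b - a
            t = np.clip(np.dot(p - a, ab) / np.dot(ab, ab), 0.0, 1.0)
            q = a + t * ab
            d = np.linalg.norm(p - q)
            if d < best_d:
                n = np.array([-ab[1], ab[0]]) / np.linalg.norm(ab)
                best_q, best_n, best_d = q, n, d
        return best_q, best_n

    def weighted_p2l_icp_step(points, weights, segments, pose):
        # One Gauss-Newton step; pose = (x, y, theta) of the robot in the map.
        x, y, th = pose
        c, s = np.cos(th), np.sin(th)
        R = np.array([[c, -s], [s, c]])
        H, g = np.zeros((3, 3)), np.zeros(3)
        for p, w in zip(points, weights):          # w is low near glass/doors
            pw = R @ p + np.array([x, y])          # scan point in map frame
            q, n = closest_segment(pw, segments)
            r = n @ (pw - q)                       # signed point-to-line distance
            dRp = np.array([-s * p[0] - c * p[1], c * p[0] - s * p[1]])
            J = np.array([n[0], n[1], n @ dRp])    # d r / d (x, y, theta)
            H += w * np.outer(J, J)
            g += w * r * J
        dx = np.linalg.solve(H, -g)                # weighted normal equations
        return x + dx[0], y + dx[1], th + dx[2]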

Optimizing the extended Fourier Mellin Transformation Algorithm

Jul 19, 2023
Wenqing Jiang, Chengqian Li, Jinyue Cao, Sören Schwertfeger

With the increasing application of robots, stable and efficient Visual Odometry (VO) algorithms are becoming increasingly important. Based on the Fourier Mellin Transformation (FMT) algorithm, the extended Fourier Mellin Transformation (eFMT) is an image registration approach that can be applied to downward-looking cameras, for example on aerial and underwater vehicles. eFMT extends FMT to multi-depth scenes and thus to more application scenarios. It is a visual odometry method that estimates the pose transformation between three overlapping images. On this basis, we develop an optimized eFMT algorithm that improves certain aspects of the method and combines it with back-end optimization over the small loop of three consecutive frames. To this end, we investigate the extraction of uncertainty information from the eFMT registration, the related objective function, and the graph-based optimization. Finally, we design a series of experiments to investigate the properties of this approach and compare it with other VO and SLAM (Simultaneous Localization and Mapping) algorithms. The results show the superior accuracy and speed of our o-eFMT approach, which is published as open source.
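
For background, FMT-style registration resolves rotation and scale on a log-polar resampling of the Fourier magnitude spectra and then recovers translation by phase correlation. Below is a minimal NumPy sketch of the translation step only (not the o-eFMT code); the correlation peak height is one plausible handle for the uncertainty information mentioned above:

    import numpy as np

    def phase_correlation(img_a, img_b):
        # Estimate integer-pixel translation via the Fourier shift theorem.
        Fa, Fb = np.fft.fft2(img_a), np.fft.fft2(img_b)
        cross = Fa * np.conj(Fb)
        cross /= np.abs(cross) + 1e-12             # keep only the phase
        corr = np.real(np.fft.ifft2(cross))
        dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
        # Unwrap shifts larger than half the image size.
        if dy > img_a.shape[0] // 2:
            dy -= img_a.shape[0]
        if dx > img_a.shape[1] // 2:
            dx -= img_a.shape[1]
        return dx, dy, corr.max()                  # peak height ~ confidence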

* 8 pages, 8 figures 

The SLAM Hive Benchmarking Suite

Mar 21, 2023
Yuanyuan Yang, Bowen Xu, Yinjie Li, Sören Schwertfeger

Benchmarking Simultaneous Localization and Mapping (SLAM) algorithms is important to scientists and users of robotic systems alike. But through their many configuration options in hardware and software, SLAM systems feature a vast parameter space that scientists have so far been unable to explore. The proposed SLAM Hive Benchmarking Suite is able to analyze SLAM algorithms across thousands of mapping runs by utilizing container technology and deployment in a cluster. This paper presents the architecture and open-source implementation of SLAM Hive and compares it to existing efforts on SLAM evaluation. Furthermore, we demonstrate SLAM Hive by evaluating several open-source algorithms on public datasets in terms of accuracy. We compare the algorithms against each other and evaluate how parameters affect not only accuracy but also CPU and memory usage. Through this we show that SLAM Hive can become an essential tool for proper comparison and evaluation of SLAM algorithms and thus drive scientific development in SLAM research.
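
Conceptually, the suite maps a declared parameter grid onto isolated containerized mapping runs. The sketch below illustrates that idea only; the image name, flags, and parameters are hypothetical, not SLAM Hive's actual interface:

    import itertools
    import subprocess

    # Hypothetical parameter grid for some SLAM algorithm under test.
    grid = {
        "resolution": [0.05, 0.1, 0.2],
        "scan_rate": [10, 20],
        "loop_closure": [True, False],
    }

    def configurations(grid):
        # Expand the grid into every parameter combination.
        keys = list(grid)
        for values in itertools.product(*grid.values()):
            yield dict(zip(keys, values))

    for params in configurations(grid):
        args = ["--%s=%s" % (k, v) for k, v in params.items()]
        # One isolated container per configuration; a cluster scheduler
        # can fan these out into thousands of mapping runs.
        subprocess.run(["docker", "run", "--rm", "slam-under-test:latest",
                        "--dataset=/data/sequence01", *args], check=False)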

* 7 pages, 3 figures, IEEE Conference on Robotics and Automation (ICRA) 2023 

CP+: Camera Poses Augmentation with Large-scale LiDAR Maps

Feb 27, 2023
Jiadi Cui, Sören Schwertfeger

Large-scale colored point clouds have many advantages in navigation and scene display. Relying on cameras and LiDARs, which are now widely used in reconstruction tasks, it is possible to obtain such colored point clouds. However, many existing frameworks do not fuse the information from these two kinds of sensors well, resulting in inaccurate camera poses and thus poor point colorization. We propose a novel framework called Camera Pose Augmentation (CP+) to improve the camera poses and align them directly with the LiDAR-based point cloud. Initial coarse camera poses are given by LiDAR-Inertial or LiDAR-Inertial-Visual Odometry with approximate extrinsic parameters and time synchronization. The key steps to improve the alignment of the images consist of selecting a point cloud corresponding to the region of interest in each camera view, extracting reliable edge features from this point cloud, and deriving 2D-3D line correspondences, which are used for the iterative minimization of the re-projection error.
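
The final refinement step can be sketched as a robust least-squares problem over the 2D-3D line correspondences: project LiDAR edge points into the image and penalize their distance to the matched 2D line. A minimal SciPy illustration, with all names and the Huber scale being assumptions rather than the CP+ implementation:

    import numpy as np
    from scipy.optimize import least_squares
    from scipy.spatial.transform import Rotation

    def residuals(pose6, K, pts3d, lines2d):
        # pose6 = (rotation vector, translation); lines2d rows are (a, b, c)
        # with a^2 + b^2 = 1, so a*u + b*v + c is the point-to-line distance.
        R = Rotation.from_rotvec(pose6[:3]).as_matrix()
        Xc = pts3d @ R.T + pose6[3:]               # LiDAR edge points, camera frame
        uv = Xc @ K.T
        uv = uv[:, :2] / uv[:, 2:3]                # pinhole projection
        return lines2d[:, 0] * uv[:, 0] + lines2d[:, 1] * uv[:, 1] + lines2d[:, 2]

    def refine_pose(pose0, K, pts3d, lines2d):
        # Iteratively minimize the re-projection error with a robust loss.
        result = least_squares(residuals, pose0, args=(K, pts3d, lines2d),
                               loss="huber", f_scale=2.0)
        return result.x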

Cluster on Wheels

May 19, 2022
Yuanyuan Yang, Delin Feng, Sören Schwertfeger

This paper presents a very compact 16-node cluster that is the core of a future robot for collecting and storing massive amounts of sensor data for research on Simultaneous Localization and Mapping (SLAM). To the best of our knowledge, this is the first time such a cluster has been used in robotics. We first present the computing requirements and design options for such a robot and then describe the hardware and software of our solution in detail. The cluster consists of 16 nodes with AMD Ryzen 7 5700U CPUs, for a total of 128 cores. As a system to be used on a Clearpath Husky robot, it is very small in size, can be operated from battery power, and has all required power and networking components integrated. Stress tests on the completed cluster show that it performs well.

* 2022 International Conference for Advancement in Technology (ICONAT), 2022, pp. 1-8  
* 8 pages, 7 figures, 2022 International Conference for Advancement in Technology (ICONAT). Describes the cluster computing platform of the mapping robot

Hierarchical Topometric Representation of 3D Robotic Maps

Nov 24, 2021
Zhenpeng He, Hao Sun, Jiawei Hou, Yajun Ha, Sören Schwertfeger

In this paper, we propose a method for generating a hierarchical, volumetric topological map from 3D point clouds. Our map has three basic hierarchical levels: $storey - region - volume$. The advantages of our method are reflected in both input and output. In terms of input, we accept multi-storey point clouds and building structures with sloping roofs or ceilings. In terms of output, we can generate results with metric information of different dimensionality that are suitable for different robotics applications. The algorithm generates the volumetric representation by extracting $volumes$ from a 3D voxel occupancy map. We then add $passages$ (connections between $volumes$), combine small $volumes$ into larger $regions$, and use a 2D segmentation method for a better topological representation. We evaluate our method on several freely available datasets. The experiments highlight the advantages of our approach.
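
The volume-generation step can be illustrated as connected-component growing over the free voxels of the occupancy map; a minimal sketch under that assumption (the paper's actual pipeline is more involved):

    import numpy as np
    from collections import deque

    def grow_volumes(free):
        # Label 6-connected components of free voxels; 'free' is a 3D bool array.
        labels = np.zeros(free.shape, dtype=int)
        next_label = 0
        neighbors = [(1, 0, 0), (-1, 0, 0), (0, 1, 0),
                     (0, -1, 0), (0, 0, 1), (0, 0, -1)]
        for seed in zip(*np.nonzero(free)):
            if labels[seed]:
                continue                           # already part of a volume
            next_label += 1
            labels[seed] = next_label
            queue = deque([seed])
            while queue:
                x, y, z = queue.popleft()
                for dx, dy, dz in neighbors:
                    n = (x + dx, y + dy, z + dz)
                    if all(0 <= n[i] < free.shape[i] for i in range(3)) \
                            and free[n] and not labels[n]:
                        labels[n] = next_label
                        queue.append(n)
        return labels  # passages arise where distinct labels adjoin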

* Autonomous Robots (2021): 1-17  

Video Contrastive Learning with Global Context

Aug 05, 2021
Haofei Kuang, Yi Zhu, Zhi Zhang, Xinyu Li, Joseph Tighe, Sören Schwertfeger, Cyrill Stachniss, Mu Li

Contrastive learning has revolutionized the field of self-supervised image representation learning and has recently been adapted to the video domain. One of the greatest advantages of contrastive learning is that it allows us to flexibly define powerful loss objectives, as long as we can find a reasonable way to formulate positive and negative samples to contrast. However, existing approaches rely heavily on short-range spatiotemporal salience to form clip-level contrastive signals and thus fail to use global context. In this paper, we propose a new video-level contrastive learning method based on segments to formulate positive pairs. Our formulation is able to capture the global context of a video and is thus robust to temporal content change. We also incorporate a temporal order regularization term to enforce the inherent sequential structure of videos. Extensive experiments show that our video-level contrastive learning framework (VCLR) outperforms the previous state of the art on five video datasets for downstream action classification, action localization, and video retrieval. Code is available at https://github.com/amazon-research/video-contrastive-learning.
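
The core idea, sampling one clip per temporal segment so that each view spans the whole video, can be sketched as follows in PyTorch; the shapes, sampling, and InfoNCE loss here are a generic illustration, not the released VCLR code:

    import torch
    import torch.nn.functional as F

    def two_segment_views(video, num_segments=4):
        # video: (T, C, H, W) with T >= num_segments. Sampling one frame per
        # segment twice yields two views that both cover the global context.
        T = video.shape[0]
        bounds = torch.linspace(0, T, num_segments + 1).long()
        def sample():
            idx = [int(torch.randint(int(bounds[i]), int(bounds[i + 1]), (1,)))
                   for i in range(num_segments)]
            return video[idx]                      # (num_segments, C, H, W)
        return sample(), sample()

    def info_nce(z1, z2, tau=0.07):
        # Video-level embeddings of the two views; positives on the diagonal.
        z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
        logits = z1 @ z2.t() / tau
        targets = torch.arange(z1.shape[0], device=z1.device)
        return F.cross_entropy(logits, targets)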

* Code is publicly available at: https://github.com/amazon-research/video-contrastive-learning 

Improved Visual-Inertial Localization for Low-cost Rescue Robots

Nov 17, 2020
Xiaoling Long, Qingwen Xu, Yijun Yuan, Zhenpeng He, Sören Schwertfeger

This paper improves visual-inertial systems to boost localization accuracy for low-cost rescue robots. When robots traverse rugged terrain, the performance of pose estimation suffers from large noise in the measurements of the inertial sensors due to ground contact forces, especially for low-cost sensors. Therefore, we propose a Threshold-based and a Dynamic Time Warping-based method to detect abnormal measurements and mitigate such faults. The two methods are embedded into the popular VINS-Mono system to evaluate their performance. Experiments are performed on simulation and real robot data, which show that both methods increase the pose estimation accuracy. Moreover, the Threshold-based method performs better when the noise is small, while the Dynamic Time Warping-based one shows greater potential under large noise.
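
Both detectors are simple to state. A minimal sketch, where the threshold value and the use of a clean reference template are labeled assumptions (this is not the VINS-Mono integration):

    import numpy as np

    def threshold_faults(accel, limit=30.0):
        # Flag IMU samples whose acceleration magnitude exceeds a fixed
        # limit (value illustrative); flagged samples are down-weighted.
        return np.linalg.norm(accel, axis=1) > limit

    def dtw_distance(a, b):
        # Classic O(len(a)*len(b)) dynamic time warping between a window of
        # IMU readings 'a' and a clean reference template 'b'; a large
        # distance marks the window as abnormal.
        n, m = len(a), len(b)
        D = np.full((n + 1, m + 1), np.inf)
        D[0, 0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                cost = abs(a[i - 1] - b[j - 1])
                D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
        return D[n, m]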

* accepted by IFAC World Congress 2020 