Abstract: LiDAR Place Recognition (LPR) is a key component in robotic localization, enabling robots to align current scans with prior maps of their environment. While Visual Place Recognition (VPR) has embraced Vision Foundation Models (VFMs) to enhance descriptor robustness, LPR has relied on task-specific models that make limited use of pre-trained, foundation-level knowledge. This is due to the lack of 3D foundation models and the challenges of applying VFMs to LiDAR point clouds. To tackle this, we introduce ImLPR, a novel pipeline that employs a pre-trained DINOv2 VFM to generate rich descriptors for LPR. To our knowledge, ImLPR is the first method to leverage a VFM for LPR. ImLPR converts raw point clouds into Range Image Views (RIV) to apply the VFM in the LiDAR domain, and it employs MultiConv adapters and a Patch-InfoNCE loss for effective feature learning. We validate ImLPR on public datasets, where it outperforms state-of-the-art (SOTA) methods in intra-session and inter-session LPR with top Recall@1 and F1 scores across various LiDARs. We also demonstrate that RIV outperforms the Bird's-Eye-View (BEV) as a representation for adapting LiDAR data to VFMs. We release ImLPR as open source for the robotics community.
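The core of this pipeline is the spherical projection that turns a raw scan into a range image a VFM can consume. The sketch below shows one minimal version of such a projection, assuming a 64x1024 image and a symmetric vertical FOV; ImLPR's actual resolution, channels, and normalization may differ.

```python
# Minimal Range Image View (RIV) projection sketch. The 64x1024 resolution and
# +/-22.5 deg vertical FOV are assumptions, not ImLPR's exact configuration.
import numpy as np

def pointcloud_to_riv(points, height=64, width=1024, fov_up=22.5, fov_down=-22.5):
    """Project an (N, 3) LiDAR point cloud into a range image via spherical projection."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.linalg.norm(points[:, :3], axis=1) + 1e-8

    yaw = np.arctan2(y, x)                        # azimuth in [-pi, pi]
    pitch = np.arcsin(np.clip(z / r, -1.0, 1.0))  # elevation angle

    fov_up_rad, fov_down_rad = np.radians(fov_up), np.radians(fov_down)
    u = ((1.0 - (yaw + np.pi) / (2.0 * np.pi)) * width).astype(np.int32) % width
    v = (fov_up_rad - pitch) / (fov_up_rad - fov_down_rad) * height
    v = np.clip(v, 0, height - 1).astype(np.int32)

    riv = np.zeros((height, width), dtype=np.float32)
    # Write farther points first so the closest return per pixel wins.
    order = np.argsort(-r)
    riv[v[order], u[order]] = r[order]
    return riv
```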
Abstract: Large-scale construction and demolition significantly challenge long-term place recognition (PR) by drastically reshaping urban and suburban environments. Existing datasets predominantly reflect limited or indoor-focused changes and fail to adequately represent extensive outdoor transformations. To bridge this gap, we introduce the City that Never Settles (CNS) dataset, a simulation-based dataset created with the CARLA simulator that captures major structural changes, such as building construction and demolition, across diverse maps and sequences. Additionally, we propose TCR_sym, a symmetric version of the original TCR metric, which measures structural change consistently irrespective of source-target ordering. Quantitative comparisons demonstrate that CNS encompasses more extensive transformations than current real-world benchmarks. Evaluations of state-of-the-art LiDAR-based PR methods on CNS reveal substantial performance degradation, underscoring the need for robust algorithms capable of handling significant environmental changes. Our dataset is available at https://github.com/Hyunho111/CNS_dataset.
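The abstract does not spell out how TCR itself is computed, so the snippet below only illustrates the symmetrization idea behind TCR_sym: evaluate the directional metric in both orderings and combine the two values. Here `tcr` is a placeholder for the original metric, and averaging is an assumed combination rule.

```python
# Illustration of symmetrizing a directional change metric; `tcr` is a
# placeholder and the 0.5-averaging rule is an assumption, not the paper's
# exact definition of TCR_sym.
def tcr_sym(source_map, target_map, tcr):
    """Symmetric change measure: identical for (source, target) and (target, source)."""
    return 0.5 * (tcr(source_map, target_map) + tcr(target_map, source_map))
```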
Abstract: Recently, gravity has been highlighted as a crucial constraint for state estimation, as it alleviates potential vertical drift. Existing online gravity estimation methods rely on pose estimation combined with IMU measurements, which is considered best practice when direct velocity measurements are unavailable. However, radar sensors provide direct velocity data, a measurement not yet utilized for gravity estimation, and we found that exploiting it can substantially improve gravity estimation accuracy. GaRLIO, the proposed gravity-enhanced Radar-LiDAR-Inertial Odometry, robustly predicts gravity to reduce vertical drift while simultaneously enhancing state estimation performance using pointwise velocity measurements. Furthermore, GaRLIO ensures robustness in dynamic environments by utilizing radar to remove dynamic objects from LiDAR point clouds. Our method is validated through experiments in various environments prone to vertical drift, demonstrating superior performance compared to traditional LiDAR-Inertial Odometry methods. To encourage further research and development, our source code is publicly available at https://github.com/ChiyunNoh/GaRLIO
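One way pointwise radar velocities support the dynamic-object removal mentioned above is a Doppler consistency check: for a static point, the measured radial velocity should match the projection of the negated ego velocity onto the ray direction. The sketch below illustrates only this generic idea; the threshold and frame conventions are assumptions rather than GaRLIO's exact formulation.

```python
# Hedged sketch of Doppler-based dynamic-point flagging; the 0.5 m/s threshold
# and sensor-frame assumptions are illustrative, not GaRLIO's actual parameters.
import numpy as np

def flag_dynamic_points(radar_points, radial_velocities, ego_velocity, threshold=0.5):
    """Return a boolean mask, True where the measured Doppler velocity is
    inconsistent with the ego motion (likely a dynamic object)."""
    directions = radar_points / (np.linalg.norm(radar_points, axis=1, keepdims=True) + 1e-8)
    expected = -directions @ ego_velocity  # radial velocity a static point would exhibit
    return np.abs(radial_velocities - expected) > threshold
```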
Abstract: Recently, radars have been widely featured in robotics for their robustness in challenging weather conditions. Two commonly used radar types are spinning radars and phased-array radars, each offering distinct sensor characteristics. Existing datasets typically feature only a single type of radar, leading to algorithms limited to that specific kind. In this work, we highlight that combining different radar types offers complementary advantages, which can be leveraged through a heterogeneous radar dataset. Moreover, the new dataset fosters research in multi-session and multi-robot scenarios where robots are equipped with different types of radars. In this context, we introduce the HeRCULES dataset, a comprehensive multi-modal dataset with heterogeneous radars, FMCW LiDAR, IMU, GPS, and cameras. It is the first dataset to integrate 4D radar and spinning radar alongside FMCW LiDAR, offering unparalleled localization, mapping, and place recognition capabilities. The dataset covers diverse weather and lighting conditions and a range of urban traffic scenarios, enabling comprehensive analysis across various environments. Sequence paths with multiple revisits and ground-truth poses for each sensor enhance its suitability for place recognition research. We expect the HeRCULES dataset to facilitate research on odometry, mapping, place recognition, and sensor fusion. The dataset and development tools are available at https://sites.google.com/view/herculesdataset.
Abstract: LiDAR place recognition is a crucial module in localization that matches the current location with previously observed environments. Most existing approaches focus predominantly on spinning-type LiDAR, exploiting its large FOV for matching. However, with the recent emergence of various LiDAR types, matching data across different LiDAR types has grown significantly in importance, a challenge that has been largely overlooked for many years. To address these challenges, we introduce HeLiOS, a deep network tailored for heterogeneous LiDAR place recognition, which uses small local windows with spherical transformers and optimal-transport-based cluster assignment to build robust global descriptors. Our overlap-based data mining and guided-triplet loss overcome the limitations of traditional distance-based mining and discrete class constraints. HeLiOS is validated on public datasets, demonstrating its performance in heterogeneous LiDAR place recognition, including an evaluation of long-term recognition that showcases its ability to handle unseen LiDAR types. We release the HeLiOS code as open source for the robotics community at https://github.com/minwoo0611/HeLiOS.
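The guided-triplet loss itself is not defined in the abstract; as a point of reference, the sketch below shows the plain triplet-margin objective over global descriptors that such guided variants typically extend, with an assumed margin value.

```python
# Standard triplet-margin objective over descriptor vectors; the 0.3 margin is
# an assumption, and HeLiOS's guided-triplet loss adds further structure on top.
import numpy as np

def triplet_margin_loss(anchor, positive, negative, margin=0.3):
    """Encourage the anchor descriptor to lie closer to the positive than to the
    negative by at least `margin`."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(0.0, d_pos - d_neg + margin)
```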
Abstract: Maritime environmental sensing requires overcoming challenges from complex conditions such as harsh weather, platform perturbations, large dynamic objects, and the need for long detection ranges. While cameras and LiDAR are commonly used in ground vehicle navigation, their applicability in maritime settings is limited by range constraints and hardware maintenance issues. Radar sensors, however, offer robust long-range detection and resilience to physical contamination from weather and saline conditions, making them powerful sensors for maritime navigation. Among various radar types, X-band radar (e.g., marine radar) is widely employed for maritime vessel navigation, providing the effective long-range detection essential for situational awareness and collision avoidance. Nevertheless, it exhibits limitations during berthing operations, where close-range object detection is critical. To address this shortcoming, we incorporate W-band radar (e.g., Navtech imaging radar), which excels at detecting nearby objects with a higher update rate. We present a comprehensive maritime sensor dataset featuring multi-range detection capabilities, integrating short-range LiDAR data, medium-range W-band radar data, and long-range X-band radar data into a unified framework. Additionally, it includes object labels for oceanic object detection, derived from radar and stereo camera images. The dataset comprises seven sequences collected from diverse regions with varying levels of estimation difficulty, ranging from easy to challenging, and includes common locations suitable for global localization tasks. This dataset serves as a valuable resource for advancing research in place recognition, odometry estimation, SLAM, object detection, and dynamic object elimination within maritime environments. The dataset can be found at the following link: https://sites.google.com/view/rpmmoana
Abstract: Accuracy evaluation of a 3D point cloud map is crucial for the development of autonomous driving systems. In this work, we propose a user-independent software/hardware system that quantitatively evaluates the accuracy of a 3D point cloud map acquired from LiDAR(-Inertial) SLAM. We introduce a LiDAR target that functions robustly in outdoor environments while remaining clearly observable by LiDAR. We also propose a software algorithm that automatically extracts representative points and calculates the accuracy of the 3D point cloud map by leveraging GPS position data. This methodology overcomes the limitation of manual point selection, whose results vary between users. Furthermore, two error metrics, relative and absolute error, are introduced to analyze the accuracy from different perspectives. Our implementations are available at https://github.com/SangwooJung98/3D_Map_Evaluation
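The precise metric definitions are given in the paper; under one plausible reading, absolute error compares each extracted target position in the map against its GPS reference, while relative error compares pairwise target-to-target distances and thus cancels any global offset. The sketch below follows that assumed reading.

```python
# Hedged sketch of the two error metrics under the assumed reading above; the
# paper's exact definitions may differ.
import numpy as np

def absolute_errors(map_targets, gps_targets):
    """Per-target Euclidean distance between map positions and GPS positions, both (N, 3)."""
    return np.linalg.norm(map_targets - gps_targets, axis=1)

def relative_errors(map_targets, gps_targets):
    """Absolute differences of pairwise target-to-target distances in the map vs. GPS."""
    errs = []
    n = len(map_targets)
    for i in range(n):
        for j in range(i + 1, n):
            d_map = np.linalg.norm(map_targets[i] - map_targets[j])
            d_gps = np.linalg.norm(gps_targets[i] - gps_targets[j])
            errs.append(abs(d_map - d_gps))
    return np.array(errs)
```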
Abstract: Place recognition, an essential challenge in computer vision and robotics, involves identifying previously visited locations. Despite algorithmic progress, challenges related to appearance change persist, with existing datasets often focusing on seasonal and weather variations but overlooking terrain changes. Understanding terrain alterations becomes critical for effective place recognition, given the aging infrastructure and ongoing city repairs. For real-world applicability, the comprehensive evaluation of algorithms must consider spatial dynamics. To address existing limitations, we present a novel multi-session place recognition dataset acquired from an active construction site. Our dataset captures ongoing construction progress through multiple data collections, facilitating evaluation in dynamic environments. It includes camera images, LiDAR point cloud data, and IMU data, enabling visual and LiDAR-based place recognition techniques, and supporting sensor fusion. Additionally, we provide ground truth information for range-based place recognition evaluation. Our dataset aims to advance place recognition algorithms in challenging and dynamic settings. Our dataset is available at https://github.com/dongjae0107/ConPR.
Abstract: Robust 3D object detection is a core challenge for autonomous mobile systems in field robotics. To tackle this issue, many researchers have demonstrated improvements in 3D object detection performance on benchmark datasets. However, real-world urban scenarios with unstructured and dynamic situations can still lead to numerous false positives, posing a challenge for robust 3D object detection models. This paper presents a post-processing algorithm that dynamically adjusts object detection thresholds based on the distance from the ego-vehicle. 3D object detection models usually perform well in detecting nearby objects but may exhibit suboptimal performance for distant ones. While conventional perception pipelines typically employ a single threshold in post-processing, the proposed algorithm instead applies distance-adaptive thresholds, minimizing false negatives and reducing false positives in urban scenarios. The results show performance enhancements for 3D object detection models across a range of scenarios, not only in dynamic urban road conditions but also under adverse weather conditions.
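A minimal sketch of the distance-adaptive thresholding idea follows; the distance bands and threshold values are illustrative assumptions, not the tuned values from the paper.

```python
# Distance-adaptive confidence filtering sketch; band boundaries and thresholds
# below are assumptions chosen only to illustrate the mechanism.
import numpy as np

def adaptive_threshold_filter(boxes, scores,
                              distance_bands=((0.0, 20.0, 0.5),
                                              (20.0, 40.0, 0.4),
                                              (40.0, np.inf, 0.3))):
    """Keep detections whose score exceeds a threshold chosen by range from the ego-vehicle.

    boxes: (N, 7) array [x, y, z, dx, dy, dz, yaw] in the ego frame; scores: (N,) confidences."""
    ranges = np.linalg.norm(boxes[:, :2], axis=1)
    keep = np.zeros(len(boxes), dtype=bool)
    for lo, hi, thr in distance_bands:
        in_band = (ranges >= lo) & (ranges < hi)
        keep |= in_band & (scores > thr)
    return boxes[keep], scores[keep]
```

Lowering the threshold for distant bands recovers low-confidence far detections (fewer false negatives), while the stricter near-range threshold suppresses spurious close-in detections (fewer false positives).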
Abstract: The integration of sensor data is crucial in robotics to take full advantage of the various sensors employed. One critical aspect of this integration is determining the extrinsic calibration parameters, such as the relative transformation, between sensors. Data fusion between complementary sensors, such as radar and LiDAR, can provide significant benefits, particularly in harsh environments where accurate depth data is required. However, the noise in radar sensor data can make estimating the extrinsic calibration challenging. To address this issue, we present a novel framework for the extrinsic calibration of radar and LiDAR sensors that uses CycleGAN as a method of image-to-image translation. Our proposed method translates radar bird's-eye-view images into LiDAR-style images to estimate the 3-DOF extrinsic parameters. Image registration techniques, together with deskewing based on sensor odometry and B-spline interpolation, are employed to address the rolling shutter effect commonly present in spinning sensors. Our method demonstrates a notable improvement in extrinsic calibration compared to filter-based methods on the MulRan dataset.
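As one concrete illustration of registering the translated radar BEV image against the LiDAR BEV image for a 3-DOF (x, y, yaw) estimate, the sketch below runs a brute-force yaw search combined with phase correlation. The search range, grid resolution, and use of OpenCV's phase correlation are assumptions; the paper's registration pipeline may differ.

```python
# Hedged 3-DOF BEV registration sketch: rotate the CycleGAN-translated radar BEV
# over a yaw grid and pick the yaw whose phase-correlation response is highest.
# Yaw range, 0.2 m/pixel resolution, and image sizes are illustrative assumptions.
import cv2
import numpy as np

def register_bev(radar_like_lidar_bev, lidar_bev,
                 yaw_search_deg=np.arange(-10.0, 10.5, 0.5), meters_per_pixel=0.2):
    h, w = lidar_bev.shape
    src = radar_like_lidar_bev.astype(np.float32)
    dst = lidar_bev.astype(np.float32)
    best = (0.0, (0.0, 0.0), -np.inf)  # (yaw_deg, (dx_px, dy_px), response)
    for yaw in yaw_search_deg:
        rot = cv2.getRotationMatrix2D((w / 2.0, h / 2.0), yaw, 1.0)
        rotated = cv2.warpAffine(src, rot, (w, h))
        (dx, dy), response = cv2.phaseCorrelate(rotated, dst)
        if response > best[2]:
            best = (yaw, (dx, dy), response)
    yaw, (dx, dy), _ = best
    return dx * meters_per_pixel, dy * meters_per_pixel, yaw  # translation (m), yaw (deg)
```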