Danwei Wang

MIL-LC: A Robust Magnetometer-Inertial-LiDAR Fusion Multimodal Localization Framework

Jun 24, 2026

Qiyang Lyu, Zhenyu Wu, Wei Wang, Hongming Shen, Danwei Wang

Abstract:Localization in challenging environments, such as GNSS-denied, geometrically repetitive, or textureless scenes commonly found in offices, hotels, and underground parking facilities, remains an open problem for reliable autonomous mobile robot (AMR) deployment. Single-modality localization methods are inherently limited by the constraints of individual sensors. Although multimodal fusion frameworks have shown improved robustness, most existing approaches still rely heavily on geometric or texture features, or on infrastructure-based beacons, which increase installation and maintenance costs while reducing deployment flexibility. Recently, ambient magnetic field (AMF)-based localization has attracted growing attention because it does not depend on geometric or texture features, nor does it require additional infrastructure, making it a promising complementary modality for AMR localization. However, existing studies have only explored such fusion in pedestrian scenarios using smartphone-mounted sensor suites, and practical solutions for AMR systems remain largely unexplored. To address this gap, this article proposes a magnetometer-inertial-LiDAR fused multimodal localization framework with a custom-designed sensor suite, termed MIL-LC, which provides reliable localization even when LiDAR suffers from geometric degeneration or when the magnetic map changes during long-term deployment. Extensive experiments in both simulation and real-world environments demonstrate that the proposed MIL-LC framework achieves robust and accurate localization performance.

RoSLAC: Robust Simultaneous Localization and Calibration of Multiple Magnetometers

Apr 15, 2026

Qiyang Lyu, Zhenyu Wu, Wei Wang, Hongming Shen, Danwei Wang

Abstract:Localization of autonomous mobile robots (AMRs) in enclosed or semi-enclosed environments such as offices, hotels, hospitals, indoor parking facilities, and underground spaces where GPS signals are weak or unavailable remains a major obstacle to the deployment of fully autonomous systems. Infrastructure-based localization approaches, such as QR codes and RFID, are constrained by high installation and maintenance costs as well as limited flexibility, while onboard sensor-based methods, including LiDAR- and vision-based solutions, are affected by ambiguous geometric features and frequent occlusions caused by dynamic obstacles such as pedestrians. Ambient magnetic field (AMF)-based localization has therefore attracted growing interest in recent years because it does not rely on external infrastructure or geometric features, making it well-suited for AMR applications such as service robots and security robots. However, magnetometer measurements are often corrupted by distortions caused by ferromagnetic materials present on the sensor platform, which bias the AMF and degrade localization reliability. As a result, accurate magnetometer calibration to estimate distortion parameters becomes essential. Conventional calibration methods that rely on rotating the magnetometer are impractical for large and heavy platforms. To address this limitation, this paper proposes a robust simultaneous localization and calibration (RoSLAC) approach based on alternating optimization, which iteratively and efficiently estimates both the platform pose and magnetometer calibration parameters. Extensive evaluations conducted in high-fidelity simulation and real-world environments demonstrate that the proposed RoSLAC method achieves high localization accuracy while maintaining low computational cost compared with state-of-the-art magnetometer calibration techniques.
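
The alternating structure mentioned in the abstract can be illustrated with a toy sketch. This is not the RoSLAC algorithm: the 1-D sinusoidal `field_map`, the single constant bias, the exact odometry offsets, the grid search, and all numeric values are assumptions made only for illustration, and the robust (outlier-handling) part is omitted. The sketch alternates between a pose step (bias held fixed) and a calibration step (pose held fixed).

```python
import math


def field_map(x):
    """Hypothetical 1-D magnetic map: field value at position x."""
    return math.sin(x)


def alternate(measurements, offsets, iters=10, step=0.01):
    """Toy alternating optimization over a start position x0 and a bias.

    measurements[i] is modeled as field_map(x0 + offsets[i]) + bias.
    """
    bias = 0.0
    x0 = 0.0
    # Search only one period of the map, since sin() is 2*pi-periodic.
    n_grid = int(2 * math.pi / step)
    for _ in range(iters):
        # Pose step: with the bias fixed, pick the grid position that
        # minimizes the squared map-matching error.
        best_cost, best_x = None, x0
        for k in range(n_grid):
            cand = k * step
            cost = sum(
                (m - bias - field_map(cand + d)) ** 2
                for m, d in zip(measurements, offsets)
            )
            if best_cost is None or cost < best_cost:
                best_cost, best_x = cost, cand
        x0 = best_x
        # Calibration step: with the pose fixed, the least-squares bias
        # is the mean residual between measurements and the map.
        bias = sum(
            m - field_map(x0 + d) for m, d in zip(measurements, offsets)
        ) / len(measurements)
    return x0, bias


# Synthetic, noise-free data generated from assumed ground-truth values.
true_x0, true_bias = 2.0, 0.4
offsets = [0.5 * i for i in range(20)]
measurements = [field_map(true_x0 + d) + true_bias for d in offsets]

x0_est, bias_est = alternate(measurements, offsets)
print(x0_est, bias_est)
```

In this toy setting each step is cheap because one block of unknowns is held fixed at a time, which is the general appeal of alternating schemes. Real calibration involves richer distortion models and 3-D poses.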

UniPR-3D: Towards Universal Visual Place Recognition with Visual Geometry Grounded Transformer

Dec 24, 2025

Tianchen Deng, Xun Chen, Ziming Li, Hongming Shen, Danwei Wang, Javier Civera, Hesheng Wang

Abstract:Visual Place Recognition (VPR) has been traditionally formulated as a single-image retrieval task. Using multiple views offers clear advantages, yet this setting remains relatively underexplored and existing methods often struggle to generalize across diverse environments. In this work, we introduce UniPR-3D, the first VPR architecture that effectively integrates information from multiple views. UniPR-3D builds on a VGGT backbone capable of encoding multi-view 3D representations, which we adapt by designing feature aggregators and fine-tune for the place recognition task. To construct our descriptor, we jointly leverage the 3D tokens and intermediate 2D tokens produced by VGGT. Based on their distinct characteristics, we design dedicated aggregation modules for 2D and 3D features, allowing our descriptor to capture fine-grained texture cues while also reasoning across viewpoints. To further enhance generalization, we incorporate both single- and multi-frame aggregation schemes, along with a variable-length sequence retrieval strategy. Our experiments show that UniPR-3D sets a new state of the art, outperforming both single- and multi-view baselines and highlighting the effectiveness of geometry-grounded tokens for VPR. Our code and models will be made publicly available on GitHub at https://github.com/dtc111111/UniPR-3D.

UniLGL: Learning Uniform Place Recognition for FOV-limited/Panoramic LiDAR Global Localization

Jul 16, 2025

Hongming Shen, Xun Chen, Yulin Hui, Zhenyu Wu, Wei Wang, Qiyang Lyu, Tianchen Deng, Danwei Wang

Abstract:Existing LiDAR global localization (LGL) methods typically consider only partial information (e.g., geometric features) from LiDAR observations or are designed for homogeneous LiDAR sensors, overlooking the uniformity in LGL. In this work, a uniform LGL method is proposed, termed UniLGL, which simultaneously achieves spatial and material uniformity, as well as sensor-type uniformity. The key idea of the proposed method is to encode the complete point cloud, which contains both geometric and material information, into a pair of BEV images (i.e., a spatial BEV image and an intensity BEV image). An end-to-end multi-BEV fusion network is designed to extract uniform features, equipping UniLGL with spatial and material uniformity. To ensure robust LGL across heterogeneous LiDAR sensors, a viewpoint invariance hypothesis is introduced, which replaces the conventional translation equivariance assumption commonly used in existing LPR networks and supervises UniLGL to achieve sensor-type uniformity in both global descriptors and local feature representations. Finally, based on the mapping between local features on the 2D BEV image and the point cloud, a robust global pose estimator is derived that determines the global minimum of the global pose on SE(3) without requiring additional registration. To validate the effectiveness of the proposed uniform LGL, extensive benchmarks are conducted in real-world environments, and the results show that the proposed UniLGL is demonstrably competitive with other state-of-the-art LGL methods. Furthermore, UniLGL has been deployed on diverse platforms, including full-size trucks and agile Micro Aerial Vehicles (MAVs), to enable high-precision localization and mapping as well as multi-MAV collaborative exploration in port and forest environments, demonstrating the applicability of UniLGL in industrial and field scenarios.
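
The idea of encoding a point cloud into a pair of BEV images can be sketched as follows. This is a minimal illustration, not the UniLGL encoding: the grid extents, the cell size, and the choice of storing the maximum height and maximum intensity per cell are assumptions, and the function name `encode_bev` is hypothetical.

```python
def encode_bev(points, x_range=(-4.0, 4.0), y_range=(-4.0, 4.0), cell=1.0):
    """Project (x, y, z, intensity) points onto two bird's-eye-view grids.

    Returns (spatial, intensity): row-major lists of lists where each cell
    holds the maximum height / maximum intensity of the points inside it.
    Cells start at 0.0, so negative heights are clipped in this sketch.
    """
    cols = int((x_range[1] - x_range[0]) / cell)
    rows = int((y_range[1] - y_range[0]) / cell)
    spatial = [[0.0] * cols for _ in range(rows)]
    intensity = [[0.0] * cols for _ in range(rows)]
    for x, y, z, inten in points:
        # Skip points that fall outside the grid footprint.
        if not (x_range[0] <= x < x_range[1] and y_range[0] <= y < y_range[1]):
            continue
        # Clamp indices to guard against floating-point edge cases.
        c = min(int((x - x_range[0]) / cell), cols - 1)
        r = min(int((y - y_range[0]) / cell), rows - 1)
        spatial[r][c] = max(spatial[r][c], z)
        intensity[r][c] = max(intensity[r][c], inten)
    return spatial, intensity


# Made-up sample points: (x, y, z, intensity).
points = [
    (0.5, 0.5, 1.2, 0.8),
    (0.6, 0.4, 2.0, 0.3),   # same cell as the first point
    (-3.5, 2.2, 0.5, 0.9),
    (10.0, 0.0, 1.0, 1.0),  # outside the grid, ignored
]
spatial_bev, intensity_bev = encode_bev(points)
print(spatial_bev[4][4], intensity_bev[4][4])
```

Keeping geometry and intensity in separate images of the same shape lets a downstream network treat them as aligned channels, which matches the "pair of BEV images" wording in the abstract.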

VPGS-SLAM: Voxel-based Progressive 3D Gaussian SLAM in Large-Scale Scenes

May 25, 2025

Tianchen Deng, Wenhua Wu, Junjie He, Yue Pan, Xirui Jiang, Shenghai Yuan, Danwei Wang, Hesheng Wang, Weidong Chen

Abstract:3D Gaussian Splatting has recently shown promising results in dense visual SLAM. However, existing 3DGS-based SLAM methods are all constrained to small-room scenarios and struggle with memory explosion in large-scale scenes and long sequences. To this end, we propose VPGS-SLAM, the first 3DGS-based large-scale RGBD SLAM framework for both indoor and outdoor scenarios. We design a novel voxel-based progressive 3D Gaussian mapping method with multiple submaps for compact and accurate scene representation in large-scale and long-sequence scenes. This allows us to scale up to arbitrary scenes and improves robustness (even under pose drifts). In addition, we propose a 2D-3D fusion camera tracking method to achieve robust and accurate camera tracking in both indoor and outdoor large-scale scenes. Furthermore, we design a 2D-3D Gaussian loop closure method to eliminate pose drift. We further propose a submap fusion method with online distillation to achieve global consistency in large-scale scenes when detecting a loop. Experiments on various indoor and outdoor datasets demonstrate the superiority and generalizability of the proposed framework. The code will be open-sourced at https://github.com/dtc111111/vpgs-slam.

ABCDWaveNet: Advancing Robust Road Ponding Detection in Fog through Dynamic Frequency-Spatial Synergy

Apr 07, 2025

Ronghui Zhang, Dakang Lyu, Tengfei Li, Yunfan Wu, Ujjal Manandhar, Benfei Wang, Junzhou Chen, Bolin Gao, Danwei Wang, Yiqiu Tan

Abstract:Road ponding presents a significant threat to vehicle safety, particularly in adverse fog conditions, where reliable detection remains a persistent challenge for Advanced Driver Assistance Systems (ADAS). To address this, we propose ABCDWaveNet, a novel deep learning framework leveraging Dynamic Frequency-Spatial Synergy for robust ponding detection in fog. The core of ABCDWaveNet achieves this synergy by integrating dynamic convolution for adaptive feature extraction across varying visibilities with a wavelet-based module for synergistic frequency-spatial feature enhancement, significantly improving robustness against fog interference. Building on this foundation, ABCDWaveNet captures multi-scale structural and contextual information, subsequently employing an Adaptive Attention Coupling Gate (AACG) to adaptively fuse global and local features for enhanced accuracy. To facilitate realistic evaluations under combined adverse conditions, we introduce the Foggy Low-Light Puddle dataset. Extensive experiments demonstrate that ABCDWaveNet establishes new state-of-the-art performance, achieving significant Intersection over Union (IoU) gains of 3.51%, 1.75%, and 1.03% on the Foggy-Puddle, Puddle-1000, and our Foggy Low-Light Puddle datasets, respectively. Furthermore, its processing speed of 25.48 FPS on an NVIDIA Jetson AGX Orin confirms its suitability for ADAS deployment. These findings underscore the effectiveness of the proposed Dynamic Frequency-Spatial Synergy within ABCDWaveNet, offering valuable insights for developing proactive road safety solutions capable of operating reliably in challenging weather conditions.
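
For readers unfamiliar with the metric quoted above, Intersection over Union for a binary segmentation mask can be computed as in this minimal sketch. The masks are made-up examples, the function name `iou` is hypothetical, and the convention of returning 1.0 when both masks are empty is an assumption. The paper's exact evaluation protocol (for example, averaging over classes or images) is not stated in the abstract.

```python
def iou(pred, target):
    """Intersection over Union of two binary masks (lists of 0/1 rows)."""
    inter = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Assumed convention: two empty masks count as a perfect match.
    return inter / union if union else 1.0


# Made-up 2x3 masks: 1 marks a "ponding" pixel.
pred_mask = [[1, 1, 0],
             [0, 1, 0]]
true_mask = [[1, 0, 0],
             [0, 1, 1]]
score = iou(pred_mask, true_mask)
print(score)
```

Here two pixels are marked in both masks and four pixels are marked in at least one, so the score is 2 / 4.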

IDF-MFL: Infrastructure-free and Drift-free Magnetic Field Localization for Mobile Robot

Nov 09, 2024

Hongming Shen, Zhenyu Wu, Wei Wang, Qiyang Lyu, Huiqin Zhou, Danwei Wang

Abstract:In recent years, infrastructure-based localization methods have achieved significant progress thanks to their reliable and drift-free localization capability. However, pre-installed infrastructure suffers from inflexibility and high maintenance costs. This poses an interesting problem of how to develop a drift-free localization system without pre-installed infrastructure. In this paper, an infrastructure-free and drift-free localization system, namely IDF-MFL, is proposed using ambient magnetic field (MF) information. IDF-MFL is infrastructure-free thanks to the high distinctiveness of the ambient MF information produced by inherent ferromagnetic objects in the environment, such as steel and reinforced concrete structures of buildings, and underground pipelines. The MF-based localization problem is defined as a stochastic optimization problem that accounts for the non-Gaussian heavy-tailed noise introduced by MF measurement outliers (caused by dynamic ferromagnetic objects), and an outlier-robust state estimation algorithm is derived to find the optimal distribution of the robot state that makes the expectation of the MF matching cost achieve its lower bound. The proposed method is evaluated in multiple scenarios, including high-fidelity simulation and real-world environments. The results demonstrate that the proposed method can achieve high-accuracy, reliable, and real-time localization without any pre-installed infrastructure.

GrabDAE: An Innovative Framework for Unsupervised Domain Adaptation Utilizing Grab-Mask and Denoise Auto-Encoder

Oct 10, 2024

Junzhou Chen, Xuan Wen, Ronghui Zhang, Bingtao Ren, Di Wu, Zhigang Xu, Danwei Wang

Abstract:Unsupervised Domain Adaptation (UDA) aims to adapt a model trained on a labeled source domain to an unlabeled target domain by addressing the domain shift. Existing UDA methods often fall short in fully leveraging contextual information from the target domain, leading to suboptimal decision boundary separation during source and target domain alignment. To address this, we introduce GrabDAE, an innovative UDA framework designed to tackle domain shift in visual classification tasks. GrabDAE incorporates two key innovations: the Grab-Mask module, which blurs background information in target domain images, enabling the model to focus on essential, domain-relevant features through contrastive learning; and the Denoising Auto-Encoder (DAE), which enhances feature alignment by reconstructing features and filtering noise, ensuring a more robust adaptation to the target domain. These components empower GrabDAE to effectively handle unlabeled target domain data, significantly improving both classification accuracy and robustness. Extensive experiments on benchmark datasets, including VisDA-2017, Office-Home, and Office-31, demonstrate that GrabDAE consistently surpasses state-of-the-art UDA methods, setting new performance benchmarks. By tackling UDA's critical challenges with its novel feature masking and denoising approach, GrabDAE offers both significant theoretical and practical advancements in domain adaptation.

CTE-MLO: Continuous-time and Efficient Multi-LiDAR Odometry with Localizability-aware Point Cloud Sampling

Aug 09, 2024

Hongming Shen, Zhenyu Wu, Wei Wang, Qiyang Lyu, Huiqin Zhou, Tianchen Deng, Yeqing Zhu, Danwei Wang

Abstract:In recent years, LiDAR-based localization and mapping methods have achieved significant progress thanks to their reliable and real-time localization capability. Considering that single-LiDAR odometry often faces hardware failures and degradation in practical scenarios, Multi-LiDAR Odometry (MLO), as an emerging technology, is studied to enhance the performance of LiDAR-based localization and mapping systems. However, MLO can suffer from high computational complexity introduced by dense point clouds that are fused from multiple LiDARs, and the continuous-time measurement characteristic is commonly neglected by existing LiDAR odometry. This motivates us to develop a Continuous-Time and Efficient MLO, namely CTE-MLO, which can achieve accurate and real-time state estimation using multi-LiDAR measurements through a continuous-time perspective. In this paper, the Gaussian process estimation is naturally combined with the Kalman filter, which enables each LiDAR point in a point stream to query the corresponding continuous-time trajectory within its time instants. A decentralized multi-LiDAR synchronization scheme is also devised to combine points from separate LiDARs into a single point cloud without the requirement for primary LiDAR assignment. Moreover, with the aim of improving the real-time performance of MLO without sacrificing robustness, a point cloud sampling strategy is designed with the consideration of localizability. The effectiveness of the proposed method is demonstrated through various scenarios, including public datasets and real-world autonomous driving experiments. The results demonstrate that the proposed CTE-MLO can achieve high-accuracy continuous-time state estimation in real time and is demonstrably competitive with other state-of-the-art methods.

Incremental Joint Learning of Depth, Pose and Implicit Scene Representation on Monocular Camera in Large-scale Scenes

Apr 09, 2024

Tianchen Deng, Nailin Wang, Chongdi Wang, Shenghai Yuan, Jingchuan Wang, Danwei Wang, Weidong Chen

Abstract:Dense scene reconstruction for photo-realistic view synthesis has various applications, such as VR/AR and autonomous vehicles. However, most existing methods have difficulties in large-scale scenes due to three core challenges: (a) inaccurate depth input. Accurate depth input is impossible to obtain in real-world large-scale scenes. (b) inaccurate pose estimation. Most existing approaches rely on accurate pre-estimated camera poses. (c) insufficient scene representation capability. A single global radiance field lacks the capacity to effectively scale to large-scale scenes. To this end, we propose an incremental joint learning framework, which can achieve accurate depth, pose estimation, and large-scale scene reconstruction. A vision transformer-based network is adopted as the backbone to enhance performance in scale information estimation. For pose estimation, a feature-metric bundle adjustment (FBA) method is designed for accurate and robust camera tracking in large-scale scenes. In terms of implicit scene representation, we propose an incremental scene representation method to construct the entire large-scale scene as multiple local radiance fields to enhance the scalability of 3D scene representation. Extensive experiments have been conducted to demonstrate the effectiveness and accuracy of our method in depth estimation, pose estimation, and large-scale scene reconstruction.
