Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Patric Jensfelt

Quantifying Epistemic Uncertainty in Absolute Pose Regression

Apr 09, 2025

Fereidoon Zangeneh, Amit Dekel, Alessandro Pieropan, Patric Jensfelt

Abstract:Visual relocalization is the task of estimating the camera pose given an image it views. Absolute pose regression offers a solution to this task by training a neural network, directly regressing the camera pose from image features. While an attractive solution in terms of memory and compute efficiency, absolute pose regression's predictions are inaccurate and unreliable outside the training domain. In this work, we propose a novel method for quantifying the epistemic uncertainty of an absolute pose regression model by estimating the likelihood of observations within a variational framework. Beyond providing a measure of confidence in predictions, our approach offers a unified model that also handles observation ambiguities, probabilistically localizing the camera in the presence of repetitive structures. Our method outperforms existing approaches in capturing the relation between uncertainty and prediction error.

Via

Access Paper or Ask Questions

SSF: Sparse Long-Range Scene Flow for Autonomous Driving

Jan 29, 2025

Ajinkya Khoche, Qingwen Zhang, Laura Pereira Sanchez, Aron Asefaw, Sina Sharif Mansouri, Patric Jensfelt

Figure 1 for SSF: Sparse Long-Range Scene Flow for Autonomous Driving

Figure 2 for SSF: Sparse Long-Range Scene Flow for Autonomous Driving

Figure 3 for SSF: Sparse Long-Range Scene Flow for Autonomous Driving

Figure 4 for SSF: Sparse Long-Range Scene Flow for Autonomous Driving

Abstract:Scene flow enables an understanding of the motion characteristics of the environment in the 3D world. It gains particular significance in the long-range, where object-based perception methods might fail due to sparse observations far away. Although significant advancements have been made in scene flow pipelines to handle large-scale point clouds, a gap remains in scalability with respect to long-range. We attribute this limitation to the common design choice of using dense feature grids, which scale quadratically with range. In this paper, we propose Sparse Scene Flow (SSF), a general pipeline for long-range scene flow, adopting a sparse convolution based backbone for feature extraction. This approach introduces a new challenge: a mismatch in size and ordering of sparse feature maps between time-sequential point scans. To address this, we propose a sparse feature fusion scheme, that augments the feature maps with virtual voxels at missing locations. Additionally, we propose a range-wise metric that implicitly gives greater importance to faraway points. Our method, SSF, achieves state-of-the-art results on the Argoverse2 dataset, demonstrating strong performance in long-range scene flow estimation. Our code will be released at https://github.com/KTH-RPL/SSF.git.

* 7 pages, 3 figures, accepted to International Conference on Robotics and Automation (ICRA) 2025

Via

Access Paper or Ask Questions

Conditional Variational Autoencoders for Probabilistic Pose Regression

Oct 07, 2024

Fereidoon Zangeneh, Leonard Bruns, Amit Dekel, Alessandro Pieropan, Patric Jensfelt

Abstract:Robots rely on visual relocalization to estimate their pose from camera images when they lose track. One of the challenges in visual relocalization is repetitive structures in the operation environment of the robot. This calls for probabilistic methods that support multiple hypotheses for robot's pose. We propose such a probabilistic method to predict the posterior distribution of camera poses given an observed image. Our proposed training strategy results in a generative model of camera poses given an image, which can be used to draw samples from the pose posterior distribution. Our method is streamlined and well-founded in theory and outperforms existing methods on localization in presence of ambiguities.

* Accepted at IROS 2024

Via

Access Paper or Ask Questions

Fusion in Context: A Multimodal Approach to Affective State Recognition

Sep 18, 2024

Youssef Mohamed, Severin Lemaignan, Arzu Guneysu, Patric Jensfelt, Christian Smith

Figure 1 for Fusion in Context: A Multimodal Approach to Affective State Recognition

Figure 2 for Fusion in Context: A Multimodal Approach to Affective State Recognition

Figure 3 for Fusion in Context: A Multimodal Approach to Affective State Recognition

Figure 4 for Fusion in Context: A Multimodal Approach to Affective State Recognition

Abstract:Accurate recognition of human emotions is a crucial challenge in affective computing and human-robot interaction (HRI). Emotional states play a vital role in shaping behaviors, decisions, and social interactions. However, emotional expressions can be influenced by contextual factors, leading to misinterpretations if context is not considered. Multimodal fusion, combining modalities like facial expressions, speech, and physiological signals, has shown promise in improving affect recognition. This paper proposes a transformer-based multimodal fusion approach that leverages facial thermal data, facial action units, and textual context information for context-aware emotion recognition. We explore modality-specific encoders to learn tailored representations, which are then fused using additive fusion and processed by a shared transformer encoder to capture temporal dependencies and interactions. The proposed method is evaluated on a dataset collected from participants engaged in a tangible tabletop Pacman game designed to induce various affective states. Our results demonstrate the effectiveness of incorporating contextual information and multimodal fusion for affective state recognition.

Via

Access Paper or Ask Questions

ExelMap: Explainable Element-based HD-Map Change Detection and Update

Sep 16, 2024

Lena Wild, Ludvig Ericson, Rafael Valencia, Patric Jensfelt

Figure 1 for ExelMap: Explainable Element-based HD-Map Change Detection and Update

Figure 2 for ExelMap: Explainable Element-based HD-Map Change Detection and Update

Figure 3 for ExelMap: Explainable Element-based HD-Map Change Detection and Update

Figure 4 for ExelMap: Explainable Element-based HD-Map Change Detection and Update

Abstract:Acquisition and maintenance are central problems in deploying high-definition (HD) maps for autonomous driving, with two lines of research prevalent in current literature: Online HD map generation and HD map change detection. However, the generated map's quality is currently insufficient for safe deployment, and many change detection approaches fail to precisely localize and extract the changed map elements, hence lacking explainability and hindering a potential fleet-based cooperative HD map update. In this paper, we propose the novel task of explainable element-based HD map change detection and update. In extending recent approaches that use online mapping techniques informed with an outdated map prior for HD map updating, we present ExelMap, an explainable element-based map updating strategy that specifically identifies changed map elements. In this context, we discuss how currently used metrics fail to capture change detection performance, while allowing for unfair comparison between prior-less and prior-informed map generation methods. Finally, we present an experimental study on real-world changes related to pedestrian crossings of the Argoverse 2 Map Change Dataset. To the best of our knowledge, this is the first comprehensive problem investigation of real-world end-to-end element-based HD map change detection and update, and ExelMap the first proposed solution.

* 17 pages, 3 figures

Via

Access Paper or Ask Questions

HD-maps as Prior Information for Globally Consistent Mapping in GPS-denied Environments

Jul 28, 2024

Waqas Ali, Patric Jensfelt, Thien-Minh Nguyen

Abstract:In recent years, prior maps have become a mainstream tool in autonomous navigation. However, commonly available prior maps are still tailored to control-and-decision tasks, and the use of these maps for localization remains largely unexplored. To bridge this gap, we propose a lidar-based localization and mapping (LOAM) system that can exploit the common HD-maps in autonomous driving scenarios. Specifically, we propose a technique to extract information from the drivable area and ground surface height components of the HD-maps to construct 4DOF pose priors. These pose priors are then further integrated into the pose-graph optimization problem to create a globally consistent 3D map. Experiments show that our scheme can significantly improve the global consistency of the map compared to state-of-the-art lidar-only approaches, proven to be a useful technology to enhance the system's robustness, especially in GPS-denied environment. Moreover, our work also serves as a first step towards long-term navigation of robots in familiar environment, by updating a map. In autonomous driving this could enable updating the HD-maps without sourcing a new from a third party company, which is expensive and introduces delays from change in the world to updated map.

Via

Access Paper or Ask Questions

SeFlow: A Self-Supervised Scene Flow Method in Autonomous Driving

Jul 01, 2024

Qingwen Zhang, Yi Yang, Peizheng Li, Olov Andersson, Patric Jensfelt

Figure 1 for SeFlow: A Self-Supervised Scene Flow Method in Autonomous Driving

Figure 2 for SeFlow: A Self-Supervised Scene Flow Method in Autonomous Driving

Figure 3 for SeFlow: A Self-Supervised Scene Flow Method in Autonomous Driving

Figure 4 for SeFlow: A Self-Supervised Scene Flow Method in Autonomous Driving

Abstract:Scene flow estimation predicts the 3D motion at each point in successive LiDAR scans. This detailed, point-level, information can help autonomous vehicles to accurately predict and understand dynamic changes in their surroundings. Current state-of-the-art methods require annotated data to train scene flow networks and the expense of labeling inherently limits their scalability. Self-supervised approaches can overcome the above limitations, yet face two principal challenges that hinder optimal performance: point distribution imbalance and disregard for object-level motion constraints. In this paper, we propose SeFlow, a self-supervised method that integrates efficient dynamic classification into a learning-based scene flow pipeline. We demonstrate that classifying static and dynamic points helps design targeted objective functions for different motion patterns. We also emphasize the importance of internal cluster consistency and correct object point association to refine the scene flow estimation, in particular on object details. Our real-time capable method achieves state-of-the-art performance on the self-supervised scene flow task on Argoverse 2 and Waymo datasets. The code is open-sourced at https://github.com/KTH-RPL/SeFlow along with trained model weights.

* 25 pages (14 main pages + 11 supp materail), 5 figures

Via

Access Paper or Ask Questions

Beyond the Frontier: Predicting Unseen Walls from Occupancy Grids by Learning from Floor Plans

Jun 13, 2024

Ludvig Ericson, Patric Jensfelt

Abstract:In this paper, we tackle the challenge of predicting the unseen walls of a partially observed environment as a set of 2D line segments, conditioned on occupancy grids integrated along the trajectory of a 360{\deg} LIDAR sensor. A dataset of such occupancy grids and their corresponding target wall segments is collected by navigating a virtual robot between a set of randomly sampled waypoints in a collection of office-scale floor plans from a university campus. The line segment prediction task is formulated as an autoregressive sequence prediction task, and an attention-based deep network is trained on the dataset. The sequence-based autoregressive formulation is evaluated through predicted information gain, as in frontier-based autonomous exploration, demonstrating significant improvements over both non-predictive estimation and convolution-based image prediction found in the literature. Ablations on key components are evaluated, as well as sensor range and the occupancy grid's metric area. Finally, model generality is validated by predicting walls in a novel floor plan reconstructed on-the-fly in a real-world office environment.

* IEEE Robotics and Automation Letters (2024) pp. 2377-3766
* RA-L, 8 pages

Via

Access Paper or Ask Questions

BeautyMap: Binary-Encoded Adaptable Ground Matrix for Dynamic Points Removal in Global Maps

May 12, 2024

Mingkai Jia, Qingwen Zhang, Bowen Yang, Jin Wu, Ming Liu, Patric Jensfelt

Figure 1 for BeautyMap: Binary-Encoded Adaptable Ground Matrix for Dynamic Points Removal in Global Maps

Figure 2 for BeautyMap: Binary-Encoded Adaptable Ground Matrix for Dynamic Points Removal in Global Maps

Figure 3 for BeautyMap: Binary-Encoded Adaptable Ground Matrix for Dynamic Points Removal in Global Maps

Figure 4 for BeautyMap: Binary-Encoded Adaptable Ground Matrix for Dynamic Points Removal in Global Maps

Abstract:Global point clouds that correctly represent the static environment features can facilitate accurate localization and robust path planning. However, dynamic objects introduce undesired ghost tracks that are mixed up with the static environment. Existing dynamic removal methods normally fail to balance the performance in computational efficiency and accuracy. In response, we present BeautyMap to efficiently remove the dynamic points while retaining static features for high-fidelity global maps. Our approach utilizes a binary-encoded matrix to efficiently extract the environment features. With a bit-wise comparison between matrices of each frame and the corresponding map region, we can extract potential dynamic regions. Then we use coarse to fine hierarchical segmentation of the $z$-axis to handle terrain variations. The final static restoration module accounts for the range-visibility of each single scan and protects static points out of sight. Comparative experiments underscore BeautyMap's superior performance in both accuracy and efficiency against other dynamic points removal methods. The code is open-sourced at https://github.com/MKJia/BeautyMap.

* The first two authors are co-first authors. 8 pages, accepted by RA-L

Via

Access Paper or Ask Questions

Neural Graph Mapping for Dense SLAM with Efficient Loop Closure

May 06, 2024

Leonard Bruns, Jun Zhang, Patric Jensfelt

Abstract:Existing neural field-based SLAM methods typically employ a single monolithic field as their scene representation. This prevents efficient incorporation of loop closure constraints and limits scalability. To address these shortcomings, we propose a neural mapping framework which anchors lightweight neural fields to the pose graph of a sparse visual SLAM system. Our approach shows the ability to integrate large-scale loop closures, while limiting necessary reintegration. Furthermore, we verify the scalability of our approach by demonstrating successful building-scale mapping taking multiple loop closures into account during the optimization, and show that our method outperforms existing state-of-the-art approaches on large scenes in terms of quality and runtime. Our code is available at https://kth-rpl.github.io/neural_graph_mapping/.

* Project page: https://kth-rpl.github.io/neural_graph_mapping/

Via

Access Paper or Ask Questions