Tobias Fischer

Quantile Transfer for Reliable Operating Point Selection in Visual Place Recognition

Feb 04, 2026

Dhyey Manish Rajani, Michael Milford, Tobias Fischer

Abstract:Visual Place Recognition (VPR) is a key component for localisation in GNSS-denied environments, but its performance critically depends on selecting an image matching threshold (operating point) that balances precision and recall. Thresholds are typically hand-tuned offline for a specific environment and fixed during deployment, leading to degraded performance under environmental change. We propose a method that, given a user-defined precision requirement, automatically selects the operating point of a VPR system to maximise recall. The method uses a small calibration traversal with known correspondences and transfers thresholds to deployment via quantile normalisation of similarity score distributions. This quantile transfer ensures that thresholds remain stable across calibration sizes and query subsets, making the method robust to sampling variability. Experiments with multiple state-of-the-art VPR techniques and datasets show that the proposed approach consistently outperforms the state-of-the-art, delivering up to 25% higher recall in high-precision operating regimes. The method eliminates manual tuning by adapting to new environments and generalising across operating conditions. Our code will be released upon acceptance.
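
The calibrate-then-transfer idea in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation (the abstract states the code is not yet released); it is one plausible reading, assuming each query yields a single best-match similarity score, that calibration matches carry a correct/incorrect label, and that scores are distinct. The function names `calibrate_threshold` and `transfer_threshold` are hypothetical.

```python
import numpy as np

def calibrate_threshold(scores, is_correct, target_precision):
    """Return the lowest calibration threshold whose precision meets the target.

    Hypothetical helper: accepts the top-k matches by similarity and keeps the
    largest k (highest recall) that still satisfies the precision requirement.
    Returns None if no k satisfies it.
    """
    order = np.argsort(-scores)                 # descending similarity
    sorted_scores = scores[order]
    hits = np.cumsum(is_correct[order])         # true positives among the top-k
    precision = hits / np.arange(1, len(scores) + 1)
    ok = np.nonzero(precision >= target_precision)[0]
    if ok.size == 0:
        return None
    return sorted_scores[ok[-1]]

def transfer_threshold(calib_scores, calib_threshold, deploy_scores):
    """Map a calibration threshold to deployment through its quantile level.

    Assumption: the threshold's position in the calibration score distribution
    (its empirical quantile) is what carries over, not its raw value.
    """
    q = np.mean(calib_scores <= calib_threshold)   # empirical quantile level
    return np.quantile(deploy_scores, q)           # same level in deployment scores
```

With calibration scores `[0.9, 0.8, 0.7, 0.6, 0.5]`, of which the top three are correct, and a precision target of 1.0, the calibrated threshold is 0.7. If deployment scores are uniformly halved, the transferred threshold shrinks with them instead of staying at 0.7, which would reject every deployment match.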

Robust Scene Coordinate Regression via Geometrically-Consistent Global Descriptors

Dec 19, 2025

Son Tung Nguyen, Tobias Fischer, Alejandro Fontan, Michael Milford

Abstract:Recent learning-based visual localization methods use global descriptors to disambiguate visually similar places, but existing approaches often derive these descriptors from geometric cues alone (e.g., covisibility graphs), limiting their discriminative power and reducing robustness in the presence of noisy geometric constraints. We propose an aggregator module that learns global descriptors consistent with both geometrical structure and visual similarity, ensuring that images are close in descriptor space only when they are visually similar and spatially connected. This corrects erroneous associations caused by unreliable overlap scores. Using a batch-mining strategy based solely on the overlap scores and a modified contrastive loss, our method trains without manual place labels and generalizes across diverse environments. Experiments on challenging benchmarks show substantial localization gains in large-scale environments while preserving computational and memory efficiency. Code is available at https://github.com/sontung/robust_scr.

* WACV 2026 conference paper

ReMoSPLAT: Reactive Mobile Manipulation Control on a Gaussian Splat

Dec 10, 2025

Nicolas Marticorena, Tobias Fischer, Niko Suenderhauf

Abstract:Reactive control can gracefully coordinate the motion of the base and the arm of a mobile manipulator. However, incorporating an accurate representation of the environment to avoid obstacles without involving costly planning remains a challenge. In this work, we present ReMoSPLAT, a reactive controller based on a quadratic program formulation for mobile manipulation that leverages a Gaussian Splat representation for collision avoidance. By integrating additional constraints and costs into the optimisation formulation, a mobile manipulator platform can reach its intended end effector pose while avoiding obstacles, even in cluttered scenes. We investigate the trade-offs of two methods for efficiently calculating robot-obstacle distances, comparing a purely geometric approach with a rasterisation-based approach. Our experiments in simulation on both synthetic and real-world scans demonstrate the feasibility of our method, showing that the proposed approach achieves performance comparable to controllers that rely on perfect ground-truth information.

* 9 pages, 5 figures

Going Places: Place Recognition in Artificial and Natural Systems

Nov 18, 2025

Michael Milford, Tobias Fischer

Abstract:Place recognition, the ability to identify previously visited locations, is critical for both biological navigation and autonomous systems. This review synthesizes findings from robotic systems, animal studies, and human research to explore how different systems encode and recall place. We examine the computational and representational strategies employed across artificial systems, animals, and humans, highlighting convergent solutions such as topological mapping, cue integration, and memory management. Animal systems reveal evolved mechanisms for multimodal navigation and environmental adaptation, while human studies provide unique insights into semantic place concepts, cultural influences, and introspective capabilities. Artificial systems showcase scalable architectures and data-driven models. We propose a unifying set of concepts by which to consider and develop place recognition mechanisms and identify key challenges such as generalization, robustness, and environmental variability. This review aims to foster innovations in artificial localization by connecting future developments in artificial place recognition systems to insights from both animal navigation research and human spatial cognition studies.

* Annual Review of Control, Robotics, and Autonomous Systems 2026, vol. 9

Pixi: Unified Software Development and Distribution for Robotics and AI

Nov 06, 2025

Tobias Fischer, Wolf Vollprecht, Bas Zalmstra, Ruben Arts, Tim de Jager, Alejandro Fontan, Adam D Hines, Michael Milford, Silvio Traversaro, Daniel Claes (+1 more)

Abstract:The reproducibility crisis in scientific computing constrains robotics research. Existing studies reveal that up to 70% of robotics algorithms cannot be reproduced by independent teams, while many others fail to reach deployment because creating shareable software environments remains prohibitively complex. These challenges stem from fragmented, multi-language, and hardware-software toolchains that lead to dependency hell. We present Pixi, a unified package-management framework that addresses these issues by capturing exact dependency states in project-level lockfiles, ensuring bit-for-bit reproducibility across platforms. Its high-performance SAT solver achieves up to 10x faster dependency resolution than comparable tools, while integration of the conda-forge and PyPI ecosystems removes the need for multiple managers. Adopted in over 5,300 projects since 2023, Pixi reduces setup times from hours to minutes and lowers technical barriers for researchers worldwide. By enabling scalable, reproducible, collaborative research infrastructure, Pixi accelerates progress in robotics and AI.

* 20 pages, 3 figures, 11 code snippets

Event-LAB: Towards Standardized Evaluation of Neuromorphic Localization Methods

Sep 18, 2025

Adam D. Hines, Alejandro Fontan, Michael Milford, Tobias Fischer

Abstract:Event-based localization research and datasets are a rapidly growing area of interest, with a tenfold increase in the cumulative total number of published papers on this topic over the past 10 years. Whilst the rapid expansion in the field is exciting, it brings with it an associated challenge: a growth in the variety of required code and package dependencies as well as data formats, making comparisons difficult and cumbersome for researchers to implement reliably. To address this challenge, we present Event-LAB: a new and unified framework for running several event-based localization methodologies across multiple datasets. Event-LAB is implemented using the Pixi package and dependency manager, which enables a single command-line installation and invocation for combinations of localization methods and datasets. To demonstrate the capabilities of the framework, we implement two common event-based localization pipelines: Visual Place Recognition (VPR) and Simultaneous Localization and Mapping (SLAM). We demonstrate the ability of the framework to systematically visualize and analyze the results of multiple methods and datasets, revealing key insights such as the association of parameters that control event collection counts and window sizes for frame generation with large variations in performance. The results and analysis demonstrate the importance of fairly comparing methodologies with consistent event image generation parameters. Our Event-LAB framework provides this ability for the research community by contributing a streamlined workflow for easily setting up multiple conditions.

* 8 pages, 6 figures, under review
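
The two frame-generation parameters named in the abstract (event collection count and window size) can be illustrated with a generic sketch. This is not Event-LAB's API; it assumes events arrive as time-sorted NumPy arrays of timestamps `t` and integer pixel coordinates `x`, `y`, that polarity is ignored, and that a frame is a per-pixel event count. All function names are hypothetical.

```python
import numpy as np

def accumulate(x, y, height, width):
    """Count events per pixel to form one event frame."""
    frame = np.zeros((height, width), dtype=np.int32)
    np.add.at(frame, (y, x), 1)   # unbuffered add so repeated pixels accumulate
    return frame

def frames_by_count(t, x, y, height, width, events_per_frame):
    """One frame per fixed number of events; a trailing partial chunk is dropped."""
    frames = []
    for start in range(0, len(t) - events_per_frame + 1, events_per_frame):
        sl = slice(start, start + events_per_frame)
        frames.append(accumulate(x[sl], y[sl], height, width))
    return frames

def frames_by_time(t, x, y, height, width, window):
    """One frame per fixed time window; empty windows yield all-zero frames."""
    frames = []
    if len(t) == 0:
        return frames
    bins = ((t - t[0]) // window).astype(int)   # window index of each event
    for b in range(bins.max() + 1):
        mask = bins == b
        frames.append(accumulate(x[mask], y[mask], height, width))
    return frames
```

The same event stream produces different frames under the two schemes: fixed-count frames always carry the same number of events, whereas fixed-window frames can be dense, sparse, or empty depending on scene activity. This is one way to see why the abstract argues for comparing methods under consistent frame-generation parameters.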

Are All Marine Species Created Equal? Performance Disparities in Underwater Object Detection

Aug 26, 2025

Melanie Wille, Tobias Fischer, Scarlett Raine

Abstract:Underwater object detection is critical for monitoring marine ecosystems but poses unique challenges, including degraded image quality, imbalanced class distribution, and distinct visual characteristics. Not every species is detected equally well, yet underlying causes remain unclear. We address two key research questions: 1) What factors beyond data quantity drive class-specific performance disparities? 2) How can we systematically improve detection of under-performing marine species? We manipulate the DUO dataset to separate the object detection task into localization and classification and investigate the under-performance of the scallop class. Localization analysis using YOLO11 and TIDE finds that foreground-background discrimination is the most problematic stage regardless of data quantity. Classification experiments reveal persistent precision gaps even with balanced data, indicating intrinsic feature-based challenges beyond data scarcity and inter-class dependencies. We recommend imbalanced distributions when prioritizing precision, and balanced distributions when prioritizing recall. Improving under-performing classes should focus on algorithmic advances, especially within localization modules. We publicly release our code and datasets.

* 10 pages

VSLAM-LAB: A Comprehensive Framework for Visual SLAM Methods and Datasets

Apr 06, 2025

Alejandro Fontan, Tobias Fischer, Javier Civera, Michael Milford

Abstract:Visual Simultaneous Localization and Mapping (VSLAM) research faces significant challenges due to fragmented toolchains, complex system configurations, and inconsistent evaluation methodologies. To address these issues, we present VSLAM-LAB, a unified framework designed to streamline the development, evaluation, and deployment of VSLAM systems. VSLAM-LAB simplifies the entire workflow by enabling seamless compilation and configuration of VSLAM algorithms, automated dataset downloading and preprocessing, and standardized experiment design, execution, and evaluation, all accessible through a single command-line interface. The framework supports a wide range of VSLAM systems and datasets, offering broad compatibility and extendability while promoting reproducibility through consistent evaluation metrics and analysis tools. By reducing implementation complexity and minimizing configuration overhead, VSLAM-LAB empowers researchers to focus on advancing VSLAM methodologies and accelerates progress toward scalable, real-world solutions. We demonstrate the ease with which user-relevant benchmarks can be created: here, we introduce difficulty-level-based categories, but one could envision environment-specific or condition-specific categories.

FlowR: Flowing from Sparse to Dense 3D Reconstructions

Apr 02, 2025

Tobias Fischer, Samuel Rota Bulò, Yung-Hsu Yang, Nikhil Varma Keetha, Lorenzo Porzi, Norman Müller, Katja Schwarz, Jonathon Luiten, Marc Pollefeys, Peter Kontschieder

Abstract:3D Gaussian splatting enables high-quality novel view synthesis (NVS) at real-time frame rates. However, its quality drops sharply as we depart from the training views. Thus, dense captures are needed to match the high-quality expectations of some applications, e.g. Virtual Reality (VR). However, such dense captures are very laborious and expensive to obtain. Existing works have explored using 2D generative models to alleviate this requirement by distillation or generating additional training views. These methods are often conditioned only on a handful of reference input views and thus do not fully exploit the available 3D information, leading to inconsistent generation results and reconstruction artifacts. To tackle this problem, we propose a multi-view, flow matching model that learns a flow to connect novel view renderings from possibly sparse reconstructions to renderings that we expect from dense reconstructions. This enables augmenting scene captures with novel, generated views to improve reconstruction quality. Our model is trained on a novel dataset of 3.6M image pairs and can process up to 45 views at 540x960 resolution (91K tokens) on one H100 GPU in a single forward pass. Our pipeline consistently improves NVS in sparse- and dense-view scenarios, leading to higher-quality reconstructions than prior works across multiple, widely-used NVS benchmarks.

* Project page is available at https://tobiasfshr.github.io/pub/flowr

Improving Visual Place Recognition with Sequence-Matching Receptiveness Prediction

Mar 10, 2025

Somayeh Hussaini, Tobias Fischer, Michael Milford

Abstract:In visual place recognition (VPR), filtering and sequence-based matching approaches can improve performance by integrating temporal information across image sequences, especially in challenging conditions. While these methods are commonly applied, their effects on system behavior can be unpredictable and can actually make performance worse in certain situations. In this work, we present a new supervised learning approach that learns to predict the per-frame sequence matching receptiveness (SMR) of VPR techniques, enabling the system to selectively decide when to trust the output of a sequence matching system. The approach is agnostic to the underlying VPR technique. Our approach predicts SMR, and hence significantly improves VPR performance, across a large range of state-of-the-art and classical VPR techniques (namely CosPlace, MixVPR, EigenPlaces, SALAD, AP-GeM, NetVLAD and SAD), and across three benchmark VPR datasets (Nordland, Oxford RobotCar, and SFU-Mountain). We also provide insights into a complementary approach that uses the predictor to replace discarded matches, as well as ablation studies, including an analysis of the interactions between our SMR predictor and the selected sequence length. We will release our code upon acceptance.

* 8 pages, 5 figures, under review
