Max Peter Ronecker

Vision Foundation Model Embedding-Based Semantic Anomaly Detection

May 12, 2025

Max Peter Ronecker, Matthew Foutter, Amine Elhafsi, Daniele Gammelli, Ihor Barakaiev, Marco Pavone, Daniel Watzenig

Abstract: Semantic anomalies are contextually invalid or unusual combinations of familiar visual elements that can cause undefined behavior and failures in system-level reasoning for autonomous systems. This work explores semantic anomaly detection by leveraging the semantic priors of state-of-the-art vision foundation models, operating directly on the image. We propose a framework that compares local vision embeddings from runtime images to a database of nominal scenarios in which the autonomous system is deemed safe and performant. In this work, we consider two variants of the proposed framework: one using raw grid-based embeddings, and another leveraging instance segmentation for object-centric representations. To further improve robustness, we introduce a simple filtering mechanism to suppress false positives. Our evaluations on CARLA-simulated anomalies show that the instance-based method with filtering achieves performance comparable to GPT-4o, while providing precise anomaly localization. These results highlight the potential utility of vision embeddings from foundation models for real-time anomaly detection in autonomous systems.

* Accepted for the Workshop "Safely Leveraging Vision-Language Foundation Models in Robotics: Challenges and Opportunities" at ICRA 2025
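
The core idea in the abstract above, comparing runtime embeddings against a database of nominal ones, can be illustrated with a minimal sketch. The abstract does not state the similarity measure, the threshold, or the embedding model, so everything below is an assumption made for illustration: cosine similarity, a nearest-neighbor lookup, a threshold of 0.3, toy 3-dimensional vectors, and the hypothetical names `cosine_similarity`, `anomaly_score`, and `flag_anomalies`. It does not reproduce the paper's filtering mechanism or its instance segmentation variant.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def anomaly_score(embedding, nominal_db):
    """Assumed score: 1 minus the best similarity to any nominal embedding."""
    best = max(cosine_similarity(embedding, ref) for ref in nominal_db)
    return 1.0 - best


def flag_anomalies(runtime_embeddings, nominal_db, threshold=0.3):
    """Return indices of local embeddings whose score exceeds the threshold."""
    return [
        i
        for i, emb in enumerate(runtime_embeddings)
        if anomaly_score(emb, nominal_db) > threshold
    ]


# Toy data standing in for patch or instance embeddings (not real model output).
nominal_db = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
runtime = [[0.9, 0.1, 0.0], [0.0, 0.0, 1.0]]
flagged = flag_anomalies(runtime, nominal_db)
```

Because each score is tied to one local embedding, the flagged indices point back to a grid cell or object instance, which is one plausible way such a scheme could support the localization the abstract mentions.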

LiDAR-Guided Monocular 3D Object Detection for Long-Range Railway Monitoring

Apr 25, 2025

Raul David Dominguez Sanchez, Xavier Diaz Ortiz, Xingcheng Zhou, Max Peter Ronecker, Michael Karner, Daniel Watzenig, Alois Knoll

Abstract: Railway systems, particularly in Germany, require high levels of automation to address legacy infrastructure challenges and increase train traffic safely. A key component of automation is robust long-range perception, essential for early hazard detection, such as obstacles at level crossings or pedestrians on tracks. Unlike automotive systems with braking distances of ~70 meters, trains require perception ranges exceeding 1 km. This paper presents a deep-learning-based approach for long-range 3D object detection tailored for autonomous trains. The method relies solely on monocular images, inspired by the Faraway-Frustum approach, and incorporates LiDAR data during training to improve depth estimation. The proposed pipeline consists of four key modules: (1) a modified YOLOv9 for 2.5D object detection, (2) a depth estimation network, and (3-4) dedicated short- and long-range 3D detection heads. Evaluations on the OSDaR23 dataset demonstrate the effectiveness of the approach in detecting objects at distances of up to 250 meters. Results highlight its potential for railway automation and outline areas for future improvement.

* Accepted for the Data-Driven Learning for Intelligent Vehicle Applications Workshop at the 36th IEEE Intelligent Vehicles Symposium (IV) 2025

A Data-Centric Approach to 3D Semantic Segmentation of Railway Scenes

Apr 25, 2025

Nicolas Münger, Max Peter Ronecker, Xavier Diaz, Michael Karner, Daniel Watzenig, Jan Skaloud

Abstract: LiDAR-based semantic segmentation is critical for autonomous trains, requiring accurate predictions across varying distances. This paper introduces two targeted data augmentation methods designed to improve segmentation performance on the railway-specific OSDaR23 dataset. The person instance pasting method enhances segmentation of pedestrians at distant ranges by injecting realistic variations into the dataset. The track sparsification method redistributes point density in LiDAR scans, improving track segmentation at far distances with minimal impact on close-range accuracy. Both methods are evaluated using a state-of-the-art 3D semantic segmentation network, demonstrating significant improvements in distant-range performance while maintaining robustness in close-range predictions. We establish the first 3D semantic segmentation benchmark for OSDaR23, demonstrating the potential of data-centric approaches to address railway-specific challenges in autonomous train perception.

* Accepted at the 28th Computer Vision Winter Workshop 2025

Deep Learning-Driven State Correction: A Hybrid Architecture for Radar-Based Dynamic Occupancy Grid Mapping

May 22, 2024

Max Peter Ronecker, Xavier Diaz, Michael Karner, Daniel Watzenig

Abstract: This paper introduces a novel hybrid architecture that enhances radar-based Dynamic Occupancy Grid Mapping (DOGM) for autonomous vehicles, integrating deep learning for state classification. Traditional radar-based DOGM often faces challenges in accurately distinguishing between static and dynamic objects. Our approach addresses this limitation by introducing a neural network-based DOGM state correction mechanism, designed as a semantic segmentation task, to refine the accuracy of the occupancy grid. Additionally, a heuristic fusion approach is proposed that enhances performance without compromising safety. We extensively evaluate this hybrid architecture on the nuScenes dataset, focusing on its ability to improve dynamic object detection as well as grid quality. The results show clear improvements in the detection capabilities of dynamic objects, highlighting the effectiveness of the deep learning-enhanced state correction in radar-based DOGM.

* Accepted at 35th IEEE Intelligent Vehicles Symposium (IV) 2024

Dynamic Occupancy Grids for Object Detection: A Radar-Centric Approach

Feb 02, 2024

Max Peter Ronecker, Markus Schratter, Lukas Kuschnig, Daniel Watzenig

Abstract: Dynamic Occupancy Grid Mapping is a technique used to generate a local map of the environment containing both static and dynamic information. Typically, these maps are primarily generated using lidar measurements. However, with improvements in radar sensing, resulting in better accuracy and higher resolution, radar is emerging as a viable alternative to lidar as the primary sensor for mapping. In this paper, we propose a radar-centric dynamic occupancy grid mapping algorithm with adaptations to the state computation, inverse sensor model, and field-of-view computation tailored to the specifics of radar measurements. We extensively evaluate our approach using real data to demonstrate its effectiveness and establish the first benchmark for radar-based dynamic occupancy grid mapping using the publicly available RadarScenes dataset.

* Accepted at 2024 IEEE International Conference on Robotics and Automation (ICRA)

Deep Q-Network Based Decision Making for Autonomous Driving

Mar 21, 2023

Max Peter Ronecker, Yuan Zhu

Abstract: Decision making is currently one of the biggest challenges in autonomous driving. This paper introduces a method for safely navigating an autonomous vehicle in highway scenarios by combining Deep Q-Networks and insights from control theory. A Deep Q-Network is trained in simulation to serve as a central decision-making unit by proposing targets for a trajectory planner. The generated trajectories, in combination with a controller for longitudinal movement, are used to execute lane change maneuvers. To demonstrate the functionality of this approach, it is evaluated on two different highway traffic scenarios. Furthermore, the impact of different state representations on the performance and training process is analyzed. The results show that the proposed system can produce efficient and safe driving behavior.

* Accepted at 2019 International Conference on Robotics and Automation Sciences (ICRAS)

Dual-Weight Particle Filter for Radar-Based Dynamic Bayesian Grid Maps

Mar 20, 2023

Max Peter Ronecker, Michael Stolz, Daniel Watzenig

Abstract: Through constant improvements in recent years, radar sensors have become a viable alternative to lidar as the main ranging sensor of an autonomous vehicle. Although radar is robust and can directly measure radial velocity, it brings its own set of challenges, for which existing algorithms need to be adapted. One core algorithm of a perception system is dynamic occupancy grid mapping, which has traditionally relied on lidar. In this paper, we present a dual-weight particle filter as an extension for a Bayesian occupancy grid mapping framework so that it can operate with radar as its main sensor. It uses two separate particle weights that are computed differently to compensate for the fact that a radial velocity measurement in many situations cannot capture the actual velocity of an object. We evaluate the method extensively with simulated data and show the advantages over existing single-weight solutions.

* Accepted at 2023 IEEE International Conference on Mobility: Operations, Services, and Technologies (MOST)
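
The limitation named in the abstract above, that a radial velocity measurement often misses the true velocity, follows from the measurement geometry: a radar observes only the component of an object's velocity along the line of sight. The short sketch below shows that projection under simplifying assumptions that are not taken from the paper: a 2D scene, a stationary sensor at the origin, and a hypothetical helper named `radial_velocity`. It illustrates the geometry only and is not the paper's dual-weight scheme.

```python
import math


def radial_velocity(position, velocity):
    """Component of velocity along the line of sight from a sensor at the origin.

    position: (x, y) of the object in meters, assumed not at the origin.
    velocity: (vx, vy) of the object in meters per second.
    """
    x, y = position
    vx, vy = velocity
    rng = math.hypot(x, y)  # distance from sensor to object
    return (x * vx + y * vy) / rng  # dot product with the unit line-of-sight vector


# Object straight ahead, moving directly away: the full speed is observed.
v_away = radial_velocity((10.0, 0.0), (5.0, 0.0))

# Same object crossing the line of sight: the radar sees no radial motion,
# although the object moves at 5 m/s.
v_crossing = radial_velocity((10.0, 0.0), (0.0, 5.0))

# Oblique case: only part of the speed is observed.
v_oblique = radial_velocity((3.0, 4.0), (5.0, 0.0))
```

In the crossing case the measured value is zero even though the object is moving, so a weighting scheme that relies only on the radial measurement cannot tell such an object from a static one.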
