Marc Toussaint

Variance-Reduced Model Predictive Path Integral via Quadratic Model Approximation

Feb 03, 2026

Fabian Schramm, Franki Nguimatsia Tiofack, Nicolas Perrin-Gilbert, Marc Toussaint, Justin Carpentier

Abstract:Sampling-based controllers, such as Model Predictive Path Integral (MPPI) methods, offer substantial flexibility but often suffer from high variance and low sample efficiency. To address these challenges, we introduce a hybrid variance-reduced MPPI framework that integrates a prior model into the sampling process. Our key insight is to decompose the objective function into a known approximate model and a residual term. Since the residual captures only the discrepancy between the model and the objective, it typically exhibits a smaller magnitude and lower variance than the original objective. Although this principle applies to general modeling choices, we demonstrate that adopting a quadratic approximation enables the derivation of a closed-form, model-guided prior that effectively concentrates samples in informative regions. Crucially, the framework is agnostic to the source of geometric information, allowing the quadratic model to be constructed from exact derivatives, structural approximations (e.g., Gauss- or Quasi-Newton), or gradient-free randomized smoothing. We validate the approach on standard optimization benchmarks, a nonlinear, underactuated cart-pole control task, and a contact-rich manipulation problem with non-smooth dynamics. Across these domains, we achieve faster convergence and superior performance in low-sample regimes compared to standard MPPI. These results suggest that the method can make sample-based control strategies more practical in scenarios where obtaining samples is expensive or limited.

Masked Registration and Autoencoding of CT Images for Predictive Tibia Reconstruction

Dec 10, 2025

Hongyou Zhou, Cederic Aßmann, Alaa Bejaoui, Heiko Tzschätzsch, Mark Heyland, Julian Zierke, Niklas Tuttle, Sebastian Hölzl, Timo Auer, David A. Back(+1 more)

Abstract:Surgical planning for complex tibial fractures can be challenging for surgeons, as the 3D structure of the desired bone alignment may be difficult to visualize. To assist in such planning, we address the challenge of predicting a patient-specific reconstruction target from a CT of the fractured tibia. Our approach combines neural registration and autoencoder models. Specifically, we first train a modified spatial transformer network (STN) to register a raw CT to a standardized coordinate system of a jointly trained tibia prototype. Subsequently, various autoencoder (AE) architectures are trained to model healthy tibial variations. Both the STN and AE models are further designed to be robust to masked input, allowing us to apply them to fractured CTs and decode to a prediction of the patient-specific healthy bone in standard coordinates. Our contributions include: i) a 3D-adapted STN for global spatial registration, ii) a comparative analysis of AEs for bone CT modeling, and iii) the extension of both to handle masked inputs for predictive generation of healthy bone structures. Project page: https://github.com/HongyouZhou/repair

* DGM4MICCAI

CrazyMARL: Decentralized Direct Motor Control Policies for Cooperative Aerial Transport of Cable-Suspended Payloads

Sep 17, 2025

Viktor Lorentz, Khaled Wahba, Sayantan Auddy, Marc Toussaint, Wolfgang Hönig

Abstract:Collaborative transportation of cable-suspended payloads by teams of Unmanned Aerial Vehicles (UAVs) has the potential to enhance payload capacity, adapt to different payload shapes, and provide built-in compliance, making it attractive for applications ranging from disaster relief to precision logistics. However, multi-UAV coordination under disturbances, nonlinear payload dynamics, and slack-taut cable modes remains a challenging control problem. To our knowledge, no prior work has addressed these cable mode transitions in the multi-UAV context, instead relying on simplifying rigid-link assumptions. We propose CrazyMARL, a decentralized Reinforcement Learning (RL) framework for multi-UAV cable-suspended payload transport. Simulation results demonstrate that the learned policies can outperform classical decentralized controllers in terms of disturbance rejection and tracking precision, achieving an 80% recovery rate from harsh conditions compared to 44% for the baseline method. We also achieve successful zero-shot sim-to-real transfer and demonstrate that our policies are highly robust under harsh conditions, including wind, random external disturbances, and transitions between slack and taut cable dynamics. This work paves the way for autonomous, resilient UAV teams capable of executing complex payload missions in unstructured environments.

* This work has been submitted to IEEE for possible publication

SVN-ICP: Uncertainty Estimation of ICP-based LiDAR Odometry using Stein Variational Newton

Sep 09, 2025

Shiping Ma, Haoming Zhang, Marc Toussaint

Abstract:This letter introduces SVN-ICP, a novel Iterative Closest Point (ICP) algorithm with uncertainty estimation that leverages Stein Variational Newton (SVN) on manifold. Designed specifically for fusing LiDAR odometry in multisensor systems, the proposed method ensures accurate pose estimation and consistent noise parameter inference, even in LiDAR-degraded environments. By approximating the posterior distribution using particles within the Stein Variational Inference framework, SVN-ICP eliminates the need for explicit noise modeling or manual parameter tuning. To evaluate its effectiveness, we integrate SVN-ICP into a simple error-state Kalman filter alongside an IMU and test it across multiple datasets spanning diverse environments and robot types. Extensive experimental results demonstrate that our approach outperforms best-in-class methods on challenging scenarios while providing reliable uncertainty estimates.

Regrasp Maps for Sequential Manipulation Planning

Jul 16, 2025

Svetlana Levit, Marc Toussaint

Abstract:We consider manipulation problems in constrained and cluttered settings, which require several regrasps at unknown locations. We propose to inform an optimization-based task and motion planning (TAMP) solver with possible regrasp areas and grasp sequences to speed up the search. Our main idea is to use a state space abstraction, a regrasp map, capturing the combinations of available grasps in different parts of the configuration space, and allowing us to provide the solver with guesses for the mode switches and additional constraints for the object placements. By interleaving the creation of regrasp maps, their adaptation based on failed refinements, and solving TAMP (sub)problems, we are able to provide a robust search method for challenging regrasp manipulation problems.

Meta-Optimization and Program Search using Language Models for Task and Motion Planning

May 06, 2025

Denis Shcherba, Eckart Cobo-Briesewitz, Cornelius V. Braun, Marc Toussaint

Abstract:Intelligent interaction with the real world requires robotic agents to jointly reason over high-level plans and low-level controls. Task and motion planning (TAMP) addresses this by combining symbolic planning and continuous trajectory generation. Recently, foundation model approaches to TAMP have presented impressive results, including fast planning times and the execution of natural language instructions. Yet, the optimal interface between high-level planning and low-level motion generation remains an open question: prior approaches are limited by either too much abstraction (e.g., chaining simplified skill primitives) or a lack thereof (e.g., direct joint angle prediction). Our method introduces a novel technique employing a form of meta-optimization to address these issues by: (i) using program search over trajectory optimization problems as an interface between a foundation model and robot control, and (ii) leveraging a zero-order method to optimize numerical parameters in the foundation model output. Results on challenging object manipulation and drawing tasks confirm that our proposed method improves over prior TAMP approaches.

* 20 pages, 8 figures, under review for the 9th Annual Conference on Robot Learning (CoRL 2025)

Amortized Safe Active Learning for Real-Time Decision-Making: Pretrained Neural Policies from Simulated Nonparametric Functions

Jan 26, 2025

Cen-You Li, Marc Toussaint, Barbara Rakitsch, Christoph Zimmer

Abstract:Active Learning (AL) is a sequential learning approach aiming at selecting the most informative data for model training. In many systems, safety constraints appear during data evaluation, requiring the development of safe AL methods. Key challenges of AL are the repeated model training and acquisition optimization required for data selection, which become particularly restrictive under safety constraints. This repeated effort often creates a bottleneck, especially in physical systems requiring real-time decision-making. In this paper, we propose a novel amortized safe AL framework. By leveraging a pretrained neural network policy, our method eliminates the need for repeated model training and acquisition optimization, achieving substantial speed improvements while maintaining competitive learning outcomes and safety awareness. The policy is trained entirely on synthetic data utilizing a novel safe AL objective. The resulting policy is highly versatile and adapts to a wide range of systems, as we demonstrate in our experiments. Furthermore, our framework is modular and we empirically show that we also achieve superior performance for unconstrained time-sensitive AL tasks if we omit the safety requirement.

* Part of the content published earlier at arXiv:2407.17992

GSRM: Building Roadmaps for Query-Efficient and Near-Optimal Path Planning Using a Reaction Diffusion System

Oct 14, 2024

Christian Henkel, Marc Toussaint, Wolfgang Hönig

Abstract:Mobile robots frequently navigate on roadmaps, i.e., graphs where edges represent safe motions, in applications such as healthcare, hospitality, and warehouse automation. Often the environment is quasi-static, i.e., it is sufficient to construct a roadmap once and then use it for any future planning queries. Roadmaps are typically used with graph search algorithms to find feasible paths for the robots. Therefore, the roadmap should be well-connected, and graph searches should produce near-optimal solutions with short solution paths while remaining computationally efficient enough to execute queries quickly. We propose a new method to construct roadmaps based on the Gray-Scott reaction diffusion system and Delaunay triangulation. Our approach, GSRM, produces roadmaps with evenly distributed vertices and edges that are well-connected even in environments with challenging narrow passages. Empirically, we compare to classical roadmaps generated by 8-connected grids, probabilistic roadmaps (PRM, SPARS2), and optimized roadmap graphs (ORM). Our results show that GSRM consistently produces superior roadmaps that are well-connected, have high query efficiency, and result in short solution paths.

* 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2024)

Stein Variational Evolution Strategies

Oct 14, 2024

Cornelius V. Braun, Robert T. Lange, Marc Toussaint

Abstract:Stein Variational Gradient Descent (SVGD) is a highly efficient method to sample from an unnormalized probability distribution. However, the SVGD update relies on gradients of the log-density, which may not always be available. Existing gradient-free versions of SVGD make use of simple Monte Carlo approximations or gradients from surrogate distributions, both with limitations. To improve gradient-free Stein variational inference, we combine SVGD steps with evolution strategy (ES) updates. Our results demonstrate that the resulting algorithm generates high-quality samples from unnormalized target densities without requiring gradient information. Compared to prior gradient-free SVGD methods, we find that the integration of the ES update in SVGD significantly improves the performance on multiple challenging benchmark problems.

Amortized Active Learning for Nonparametric Functions

Jul 25, 2024

Cen-You Li, Marc Toussaint, Barbara Rakitsch, Christoph Zimmer

Abstract:Active learning (AL) is a sequential learning scheme aiming to select the most informative data. AL reduces data consumption and avoids the cost of labeling large amounts of data. However, AL trains the model and solves an acquisition optimization for each selection, which becomes expensive when the model training or acquisition optimization is challenging. In this paper, we focus on active nonparametric function learning, where the gold standard Gaussian process (GP) approaches suffer from cubic time complexity. We propose an amortized AL method, where new data are suggested by a neural network which is trained up-front without any real data (Figure 1). Our method avoids repeated model training and requires no acquisition optimization during the AL deployment. We (i) utilize GPs as function priors to construct an AL simulator, (ii) train an AL policy that can zero-shot generalize from simulation to real learning problems of nonparametric functions and (iii) achieve real-time data selection and learning performance comparable to time-consuming baseline methods.
