Robert Gieselmann

Efficient Test-time Inference for Generative Planning Models

May 30, 2026

Robert Gieselmann, Mihai Samson, Federico Pecora, Jeremy L. Wyatt

Abstract: Generative models have emerged as a powerful paradigm for AI planning, yet their performance remains constrained by the training data distribution. One approach is to improve generated solutions during inference by scaling test-time compute. A more efficient alternative is to optimize the inference process itself. In this paper, we show that a modified version of classical Open-Closed List (OCL) search provides such an efficient inference procedure. Our algorithm combines two learned components: a generative model that performs fast rollouts from intermediate states and a heuristic model that prioritizes among candidate reasoning paths. Key contributions include novel exploration control mechanisms and the integration of learned models within the OCL framework. Across multiple combinatorial planning domains, our approach outperforms both neurosymbolic search baselines and classical solvers in computational efficiency and solution quality.
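To make the open/closed-list idea in the abstract above concrete, here is a minimal sketch of a greedy best-first search that keeps an open list (a priority queue ordered by a heuristic) and a closed list (a set of expanded states), and optionally tries a rollout from each expanded state. It is not the authors' algorithm: the function names, the `rollout` hook, and the toy integer domain are all assumptions made for illustration, and the learned heuristic and generative models are replaced by plain Python functions.

```python
import heapq
from typing import Callable, Hashable, Iterable, List, Optional

State = Hashable


def best_first_search(
    start: State,
    is_goal: Callable[[State], bool],
    successors: Callable[[State], Iterable[State]],
    heuristic: Callable[[State], float],
    rollout: Optional[Callable[[State], List[State]]] = None,
    max_expansions: int = 10_000,
) -> Optional[List[State]]:
    """Greedy best-first search with an open list and a closed list.

    `heuristic` stands in for a learned heuristic model and `rollout` for a
    generative model proposing a continuation; both are hypothetical hooks.
    """
    # Heap entries are (priority, tie_breaker, state, path_to_state).
    open_list = [(heuristic(start), 0, start, [start])]
    closed = set()
    tie_breaker = 1
    expansions = 0

    while open_list and expansions < max_expansions:
        _, _, state, path = heapq.heappop(open_list)
        if state in closed:
            continue  # already expanded via another path
        closed.add(state)
        expansions += 1

        if is_goal(state):
            return path

        # Optional shortcut: accept a rollout that ends in a goal state.
        # A real system would also have to validate every rollout step.
        if rollout is not None:
            tail = rollout(state)
            if tail and is_goal(tail[-1]):
                return path + tail

        for nxt in successors(state):
            if nxt not in closed:
                heapq.heappush(
                    open_list, (heuristic(nxt), tie_breaker, nxt, path + [nxt])
                )
                tie_breaker += 1

    return None  # open list exhausted or expansion budget used up


# Toy domain (assumption): states are integers, actions add 1 or 2, goal is 7.
GOAL = 7


def toy_successors(s: int) -> List[int]:
    return [s + 1, s + 2]


def toy_heuristic(s: int) -> float:
    return abs(GOAL - s)


def toy_rollout(s: int) -> List[int]:
    # Stand-in for a generative model: always proposes three +1 steps.
    return [s + 1, s + 2, s + 3]


plan_search_only = best_first_search(0, lambda s: s == GOAL, toy_successors, toy_heuristic)
plan_with_rollout = best_first_search(
    0, lambda s: s == GOAL, toy_successors, toy_heuristic, rollout=toy_rollout
)
print(plan_search_only, plan_with_rollout)
```

In this sketch the rollout only ever shortens the search when it happens to reach the goal; how rollouts and heuristic priorities should interact, and how exploration is controlled, is exactly what the paper addresses and is not captured here.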

Self-Improvement for Fast, High-Quality Plan Generation

May 05, 2026

Robert Gieselmann, Henrike von Huelsen, Mihai Samson, Marie-Christine Meyer, Dariusz Piotrowski, Oleksandr Radomskyi, Justin Okamoto, Turan Gojayev, Michael Painter, Gavin Brown(+2 more)

Abstract: Generative models trained on synthetic plan data are a promising approach to generalized planning. Recent work has focused on finding any valid plan, rather than a high-quality solution. We address the challenge of producing high-quality plans, a computationally hard problem, in sub-exponential time. First, we demonstrate that, given optimal data, a decoder-only transformer can generate high-quality plans for unseen problem instances. Second, we show how to self-improve an initial model trained on sub-optimal data. Each round of self-improvement combines multiple model calls with graph search to generate improved plans, which are then used for model fine-tuning. An experimental study on four domains (Blocksworld, Logistics, Labyrinth, and Sokoban) shows on average a 30% reduction in plan length over the source symbolic planner, with over 80% of plans being optimal where the optimum is known. Plan quality is further improved by inference-time search. The model's latency scales sub-exponentially, in contrast to the satisficing and optimal symbolic planners to which we compare. Together, these results suggest that self-improvement with generative models offers a scalable approach for high-quality plan generation.

* Accepted at ICAPS 2026

Fast-dRRT*: Efficient Multi-Robot Motion Planning for Automated Industrial Manufacturing

Sep 19, 2023

Andrey Solano, Arne Sieverling, Robert Gieselmann, Andreas Orthey

Abstract: We present Fast-dRRT*, a sampling-based multi-robot planner for real-time industrial automation scenarios. Fast-dRRT* builds upon the discrete rapidly-exploring random tree (dRRT*) planner and extends it with pre-computed swept volumes for efficient collision detection, deadlock avoidance for partial multi-robot problems, and a simplified rewiring strategy. We evaluate Fast-dRRT* on five challenging multi-robot scenarios using two to four industrial robot arms from various manufacturers. The scenarios comprise situations involving deadlocks, narrow passages, and close-proximity tasks. The results are compared against dRRT* and show that Fast-dRRT* outperforms dRRT* by up to 94% in terms of finding solutions within given time limits, while sacrificing only up to 35% on initial solution cost. Furthermore, Fast-dRRT* demonstrates resilience against noise in target configurations and is able to solve challenging welding and pick-and-place tasks with reduced computational time. This makes Fast-dRRT* a promising option for real-time motion planning in industrial automation.

* 7 pages, 6 figures, submitted to ICRA 2024

Experience-Based Heuristic Search: Robust Motion Planning with Deep Q-Learning

Feb 05, 2021

Julian Bernhard, Robert Gieselmann, Klemens Esterle, Alois Knoll

Abstract: Interaction-aware planning for autonomous driving requires an exploration of a combinatorial solution space when using conventional search- or optimization-based motion planners. With Deep Reinforcement Learning, optimal driving strategies can also be derived for higher-dimensional problems. However, these methods guarantee optimality of the resulting policy only in a statistical sense, which impedes their usage in safety-critical systems, such as autonomous vehicles. Thus, we propose the Experience-Based-Heuristic-Search algorithm, which overcomes the statistical failure rate of a deep-reinforcement-learning-based planner and still benefits computationally from the pre-learned optimal policy. Specifically, we show how experiences in the form of a Deep Q-Network can be integrated as a heuristic into a heuristic search algorithm. We benchmark our algorithm in the field of path planning in semi-structured valet parking scenarios. There, we analyze the accuracy of such estimates and demonstrate the computational advantages and robustness of our method. Our method may encourage further investigation of the applicability of reinforcement-learning-based planning in the field of self-driving vehicles.

* published at IEEE IV 2018
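
One simple way to read the "Deep Q-Network as a heuristic" idea in the abstract above is sketched below. It assumes an undiscounted setting in which every reward is a negative step cost, so that the best Q-value at a state approximates the negative cost-to-go. This is an illustrative assumption, not the construction used in the paper; the function name `q_heuristic` and the dictionary standing in for a trained network are hypothetical.

```python
from typing import Dict, Hashable, Iterable, Tuple

State = Hashable
Action = Hashable


def q_heuristic(
    q_values: Dict[Tuple[State, Action], float],
    state: State,
    actions: Iterable[Action],
) -> float:
    """Estimate cost-to-go as -max_a Q(state, a).

    Assumes rewards are negative step costs, so Q-values are <= 0 for
    well-trained estimates. Unknown (state, action) pairs default to 0.0,
    which yields an optimistic heuristic of 0 for unseen states.
    """
    best_q = max((q_values.get((state, a), 0.0) for a in actions), default=0.0)
    # Clamp at zero so a noisy positive Q-value never gives a negative cost.
    return max(0.0, -best_q)


# Toy Q-table (assumption): a lookup table in place of a Deep Q-Network.
toy_q = {
    ("s0", "left"): -3.0,
    ("s0", "right"): -1.5,
}

h_known = q_heuristic(toy_q, "s0", ["left", "right"])
h_unseen = q_heuristic(toy_q, "s9", ["left", "right"])
print(h_known, h_unseen)
```

A heuristic obtained this way is only as accurate as the learned Q-values, which is why the abstract stresses embedding it in a search procedure rather than following the learned policy directly.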
