Abstract: Partially observable Markov decision processes (POMDPs) model specific environments in sequential decision-making under uncertainty. Critically, optimal policies for POMDPs may not be robust against perturbations in the environment. Hidden-model POMDPs (HM-POMDPs) capture sets of different environment models, that is, POMDPs with a shared action and observation space. The intuition is that the true model is hidden among a set of potential models, and it is unknown which of them the agent will face at execution time. A policy is robust for a given HM-POMDP if it achieves sufficient performance for each of its POMDPs. We compute such robust policies by combining two orthogonal techniques: (1) a deductive formal verification technique that supports tractable robust policy evaluation by computing a worst-case POMDP within the HM-POMDP, and (2) subgradient ascent to optimize the candidate policy for a worst-case POMDP. The empirical evaluation shows that, compared to various baselines, our approach (1) produces policies that are more robust and generalize better to unseen POMDPs, and (2) scales to HM-POMDPs consisting of over a hundred thousand environments.
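To make the two-step loop above concrete, here is a minimal Python sketch of the maximin iteration under toy assumptions: `value` and `value_grad` are hypothetical smooth surrogates for evaluating a parameterized policy on one environment, standing in for the paper's verification-based robust evaluation, and the worst-case environment is found by exhaustive search over a finite set.

```python
import numpy as np

# Hypothetical stand-in for the HM-POMDP: a finite set of environment
# parameter vectors; the true setting would be a set of POMDPs.
rng = np.random.default_rng(0)
envs = [rng.normal(size=4) for _ in range(100)]

def value(theta, env):
    # Hypothetical smooth surrogate for the policy value in one environment.
    return -np.sum((theta - env) ** 2)

def value_grad(theta, env):
    return -2.0 * (theta - env)

theta = np.zeros(4)  # candidate policy parameters
for step in range(200):
    # (1) Robust evaluation: pick a worst-case environment for the candidate.
    worst = min(envs, key=lambda e: value(theta, e))
    # (2) Subgradient ascent on the worst-case value.
    theta += 0.05 * value_grad(theta, worst)

print("worst-case value:", min(value(theta, e) for e in envs))
```

The design choice mirrored here is the alternation: the inner minimization certifies robustness of the current candidate, while the outer ascent step only ever optimizes against the single worst case found, keeping each iteration cheap.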
Abstract: Multi-agent planning under stochastic dynamics is usually formalised using decentralized (partially observable) Markov decision processes (Dec-(PO)MDPs) and reachability or expected-reward specifications. In this paper, we propose a different approach: we use an MDP describing how a single agent operates in an environment and probabilistic hyperproperties to capture desired temporal objectives for a set of decentralized agents operating in the environment. We extend existing approaches for model checking probabilistic hyperproperties to handle temporal formulae relating paths of different agents, which requires the self-composition of multiple MDPs. Using several case studies, we demonstrate that our approach provides a flexible and expressive framework that broadens the specification capabilities relative to existing planning techniques. Additionally, we establish a close connection between a subclass of probabilistic hyperproperties and planning for a particular type of Dec-MDPs, and show that both problems are undecidable. This lays the groundwork for the use of existing decentralized planning tools in the field of probabilistic hyperproperty verification.
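A minimal sketch of the self-composition construction central to the approach above, assuming a hypothetical three-state MDP `P`: composing the MDP with itself turns a two-agent hyperproperty ("both agents reach the goal") into an ordinary reachability query on the product, solvable by standard value iteration.

```python
import itertools

# A tiny hypothetical MDP: states 0..2, actions 0..1; P[s][a] maps
# successor state -> probability.
P = {
    0: {0: {0: 0.5, 1: 0.5}, 1: {2: 1.0}},
    1: {0: {1: 1.0}, 1: {1: 0.5, 2: 0.5}},
    2: {0: {2: 1.0}, 1: {2: 1.0}},
}
goal = 2

def compose(P):
    # Self-composition: states and actions become pairs; the two copies,
    # one per agent, step independently under the product distribution.
    PP = {}
    for s1, s2 in itertools.product(P, P):
        PP[(s1, s2)] = {}
        for a1, a2 in itertools.product(P[s1], P[s2]):
            dist = {}
            for t1, p1 in P[s1][a1].items():
                for t2, p2 in P[s2][a2].items():
                    dist[(t1, t2)] = dist.get((t1, t2), 0.0) + p1 * p2
            PP[(s1, s2)][(a1, a2)] = dist
    return PP

# The hyperproperty "both agents eventually reach the goal" is now plain
# reachability of (goal, goal); compute the max probability by value iteration.
PP = compose(P)
V = {s: 1.0 if s == (goal, goal) else 0.0 for s in PP}
for _ in range(100):
    V = {s: 1.0 if s == (goal, goal)
         else max(sum(p * V[t] for t, p in d.items()) for d in PP[s].values())
         for s in PP}
print("P_max[both reach goal from (0, 0)]:", V[(0, 0)])
```

The blow-up visible here (pairs of states, pairs of actions) is exactly why relating paths of k agents requires a k-fold self-composition, and hints at the hardness results the abstract mentions.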
Abstract: Probabilistic programs are key to dealing with uncertainty in, e.g., controller synthesis. They are typically small but intricate. Their development is complex and error-prone, requiring quantitative reasoning over a myriad of alternative designs. To mitigate this complexity, we adopt counterexample-guided inductive synthesis (CEGIS) to automatically synthesise finite-state probabilistic programs. Our approach leverages efficient model checking, modern SMT solving, and counterexample generation at the program level. Experiments on practically relevant case studies show that design spaces with millions of candidate designs can be fully explored using a few thousand verification queries.
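A minimal sketch of the CEGIS loop described above, with hypothetical stand-ins: the design space is eight bit-mask "programs" and `satisfies` replaces a real probabilistic model-checking query. The essential mechanism is that a single counterexample prunes every candidate exhibiting the same violation, not just the one that was checked.

```python
# Hypothetical toy design space: each "program" is a 3-bit mask d,
# with semantics interpret(d, x) = x & d.
SPEC_INPUTS = list(range(8))

def interpret(design, x):
    return x & design

def satisfies(design, x):
    # Hypothetical specification: the output must be even for every input.
    return interpret(design, x) % 2 == 0

design_space = list(range(7, -1, -1))  # candidate designs
while design_space:
    candidate = design_space[0]
    # Verification query: search for an input witnessing a spec violation.
    cex = next((x for x in SPEC_INPUTS if not satisfies(candidate, x)), None)
    if cex is None:
        print("synthesised design:", candidate)
        break
    # Counterexample-guided pruning: discard ALL designs failing on this input.
    design_space = [d for d in design_space if satisfies(d, cex)]
else:
    print("no design satisfies the specification")
```

Running this, the first candidate (7) fails on input 1, and that one counterexample eliminates all four odd masks at once; this pruning effect is what lets a few thousand verification queries cover millions of candidates in the paper's experiments.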