Muqsit Azeem

Resilient Strategies for Stochastic Systems: How Much Does It Take to Break a Winning Strategy?

Feb 27, 2026

Kush Grover, Markel Zubia, Debraj Chakraborty, Muqsit Azeem, Nils Jansen, Jan Kretinsky

Abstract: We study the problem of resilient strategies in the presence of uncertainty. Resilient strategies enable an agent to make decisions that are robust against disturbances. In particular, we are interested in those disturbances that are able to flip a decision made by the agent. Such a disturbance may, for instance, occur when the intended action of the agent cannot be executed due to a malfunction of an actuator in the environment. In this work, we introduce the concept of resilience in the stochastic setting and present a comprehensive set of fundamental problems. Specifically, we discuss such problems for Markov decision processes with reachability and safety objectives, which also smoothly extend to stochastic games. To account for the stochastic setting, we provide various ways of aggregating the amounts of disturbances that may have occurred, for instance, in expectation or in the worst case. Moreover, to reason about infinite disturbances, we use quantitative measures, like their frequency of occurrence.

* To appear in Proc. of the 25th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2026), Paphos, Cyprus, May 25-29, 2026

Explainable Finite-Memory Policies for Partially Observable Markov Decision Processes

Nov 20, 2024

Muqsit Azeem, Debraj Chakraborty, Sudeep Kanav, Jan Kretinsky

Abstract: Partially Observable Markov Decision Processes (POMDPs) are a fundamental framework for decision-making under uncertainty and partial observability. Since in general optimal policies may require infinite memory, they are hard to implement and often render most problems undecidable. Consequently, finite-memory policies are mostly considered instead. However, the algorithms for computing them are typically very complex, and so are the resulting policies. Facing the need for their explainability, we provide a representation of such policies, both (i) in an interpretable formalism and (ii) typically of smaller size, together yielding higher explainability. To that end, we combine models of Mealy machines and decision trees; the latter describing simple, stationary parts of the policies and the former describing how to switch among them. We design a translation for policies of the finite-state-controller (FSC) form from standard literature and show how our method smoothly generalizes to other variants of finite-memory policies. Further, we identify specific properties of recently used "attractor-based" policies, which allow us to construct yet simpler and smaller representations. Finally, we illustrate the higher explainability in a few case studies.

* Preprint -- Under Review

1-2-3-Go! Policy Synthesis for Parameterized Markov Decision Processes via Decision-Tree Learning and Generalization

Oct 23, 2024

Muqsit Azeem, Debraj Chakraborty, Sudeep Kanav, Jan Kretinsky, Mohammadsadegh Mohagheghi, Stefanie Mohr, Maximilian Weininger

Abstract: Despite the advances in probabilistic model checking, the scalability of the verification methods remains limited. In particular, the state space often becomes extremely large when instantiating parameterized Markov decision processes (MDPs) even with moderate values. Synthesizing policies for such huge MDPs is beyond the reach of available tools. We propose a learning-based approach to obtain a reasonable policy for such huge MDPs. The idea is to generalize optimal policies obtained by model-checking small instances to larger ones using decision-tree learning. Consequently, our method bypasses the need for explicit state-space exploration of large models, providing a practical solution to the state-space explosion problem. We demonstrate the efficacy of our approach by performing extensive experimentation on the relevant models from the quantitative verification benchmark set. The experimental results indicate that our policies perform well, even when the size of the model is orders of magnitude beyond the reach of state-of-the-art analysis tools.

* Preprint. Under review

Monitizer: Automating Design and Evaluation of Neural Network Monitors

May 16, 2024

Muqsit Azeem, Marta Grobelna, Sudeep Kanav, Jan Kretinsky, Stefanie Mohr, Sabine Rieder

Abstract: The behavior of neural networks (NNs) on previously unseen types of data (out-of-distribution or OOD) is typically unpredictable. This can be dangerous if the network's output is used for decision-making in a safety-critical system. Hence, detecting that an input is OOD is crucial for the safe application of the NN. Verification approaches do not scale to practical NNs, making runtime monitoring more appealing for practical use. While various monitors have been suggested recently, their optimization for a given problem, as well as comparison with each other and reproduction of results, remain challenging. We present a tool for users and developers of NN monitors. It allows for (i) application of various types of monitors from the literature to a given input NN, (ii) optimization of the monitor's hyperparameters, and (iii) experimental evaluation and comparison to other approaches. Besides, it facilitates the development of new monitoring approaches. We demonstrate the tool's usability on several use cases of different types of users as well as on a case study comparing different approaches from recent literature.

* Accepted at CAV 2024
