Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Title:Verifiable RNN-Based Policies for POMDPs Under Temporal Logic Constraints

Feb 13, 2020

Steven Carr, Nils Jansen, Ufuk Topcu

Figure 1 for Verifiable RNN-Based Policies for POMDPs Under Temporal Logic Constraints

Figure 2 for Verifiable RNN-Based Policies for POMDPs Under Temporal Logic Constraints

Figure 3 for Verifiable RNN-Based Policies for POMDPs Under Temporal Logic Constraints

Figure 4 for Verifiable RNN-Based Policies for POMDPs Under Temporal Logic Constraints

Share this with someone who'll enjoy it:

Abstract:Recurrent neural networks (RNNs) have emerged as an effective representation of control policies in sequential decision-making problems. However, a major drawback in the application of RNN-based policies is the difficulty in providing formal guarantees on the satisfaction of behavioral specifications, e.g. safety and/or reachability. By integrating techniques from formal methods and machine learning, we propose an approach to automatically extract a finite-state controller (FSC) from an RNN, which, when composed with a finite-state system model, is amenable to existing formal verification tools. Specifically, we introduce an iterative modification to the so-called quantized bottleneck insertion technique to create an FSC as a randomized policy with memory. For the cases in which the resulting FSC fails to satisfy the specification, verification generates diagnostic information. We utilize this information to either adjust the amount of memory in the extracted FSC or perform focused retraining of the RNN. While generally applicable, we detail the resulting iterative procedure in the context of policy synthesis for partially observable Markov decision processes (POMDPs), which is known to be notoriously hard. The numerical experiments show that the proposed approach outperforms traditional POMDP synthesis methods by 3 orders of magnitude within 2% of optimal benchmark values.

* 8 pages, 5 figures, 1 table

View paper on

Share this with someone who'll enjoy it:

Title:Verifiable RNN-Based Policies for POMDPs Under Temporal Logic Constraints

Paper and Code