Nir Ben Zrihem

Graying the black box: Understanding DQNs

Apr 24, 2017

Tom Zahavy, Nir Ben Zrihem, Shie Mannor

Abstract: In recent years there has been growing interest in using deep representations for reinforcement learning. In this paper, we present a methodology and tools to analyze Deep Q-networks (DQNs) in a non-blind manner. Moreover, we propose a new model, the Semi Aggregated Markov Decision Process (SAMDP), and an algorithm that learns it automatically. The SAMDP model allows us to identify spatio-temporal abstractions directly from features and may be used as a sub-goal detector in future work. Using our tools, we reveal that the features learned by DQNs aggregate the state space in a hierarchical fashion, which explains their success. Moreover, we are able to understand and describe the policies learned by DQNs for three different Atari 2600 games and suggest ways to interpret, debug and optimize deep neural networks in reinforcement learning.

Visualizing Dynamics: from t-SNE to SEMI-MDPs

Jun 22, 2016

Nir Ben Zrihem, Tom Zahavy, Shie Mannor

Abstract: Deep Reinforcement Learning (DRL) is a trending field of research, showing great promise in many challenging problems such as playing Atari, solving Go and controlling robots. While DRL agents perform well in practice, we still lack the tools to analyze their performance and visualize the temporal abstractions that they learn. In this paper, we present a novel method that automatically discovers an internal Semi-Markov Decision Process (SMDP) model in the Deep Q-Network's (DQN) learned representation. We suggest a novel visualization method that represents the SMDP model as a directed graph and visualizes it on top of a t-SNE map. We show how we can interpret the agent's policy and give evidence for the hierarchical state aggregation that DQNs learn automatically. Our algorithm is fully automatic, does not require any domain-specific knowledge, and is evaluated by a novel likelihood-based evaluation criterion.

* Presented at 2016 ICML Workshop on Human Interpretability in Machine Learning (WHI 2016), New York, NY
