Shimon Whiteson

Learning with Opponent-Learning Awareness

Sep 19, 2018
Jakob N. Foerster, Richard Y. Chen, Maruan Al-Shedivat, Shimon Whiteson, Pieter Abbeel, Igor Mordatch

DiCE: The Infinitely Differentiable Monte-Carlo Estimator

Sep 19, 2018
Jakob Foerster, Gregory Farquhar, Maruan Al-Shedivat, Tim Rocktäschel, Eric P. Xing, Shimon Whiteson

Fingerprint Policy Optimisation for Robust Reinforcement Learning

Sep 15, 2018
Supratik Paul, Michael A. Osborne, Shimon Whiteson

TACO: Learning Task Decomposition via Temporal Alignment for Control

Aug 10, 2018
Kyriacos Shiarlis, Markus Wulfmeier, Sasha Salter, Shimon Whiteson, Ingmar Posner

Deep Variational Reinforcement Learning for POMDPs

Jun 06, 2018
Maximilian Igl, Luisa Zintgraf, Tuan Anh Le, Frank Wood, Shimon Whiteson

QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning

Jun 06, 2018
Tabish Rashid, Mikayel Samvelyan, Christian Schroeder de Witt, Gregory Farquhar, Jakob Foerster, Shimon Whiteson

Fourier Policy Gradients

May 30, 2018
Matthew Fellows, Kamil Ciosek, Shimon Whiteson

Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning

May 21, 2018
Jakob Foerster, Nantas Nardelli, Gregory Farquhar, Triantafyllos Afouras, Philip H. S. Torr, Pushmeet Kohli, Shimon Whiteson

Expected Policy Gradients

Apr 13, 2018
Kamil Ciosek, Shimon Whiteson

TreeQN and ATreeC: Differentiable Tree-Structured Models for Deep Reinforcement Learning

Mar 08, 2018
Gregory Farquhar, Tim Rocktäschel, Maximilian Igl, Shimon Whiteson
