John Quan

Vision-Language Models as a Source of Rewards

Dec 14, 2023
Kate Baumli, Satinder Baveja, Feryal Behbahani, Harris Chan, Gheorghe Comanici, Sebastian Flennerhag, Maxime Gazeau, Kristian Holsheimer, Dan Horgan, Michael Laskin, Clare Lyle, Hussain Masoom, Kay McKinney, Volodymyr Mnih, Alexander Neitz, Fabio Pardo, Jack Parker-Holder, John Quan, Tim Rocktäschel, Himanshu Sahni, Tom Schaul, Yannick Schroecker, Stephen Spencer, Richie Steigerwald, Luyu Wang, Lei Zhang

The Phenomenon of Policy Churn

Jun 09, 2022
Tom Schaul, André Barreto, John Quan, Georg Ostrovski

Podracer architectures for scalable Reinforcement Learning

Apr 13, 2021
Matteo Hessel, Manuel Kroiss, Aidan Clark, Iurii Kemaev, John Quan, Thomas Keck, Fabio Viola, Hado van Hasselt

The Value-Improvement Path: Towards Better Representations for Reinforcement Learning

Jun 03, 2020
Will Dabney, André Barreto, Mark Rowland, Robert Dadashi, John Quan, Marc G. Bellemare, David Silver

General non-linear Bellman equations

Jul 08, 2019
Hado van Hasselt, John Quan, Matteo Hessel, Zhongwen Xu, Diana Borsa, André Barreto

Transfer in Deep Reinforcement Learning Using Successor Features and Generalised Policy Improvement

Jan 30, 2019
André Barreto, Diana Borsa, John Quan, Tom Schaul, David Silver, Matteo Hessel, Daniel Mankowitz, Augustin Žídek, Rémi Munos

Universal Successor Features Approximators

Dec 18, 2018
Diana Borsa, André Barreto, John Quan, Daniel Mankowitz, Rémi Munos, Hado van Hasselt, David Silver, Tom Schaul

Unicorn: Continual Learning with a Universal, Off-policy Agent

Jul 03, 2018
Daniel J. Mankowitz, Augustin Žídek, André Barreto, Dan Horgan, Matteo Hessel, John Quan, Junhyuk Oh, Hado van Hasselt, David Silver, Tom Schaul

Observe and Look Further: Achieving Consistent Performance on Atari

May 29, 2018
Tobias Pohlen, Bilal Piot, Todd Hester, Mohammad Gheshlaghi Azar, Dan Horgan, David Budden, Gabriel Barth-Maron, Hado van Hasselt, John Quan, Mel Večerík, Matteo Hessel, Rémi Munos, Olivier Pietquin

Distributed Prioritized Experience Replay

Mar 02, 2018
Dan Horgan, John Quan, David Budden, Gabriel Barth-Maron, Matteo Hessel, Hado van Hasselt, David Silver
