Hado van Hasselt

A Survey of Temporal Credit Assignment in Deep Reinforcement Learning

Dec 02, 2023
Eduardo Pignatelli, Johan Ferret, Matthieu Geist, Thomas Mesnard, Hado van Hasselt, Laura Toni

A Definition of Continual Reinforcement Learning

Jul 20, 2023
David Abel, André Barreto, Benjamin Van Roy, Doina Precup, Hado van Hasselt, Satinder Singh

On the Convergence of Bounded Agents

Jul 20, 2023
David Abel, André Barreto, Hado van Hasselt, Benjamin Van Roy, Doina Precup, Satinder Singh

Exploration via Epistemic Value Estimation

Mar 07, 2023
Simon Schmitt, John Shawe-Taylor, Hado van Hasselt

Learning How to Infer Partial MDPs for In-Context Adaptation and Exploration

Feb 08, 2023
Chentian Jiang, Nan Rosemary Ke, Hado van Hasselt

Optimistic Meta-Gradients

Jan 09, 2023
Sebastian Flennerhag, Tom Zahavy, Brendan O'Donoghue, Hado van Hasselt, András György, Satinder Singh

Human-level Atari 200x faster

Sep 15, 2022
Steven Kapturowski, Víctor Campos, Ray Jiang, Nemanja Rakićević, Hado van Hasselt, Charles Blundell, Adrià Puigdomènech Badia

Selective Credit Assignment

Feb 20, 2022
Veronica Chelu, Diana Borsa, Doina Precup, Hado van Hasselt

Chaining Value Functions for Off-Policy Learning

Feb 02, 2022
Simon Schmitt, John Shawe-Taylor, Hado van Hasselt
