Nicolas Heess

Quinoa: a Q-function You Infer Normalized Over Actions

Nov 05, 2019
Jonas Degrave, Abbas Abdolmaleki, Jost Tobias Springenberg, Nicolas Heess, Martin Riedmiller

Approximate Inference in Discrete Distributions with Monte Carlo Tree Search and Value Functions

Oct 15, 2019
Lars Buesing, Nicolas Heess, Theophane Weber

Stabilizing Transformers for Reinforcement Learning

Oct 13, 2019
Emilio Parisotto, H. Francis Song, Jack W. Rae, Razvan Pascanu, Caglar Gulcehre, Siddhant M. Jayakumar, Max Jaderberg, Raphael Lopez Kaufman, Aidan Clark, Seb Noury, Matthew M. Botvinick, Nicolas Heess, Raia Hadsell

Imagined Value Gradients: Model-Based Policy Optimization with Transferable Latent Dynamics Models

Oct 09, 2019
Arunkumar Byravan, Jost Tobias Springenberg, Abbas Abdolmaleki, Roland Hafner, Michael Neunert, Thomas Lampe, Noah Siegel, Nicolas Heess, Martin Riedmiller

A Generalized Training Approach for Multiagent Learning

Sep 27, 2019
Paul Muller, Shayegan Omidshafiei, Mark Rowland, Karl Tuyls, Julien Perolat, Siqi Liu, Daniel Hennes, Luke Marris, Marc Lanctot, Edward Hughes, Zhe Wang, Guy Lever, Nicolas Heess, Thore Graepel, Remi Munos

V-MPO: On-Policy Maximum a Posteriori Policy Optimization for Discrete and Continuous Control

Sep 26, 2019
H. Francis Song, Abbas Abdolmaleki, Jost Tobias Springenberg, Aidan Clark, Hubert Soyer, Jack W. Rae, Seb Noury, Arun Ahuja, Siqi Liu, Dhruva Tirumala, Nicolas Heess, Dan Belov, Martin Riedmiller, Matthew M. Botvinick

Regularized Hierarchical Policies for Compositional Transfer in Robotics

Jun 27, 2019
Markus Wulfmeier, Abbas Abdolmaleki, Roland Hafner, Jost Tobias Springenberg, Michael Neunert, Tim Hertweck, Thomas Lampe, Noah Siegel, Nicolas Heess, Martin Riedmiller

Direct Policy Gradients: Direct Optimization of Policies in Discrete Action Spaces

Jun 14, 2019
Guy Lorberbom, Chris J. Maddison, Nicolas Heess, Tamir Hazan, Daniel Tarlow

Meta reinforcement learning as task inference

May 15, 2019
Jan Humplik, Alexandre Galashov, Leonard Hasenclever, Pedro A. Ortega, Yee Whye Teh, Nicolas Heess

Meta-learning of Sequential Strategies

May 08, 2019
Pedro A. Ortega, Jane X. Wang, Mark Rowland, Tim Genewein, Zeb Kurth-Nelson, Razvan Pascanu, Nicolas Heess, Joel Veness, Alex Pritzel, Pablo Sprechmann, Siddhant M. Jayakumar, Tom McGrath, Kevin Miller, Mohammad Azar, Ian Osband, Neil Rabinowitz, András György, Silvia Chiappa, Simon Osindero, Yee Whye Teh, Hado van Hasselt, Nando de Freitas, Matthew Botvinick, Shane Legg
