Mohammad Gheshlaghi Azar

Radboud University

Neural Predictive Belief Representations

Nov 15, 2018
Via arXiv

The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning

Jun 19, 2018
Via arXiv

Observe and Look Further: Achieving Consistent Performance on Atari

May 29, 2018
Via arXiv

Noisy Networks for Exploration

Feb 15, 2018
Via arXiv

Minimax Regret Bounds for Reinforcement Learning

Jul 01, 2017
Via arXiv

Convex Relaxation Regression: Black-Box Optimization of Smooth Functions by Learning Their Convex Envelopes

Mar 03, 2016
Via arXiv

Online Stochastic Optimization under Correlated Bandit Feedback

May 19, 2014
Via arXiv

Sequential Transfer in Multi-armed Bandit with Finite Set of Models

Jul 25, 2013
Via arXiv

Regret Bounds for Reinforcement Learning with Policy Advice

Jul 17, 2013
Via arXiv

On the Sample Complexity of Reinforcement Learning with a Generative Model

Jun 27, 2012
Via arXiv