Mohammad Ghavamzadeh

INRIA Lille - Nord Europe

Proximal Gradient Temporal Difference Learning: Stable Reinforcement Learning with Polynomial Sample Complexity

Jun 06, 2020
Via arXiv

Automatic Policy Synthesis to Improve the Safety of Nonlinear Dynamical Systems

Jun 06, 2020
Via arXiv

Active Model Estimation in Markov Decision Processes

Mar 06, 2020
Via arXiv

Predictive Coding for Locally-Linear Control

Mar 02, 2020
Via arXiv

Policy-Aware Model Learning for Policy Gradient Methods

Feb 28, 2020
Via arXiv

Improved Algorithms for Conservative Exploration in Bandits

Feb 08, 2020
Via arXiv

Conservative Exploration in Reinforcement Learning

Feb 08, 2020
Via arXiv

Adaptive Sampling for Estimating Multiple Probability Distributions

Dec 07, 2019
Via arXiv

Multi-step Greedy Policies in Model-Free Deep Reinforcement Learning

Oct 14, 2019
Via arXiv

Benchmarking Batch Deep Reinforcement Learning Algorithms

Oct 03, 2019
Via arXiv