Pierre-Luc Bacon

Control-Oriented Model-Based Reinforcement Learning with Implicit Differentiation

Jun 06, 2021
Evgenii Nikishin, Romina Abachi, Rishabh Agarwal, Pierre-Luc Bacon

An Information-Theoretic Perspective on Credit Assignment in Reinforcement Learning

Mar 10, 2021
Dilip Arumugam, Peter Henderson, Pierre-Luc Bacon

XLVIN: eXecuted Latent Value Iteration Nets

Oct 25, 2020
Andreea Deac, Petar Veličković, Ognjen Milinković, Pierre-Luc Bacon, Jian Tang, Mladen Nikolić

Graph neural induction of value iteration

Sep 26, 2020
Andreea Deac, Pierre-Luc Bacon, Jian Tang

TDprop: Does Jacobi Preconditioning Help Temporal Difference Learning?

Jul 06, 2020
Joshua Romoff, Peter Henderson, David Kanaa, Emmanuel Bengio, Ahmed Touati, Pierre-Luc Bacon, Joelle Pineau

Policy Evaluation Networks

Feb 26, 2020
Jean Harb, Tom Schaul, Doina Precup, Pierre-Luc Bacon

Options of Interest: Temporal Abstraction with Interest Functions

Jan 01, 2020
Khimya Khetarpal, Martin Klissarov, Maxime Chevalier-Boisvert, Pierre-Luc Bacon, Doina Precup

Entropy Regularization with Discounted Future State Distribution in Policy Gradient Methods

Dec 11, 2019
Riashat Islam, Raihan Seraj, Pierre-Luc Bacon, Doina Precup

All-Action Policy Gradient Methods: A Numerical Integration Approach

Oct 21, 2019
Benjamin Petit, Loren Amdahl-Culleton, Yao Liu, Jimmy Smith, Pierre-Luc Bacon

Understanding the Curse of Horizon in Off-Policy Evaluation via Conditional Importance Sampling

Oct 15, 2019
Yao Liu, Pierre-Luc Bacon, Emma Brunskill
