Pierre-Luc Bacon

Do Transformer World Models Give Better Policy Gradients?

Feb 11, 2024
Michel Ma, Tianwei Ni, Clement Gehring, Pierluca D'Oro, Pierre-Luc Bacon

Bridging State and History Representations: Understanding Self-Predictive RL

Jan 17, 2024
Tianwei Ni, Benjamin Eysenbach, Erfan Seyedsalehi, Michel Ma, Clement Gehring, Aditya Mahajan, Pierre-Luc Bacon

Maximum entropy GFlowNets with soft Q-learning

Dec 21, 2023
Sobhan Mohammadpour, Emmanuel Bengio, Emma Frejinger, Pierre-Luc Bacon

Course Correcting Koopman Representations

Oct 23, 2023
Mahan Fathi, Clement Gehring, Jonathan Pilault, David Kanaa, Pierre-Luc Bacon, Ross Goroshin

Motif: Intrinsic Motivation from Artificial Intelligence Feedback

Sep 29, 2023
Martin Klissarov, Pierluca D'Oro, Shagun Sodhani, Roberta Raileanu, Pierre-Luc Bacon, Pascal Vincent, Amy Zhang, Mikael Henaff

Policy Optimization in a Noisy Neighborhood: On Return Landscapes in Continuous Control

Sep 26, 2023
Nate Rahn, Pierluca D'Oro, Harley Wiltzer, Pierre-Luc Bacon, Marc G. Bellemare

When Do Transformers Shine in RL? Decoupling Memory from Credit Assignment

Jul 31, 2023
Tianwei Ni, Michel Ma, Benjamin Eysenbach, Pierre-Luc Bacon

Goal-conditioned GFlowNets for Controllable Multi-Objective Molecular Design

Jun 29, 2023
Julien Roy, Pierre-Luc Bacon, Christopher Pal, Emmanuel Bengio