Pierre-Luc Bacon

The Three Regimes of Offline-to-Online Reinforcement Learning

Oct 01, 2025

Stable Gradients for Stable Learning at Scale in Deep Reinforcement Learning

Jun 18, 2025

State Entropy Regularization for Robust Reinforcement Learning

Jun 08, 2025

Mol-MoE: Training Preference-Guided Routers for Molecule Generation

Feb 08, 2025

MaestroMotif: Skill Design from Artificial Intelligence Feedback

Dec 11, 2024

Exploring Scaling Trends in LLM Robustness

Jul 26, 2024

Decoupling regularization from the action space

Jun 10, 2024

Generative Active Learning for the Search of Small-molecule Protein Binders

May 02, 2024

Maxwell's Demon at Work: Efficient Pruning by Leveraging Saturation of Neurons

Mar 12, 2024

Do Transformer World Models Give Better Policy Gradients?

Feb 11, 2024