Pierre-Luc Bacon

Towards Practical World Model-based Reinforcement Learning for Vision-Language-Action Models

Mar 21, 2026

What Makes Value Learning Efficient in Residual Reinforcement Learning?

Feb 11, 2026

Reward Redistribution for CVaR MDPs using a Bellman Operator on L-infinity

Feb 03, 2026

The Three Regimes of Offline-to-Online Reinforcement Learning

Oct 01, 2025

Stable Gradients for Stable Learning at Scale in Deep Reinforcement Learning

Jun 18, 2025

State Entropy Regularization for Robust Reinforcement Learning

Jun 08, 2025

Mol-MoE: Training Preference-Guided Routers for Molecule Generation

Feb 08, 2025

MaestroMotif: Skill Design from Artificial Intelligence Feedback

Dec 11, 2024

Exploring Scaling Trends in LLM Robustness

Jul 26, 2024

Decoupling regularization from the action space

Jun 10, 2024