Remi Munos

INRIA Lille

Offline Regularised Reinforcement Learning for Large Language Models Alignment

May 29, 2024

Super-Exponential Regret for UCT, AlphaGo and Variants

May 07, 2024

Human Alignment of Large Language Models through Online Preference Optimisation

Mar 13, 2024

Model-free Posterior Sampling via Learning Rate Randomization

Oct 27, 2023

Representations and Exploration for Deep Reinforcement Learning using Singular Value Decomposition

May 02, 2023

Fast Rates for Maximum Entropy Exploration

Mar 14, 2023

Optimistic Posterior Sampling for Reinforcement Learning with Few Samples and Tight Guarantees

Sep 28, 2022

Mastering the Game of Stratego with Model-Free Multiagent Reinforcement Learning

Jun 30, 2022

Game Plan: What AI can do for Football, and What Football can do for AI

Nov 18, 2020

Navigating the Landscape of Games

May 04, 2020