Thomas Mesnard

RecurrentGemma: Moving Past Transformers for Efficient Open Language Models

Apr 11, 2024
Via arXiv

Gemma: Open Models Based on Gemini Research and Technology

Mar 13, 2024
Via arXiv

Direct Language Model Alignment from Online AI Feedback

Feb 07, 2024
Via arXiv

Nash Learning from Human Feedback

Dec 06, 2023
Via arXiv

A Survey of Temporal Credit Assignment in Deep Reinforcement Learning

Dec 02, 2023
Via arXiv

RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback

Sep 01, 2023
Via arXiv

Curiosity in hindsight

Nov 18, 2022
Via arXiv

Geometric Entropic Exploration

Jan 07, 2021
Via arXiv

Counterfactual Credit Assignment in Model-Free Reinforcement Learning

Nov 18, 2020
Via arXiv

Hindsight Credit Assignment

Dec 05, 2019
Via arXiv