Johan Ferret

WARP: On the Benefits of Weight Averaged Rewarded Policies

Jun 24, 2024

RecurrentGemma: Moving Past Transformers for Efficient Open Language Models

Apr 11, 2024

Gemma: Open Models Based on Gemini Research and Technology

Mar 13, 2024

Direct Language Model Alignment from Online AI Feedback

Feb 07, 2024

WARM: On the Benefits of Weight Averaged Reward Models

Jan 22, 2024

Gemini: A Family of Highly Capable Multimodal Models

Dec 19, 2023

A Survey of Temporal Credit Assignment in Deep Reinforcement Learning

Dec 02, 2023

Factually Consistent Summarization via Reinforcement Learning with Textual Entailment Feedback

May 31, 2023

Lazy-MDPs: Towards Interpretable Reinforcement Learning by Learning When to Act

Mar 16, 2022

More Efficient Exploration with Symbolic Priors on Action Sequence Equivalences

Nov 07, 2021