Florian Strub

TSP, IP Paris, SAMOVAR

Averaging log-likelihoods in direct alignment

Jun 27, 2024
Via arXiv

Contrastive Policy Gradient: Aligning LLMs on sequence-level scores in a supervised-friendly fashion

Jun 27, 2024
Via arXiv

Countering Reward Over-optimization in LLM with Demonstration-Guided Reinforcement Learning

Apr 30, 2024
Via arXiv

Language Evolution with Deep Learning

Mar 18, 2024
Via arXiv

Language Model Alignment with Elastic Reset

Dec 06, 2023
Via arXiv

The Edge of Orthogonality: A Simple View of What Makes BYOL Tick

Feb 09, 2023
Via arXiv

SemPPL: Predicting pseudo-labels for better contrastive representations

Jan 12, 2023
Via arXiv

Over-communicate no more: Situated RL agents learn concise communication protocols

Nov 02, 2022
Via arXiv

Emergent Communication: Generalization and Overfitting in Lewis Games

Sep 30, 2022
Via arXiv

Developing, Evaluating and Scaling Learning Agents in Multi-Agent Environments

Sep 22, 2022
Via arXiv