Jacob Eisenstein

Robust Preference Optimization through Reward Model Distillation

May 29, 2024
Via arXiv

Reuse Your Rewards: Reward Model Transfer for Zero-Shot Cross-Lingual Alignment

Apr 18, 2024
Via arXiv

Transforming and Combining Rewards for Aligning Large Language Models

Feb 01, 2024
Via arXiv

Theoretical guarantees on the best-of-n alignment policy

Jan 03, 2024
Via arXiv

Helping or Herding? Reward Model Ensembles Mitigate but do not Eliminate Reward Hacking

Dec 21, 2023
Via arXiv

Selectively Answering Ambiguous Questions

May 24, 2023
Via arXiv

MD3: The Multi-Dialect Dataset of Dialogues

May 19, 2023
Via arXiv

Attributed Question Answering: Evaluation and Modeling for Attributed Large Language Models

Dec 15, 2022
Via arXiv

Dialect-robust Evaluation of Generated Text

Nov 02, 2022
Via arXiv

Predicting Long-Term Citations from Short-Term Linguistic Influence

Oct 24, 2022
Via arXiv