Jacob Eisenstein

Transforming and Combining Rewards for Aligning Large Language Models

Feb 01, 2024
Zihao Wang, Chirag Nagpal, Jonathan Berant, Jacob Eisenstein, Alex D'Amour, Sanmi Koyejo, Victor Veitch

Theoretical guarantees on the best-of-n alignment policy

Jan 03, 2024
Ahmad Beirami, Alekh Agarwal, Jonathan Berant, Alexander D'Amour, Jacob Eisenstein, Chirag Nagpal, Ananda Theertha Suresh

Helping or Herding? Reward Model Ensembles Mitigate but do not Eliminate Reward Hacking

Dec 21, 2023
Jacob Eisenstein, Chirag Nagpal, Alekh Agarwal, Ahmad Beirami, Alex D'Amour, DJ Dvijotham, Adam Fisch, Katherine Heller, Stephen Pfohl, Deepak Ramachandran, Peter Shaw, Jonathan Berant

Selectively Answering Ambiguous Questions

May 24, 2023
Jeremy R. Cole, Michael J. Q. Zhang, Daniel Gillick, Julian Martin Eisenschlos, Bhuwan Dhingra, Jacob Eisenstein

MD3: The Multi-Dialect Dataset of Dialogues

May 19, 2023
Jacob Eisenstein, Vinodkumar Prabhakaran, Clara Rivera, Dorottya Demszky, Devyani Sharma

Attributed Question Answering: Evaluation and Modeling for Attributed Large Language Models

Dec 15, 2022
Bernd Bohnet, Vinh Q. Tran, Pat Verga, Roee Aharoni, Daniel Andor, Livio Baldini Soares, Jacob Eisenstein, Kuzman Ganchev, Jonathan Herzig, Kai Hui, Tom Kwiatkowski, Ji Ma, Jianmo Ni, Tal Schuster, William W. Cohen, Michael Collins, Dipanjan Das, Donald Metzler, Slav Petrov, Kellie Webster

Dialect-robust Evaluation of Generated Text

Nov 02, 2022
Jiao Sun, Thibault Sellam, Elizabeth Clark, Tu Vu, Timothy Dozat, Dan Garrette, Aditya Siddhant, Jacob Eisenstein, Sebastian Gehrmann

Predicting Long-Term Citations from Short-Term Linguistic Influence

Oct 24, 2022
Sandeep Soni, David Bamman, Jacob Eisenstein

Honest Students from Untrusted Teachers: Learning an Interpretable Question-Answering Pipeline from a Pretrained Language Model

Oct 05, 2022
Jacob Eisenstein, Daniel Andor, Bernd Bohnet, Michael Collins, David Mimno

Uninformative Input Features and Counterfactual Invariance: Two Perspectives on Spurious Correlations in Natural Language

Apr 09, 2022
Jacob Eisenstein
