Erik Jones

Feedback Loops With Language Models Drive In-Context Reward Hacking

Feb 09, 2024
Alexander Pan, Erik Jones, Meena Jagadeesan, Jacob Steinhardt

Orca 2: Teaching Small Language Models How to Reason

Nov 21, 2023
Arindam Mitra, Luciano Del Corro, Shweti Mahajan, Andres Codas, Clarisse Simoes, Sahaj Agarwal, Xuxi Chen, Anastasia Razdaibiedina, Erik Jones, Kriti Aggarwal, Hamid Palangi, Guoqing Zheng, Corby Rosset, Hamed Khanpour, Ahmed Awadallah

Teaching Language Models to Hallucinate Less with Synthetic Tasks

Oct 10, 2023
Erik Jones, Hamid Palangi, Clarisse Simões, Varun Chandrasekaran, Subhabrata Mukherjee, Arindam Mitra, Ahmed Awadallah, Ece Kamar

Attention Satisfies: A Constraint-Satisfaction Lens on Factual Errors of Language Models

Sep 26, 2023
Mert Yuksekgonul, Varun Chandrasekaran, Erik Jones, Suriya Gunasekar, Ranjita Naik, Hamid Palangi, Ece Kamar, Besmira Nushi

Mass-Producing Failures of Multimodal Systems with Language Models

Jun 21, 2023
Shengbang Tong, Erik Jones, Jacob Steinhardt

Automatically Auditing Large Language Models via Discrete Optimization

Mar 08, 2023
Erik Jones, Anca Dragan, Aditi Raghunathan, Jacob Steinhardt

Capturing Failures of Large Language Models via Human Cognitive Biases

Feb 24, 2022
Erik Jones, Jacob Steinhardt

Selective Classification Can Magnify Disparities Across Groups

Oct 27, 2020
Erik Jones, Shiori Sagawa, Pang Wei Koh, Ananya Kumar, Percy Liang

Robust Encodings: A Framework for Combating Adversarial Typos

May 04, 2020
Erik Jones, Robin Jia, Aditi Raghunathan, Percy Liang
