Eric Mitchell

Tony

Identifying and Mitigating the Security Risks of Generative AI

Aug 28, 2023
Via arXiv

Direct Preference Optimization: Your Language Model is Secretly a Reward Model

May 29, 2023
Via arXiv

Meta-Learning Online Adaptation of Language Models

May 24, 2023
Via arXiv

Just Ask for Calibration: Strategies for Eliciting Calibrated Confidence Scores from Language Models Fine-Tuned with Human Feedback

May 24, 2023
Via arXiv

RECKONING: Reasoning through Dynamic Knowledge Encoding

May 23, 2023
Via arXiv

DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature

Jan 26, 2023
Via arXiv

Self-Destructing Models: Increasing the Costs of Harmful Dual Uses in Foundation Models

Nov 27, 2022
Via arXiv

Enhancing Self-Consistency and Performance of Pre-Trained Language Models through Natural Language Inference

Nov 21, 2022
Via arXiv

Memory-Based Model Editing at Scale

Jun 13, 2022
Via arXiv

Fast Model Editing at Scale

Oct 21, 2021
Via arXiv