Alex Beutel

Tony

Break it, Imitate it, Fix it: Robustness by Generating Human-Like Attacks

Oct 25, 2023
Via arXiv

Controlled Decoding from Language Models

Oct 25, 2023
Via arXiv

Improving Diversity of Demographic Representation in Large Language Models via Collective-Critiques and Self-Voting

Oct 25, 2023
Via arXiv

Improving Few-shot Generalization of Safety Classifiers via Data Augmented Parameter-Efficient Fine-Tuning

Oct 25, 2023
Via arXiv

Learning from Negative User Feedback and Measuring Responsiveness for Sequential Recommenders

Aug 23, 2023
Via arXiv

Towards A Scalable Solution for Improving Multi-Group Fairness in Compositional Classification

Jul 11, 2023
Via arXiv

Let's Do a Thought Experiment: Using Counterfactuals to Improve Moral Reasoning

Jun 25, 2023
Via arXiv

Improving Classifier Robustness through Active Generation of Pairwise Counterfactuals

May 22, 2023
Via arXiv

Towards Robust Prompts on Vision-Language Models

Apr 17, 2023
Via arXiv

What Are Effective Labels for Augmented Data? Improving Calibration and Robustness with AutoLabel

Feb 22, 2023
Via arXiv