Ahmad Beirami

Optimal Block-Level Draft Verification for Accelerating Speculative Decoding

Mar 15, 2024
Ziteng Sun, Jae Hun Ro, Ahmad Beirami, Ananda Theertha Suresh

Via arXiv

Gradient-Based Language Model Red Teaming

Jan 30, 2024
Nevan Wichers, Carson Denison, Ahmad Beirami

Via arXiv

Theoretical guarantees on the best-of-n alignment policy

Jan 03, 2024
Ahmad Beirami, Alekh Agarwal, Jonathan Berant, Alexander D'Amour, Jacob Eisenstein, Chirag Nagpal, Ananda Theertha Suresh

Via arXiv

Helping or Herding? Reward Model Ensembles Mitigate but do not Eliminate Reward Hacking

Dec 21, 2023
Jacob Eisenstein, Chirag Nagpal, Alekh Agarwal, Ahmad Beirami, Alex D'Amour, DJ Dvijotham, Adam Fisch, Katherine Heller, Stephen Pfohl, Deepak Ramachandran, Peter Shaw, Jonathan Berant

Via arXiv

Multi-Group Fairness Evaluation via Conditional Value-at-Risk Testing

Dec 06, 2023
Lucas Monteiro Paes, Ananda Theertha Suresh, Alex Beutel, Flavio P. Calmon, Ahmad Beirami

Via arXiv

FRAPPÉ: A Post-Processing Framework for Group Fairness Regularization

Dec 05, 2023
Alexandru Ţifrea, Preethi Lahoti, Ben Packer, Yoni Halpern, Ahmad Beirami, Flavien Prost

Via arXiv

Improving Robustness via Tilted Exponential Layer: A Communication-Theoretic Perspective

Nov 02, 2023
Bhagyashree Puranik, Ahmad Beirami, Yao Qin, Upamanyu Madhow

Via arXiv

Controlled Decoding from Language Models

Oct 25, 2023
Sidharth Mudgal, Jong Lee, Harish Ganapathy, YaGuang Li, Tao Wang, Yanping Huang, Zhifeng Chen, Heng-Tze Cheng, Michael Collins, Trevor Strohman, Jilin Chen, Alex Beutel, Ahmad Beirami

Via arXiv

Improving Few-shot Generalization of Safety Classifiers via Data Augmented Parameter-Efficient Fine-Tuning

Oct 25, 2023
Ananth Balashankar, Xiao Ma, Aradhana Sinha, Ahmad Beirami, Yao Qin, Jilin Chen, Alex Beutel

Via arXiv

Break it, Imitate it, Fix it: Robustness by Generating Human-Like Attacks

Oct 25, 2023
Aradhana Sinha, Ananth Balashankar, Ahmad Beirami, Thi Avrahami, Jilin Chen, Alex Beutel

Via arXiv