Peter Henderson

FLawN-T5: An Empirical Examination of Effective Instruction-Tuning Data Mixtures for Legal Reasoning

Apr 02, 2024
Joel Niklaus, Lucia Zheng, Arya D. McCarthy, Christopher Hahn, Brian M. Rosen, Peter Henderson, Daniel E. Ho, Garrett Honke, Percy Liang, Christopher Manning

What's in Your "Safe" Data?: Identifying Benign Data that Breaks Safety

Apr 01, 2024
Luxi He, Mengzhou Xia, Peter Henderson

A Safe Harbor for AI Evaluation and Red Teaming

Mar 07, 2024
Shayne Longpre, Sayash Kapoor, Kevin Klyman, Ashwin Ramaswami, Rishi Bommasani, Borhane Blili-Hamelin, Yangsibo Huang, Aviya Skowron, Zheng-Xin Yong, Suhas Kotha, Yi Zeng, Weiyan Shi, Xianjun Yang, Reid Southen, Alexander Robey, Patrick Chao, Diyi Yang, Ruoxi Jia, Daniel Kang, Sandy Pentland, Arvind Narayanan, Percy Liang, Peter Henderson

On the Societal Impact of Open Foundation Models

Feb 27, 2024
Sayash Kapoor, Rishi Bommasani, Kevin Klyman, Shayne Longpre, Ashwin Ramaswami, Peter Cihon, Aspen Hopkins, Kevin Bankston, Stella Biderman, Miranda Bogen, Rumman Chowdhury, Alex Engler, Peter Henderson, Yacine Jernite, Seth Lazar, Stefano Maffulli, Alondra Nelson, Joelle Pineau, Aviya Skowron, Dawn Song, Victor Storchan, Daniel Zhang, Daniel E. Ho, Percy Liang, Arvind Narayanan

Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications

Feb 07, 2024
Boyi Wei, Kaixuan Huang, Yangsibo Huang, Tinghao Xie, Xiangyu Qi, Mengzhou Xia, Prateek Mittal, Mengdi Wang, Peter Henderson

Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To!

Oct 05, 2023
Xiangyu Qi, Yi Zeng, Tinghao Xie, Pin-Yu Chen, Ruoxi Jia, Prateek Mittal, Peter Henderson

LegalBench: A Collaboratively Built Benchmark for Measuring Legal Reasoning in Large Language Models

Aug 20, 2023
Neel Guha, Julian Nyarko, Daniel E. Ho, Christopher Ré, Adam Chilton, Aditya Narayana, Alex Chohlas-Wood, Austin Peters, Brandon Waldon, Daniel N. Rockmore, Diego Zambrano, Dmitry Talisman, Enam Hoque, Faiz Surani, Frank Fagan, Galit Sarfaty, Gregory M. Dickinson, Haggai Porat, Jason Hegland, Jessica Wu, Joe Nudell, Joel Niklaus, John Nay, Jonathan H. Choi, Kevin Tobia, Margaret Hagan, Megan Ma, Michael Livermore, Nikon Rasumov-Rahe, Nils Holzenberger, Noam Kolt, Peter Henderson, Sean Rehaag, Sharad Goel, Shang Gao, Spencer Williams, Sunny Gandhi, Tom Zur, Varun Iyer, Zehua Li

Where's the Liability in Harmful AI Speech?

Aug 16, 2023
Peter Henderson, Tatsunori Hashimoto, Mark Lemley

Cheaply Evaluating Inference Efficiency Metrics for Autoregressive Transformer APIs

May 03, 2023
Deepak Narayanan, Keshav Santhanam, Peter Henderson, Rishi Bommasani, Tony Lee, Percy Liang
