Yangsibo Huang

A Safe Harbor for AI Evaluation and Red Teaming

Mar 07, 2024
Shayne Longpre, Sayash Kapoor, Kevin Klyman, Ashwin Ramaswami, Rishi Bommasani, Borhane Blili-Hamelin, Yangsibo Huang, Aviya Skowron, Zheng-Xin Yong, Suhas Kotha, Yi Zeng, Weiyan Shi, Xianjun Yang, Reid Southen, Alexander Robey, Patrick Chao, Diyi Yang, Ruoxi Jia, Daniel Kang, Sandy Pentland, Arvind Narayanan, Percy Liang, Peter Henderson

Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications

Feb 07, 2024
Boyi Wei, Kaixuan Huang, Yangsibo Huang, Tinghao Xie, Xiangyu Qi, Mengzhou Xia, Prateek Mittal, Mengdi Wang, Peter Henderson

Sparsity-Preserving Differentially Private Training of Large Embedding Models

Nov 14, 2023
Badih Ghazi, Yangsibo Huang, Pritish Kamath, Ravi Kumar, Pasin Manurangsi, Amer Sinha, Chiyuan Zhang

Detecting Pretraining Data from Large Language Models

Nov 03, 2023
Weijia Shi, Anirudh Ajith, Mengzhou Xia, Yangsibo Huang, Daogao Liu, Terra Blevins, Danqi Chen, Luke Zettlemoyer

Catastrophic Jailbreak of Open-source LLMs via Exploiting Generation

Oct 10, 2023
Yangsibo Huang, Samyak Gupta, Mengzhou Xia, Kai Li, Danqi Chen

Learning across Data Owners with Joint Differential Privacy

May 25, 2023
Yangsibo Huang, Haotian Jiang, Daogao Liu, Mohammad Mahdian, Jieming Mao, Vahab Mirrokni

Privacy Implications of Retrieval-Based Language Models

May 24, 2023
Yangsibo Huang, Samyak Gupta, Zexuan Zhong, Kai Li, Danqi Chen

Matching-based Data Valuation for Generative Model

Apr 21, 2023
Jiaxi Yang, Wenglong Deng, Benlin Liu, Yangsibo Huang, Xiaoxiao Li

$k$NN-Adapter: Efficient Domain Adaptation for Black-Box Language Models

Feb 21, 2023
Yangsibo Huang, Daogao Liu, Zexuan Zhong, Weijia Shi, Yin Tat Lee
