
Dawn Song

KnowHalu: Hallucination Detection via Multi-Form Knowledge Based Factual Checking

Apr 03, 2024
Jiawei Zhang, Chejian Xu, Yu Gai, Freddy Lecue, Dawn Song, Bo Li

Via arXiv

RigorLLM: Resilient Guardrails for Large Language Models against Undesired Content

Mar 19, 2024
Zhuowen Yuan, Zidi Xiong, Yi Zeng, Ning Yu, Ruoxi Jia, Dawn Song, Bo Li

Via arXiv

Decoding Compressed Trust: Scrutinizing the Trustworthiness of Efficient LLMs Under Compression

Mar 18, 2024
Junyuan Hong, Jinhao Duan, Chenhui Zhang, Zhangheng Li, Chulin Xie, Kelsey Lieberman, James Diffenderfer, Brian Bartoldson, Ajay Jaiswal, Kaidi Xu, Bhavya Kailkhura, Dan Hendrycks, Dawn Song, Zhangyang Wang, Bo Li

Via arXiv

Benchmarking Zero-Shot Robustness of Multimodal Foundation Models: A Pilot Study

Mar 15, 2024
Chenguang Wang, Ruoxi Jia, Xin Liu, Dawn Song

Via arXiv

On the Societal Impact of Open Foundation Models

Feb 27, 2024
Sayash Kapoor, Rishi Bommasani, Kevin Klyman, Shayne Longpre, Ashwin Ramaswami, Peter Cihon, Aspen Hopkins, Kevin Bankston, Stella Biderman, Miranda Bogen, Rumman Chowdhury, Alex Engler, Peter Henderson, Yacine Jernite, Seth Lazar, Stefano Maffulli, Alondra Nelson, Joelle Pineau, Aviya Skowron, Dawn Song, Victor Storchan, Daniel Zhang, Daniel E. Ho, Percy Liang, Arvind Narayanan

Via arXiv

Evolving AI Collectives to Enhance Human Diversity and Enable Self-Regulation

Feb 19, 2024
Shiyang Lai, Yujin Potter, Junsol Kim, Richard Zhuang, Dawn Song, James Evans

Via arXiv

C-RAG: Certified Generation Risks for Retrieval-Augmented Language Models

Feb 12, 2024
Mintong Kang, Nezihe Merve Gürel, Ning Yu, Dawn Song, Bo Li

Via arXiv

GRATH: Gradual Self-Truthifying for Large Language Models

Jan 31, 2024
Weixin Chen, Dawn Song, Bo Li

Via arXiv

TextGuard: Provable Defense against Backdoor Attacks on Text Classification

Nov 25, 2023
Hengzhi Pei, Jinyuan Jia, Wenbo Guo, Bo Li, Dawn Song

Via arXiv