Dawn Song

Evolving AI Collectives to Enhance Human Diversity and Enable Self-Regulation

Feb 19, 2024
Shiyang Lai, Yujin Potter, Junsol Kim, Richard Zhuang, Dawn Song, James Evans

C-RAG: Certified Generation Risks for Retrieval-Augmented Language Models

Feb 12, 2024
Mintong Kang, Nezihe Merve Gürel, Ning Yu, Dawn Song, Bo Li

GRATH: Gradual Self-Truthifying for Large Language Models

Jan 31, 2024
Weixin Chen, Dawn Song, Bo Li

TextGuard: Provable Defense against Backdoor Attacks on Text Classification

Nov 25, 2023
Hengzhi Pei, Jinyuan Jia, Wenbo Guo, Bo Li, Dawn Song

Managing AI Risks in an Era of Rapid Progress

Oct 26, 2023
Yoshua Bengio, Geoffrey Hinton, Andrew Yao, Dawn Song, Pieter Abbeel, Yuval Noah Harari, Ya-Qin Zhang, Lan Xue, Shai Shalev-Shwartz, Gillian Hadfield, Jeff Clune, Tegan Maharaj, Frank Hutter, Atılım Güneş Baydin, Sheila McIlraith, Qiqi Gao, Ashwin Acharya, David Krueger, Anca Dragan, Philip Torr, Stuart Russell, Daniel Kahneman, Jan Brauner, Sören Mindermann

Effective and Efficient Federated Tree Learning on Hybrid Data

Oct 18, 2023
Qinbin Li, Chulin Xie, Xiaojun Xu, Xiaoyuan Liu, Ce Zhang, Bo Li, Bingsheng He, Dawn Song

Representation Engineering: A Top-Down Approach to AI Transparency

Oct 10, 2023
Andy Zou, Long Phan, Sarah Chen, James Campbell, Phillip Guo, Richard Ren, Alexander Pan, Xuwang Yin, Mantas Mazeika, Ann-Kathrin Dombrowski, Shashwat Goel, Nathaniel Li, Michael J. Byun, Zifan Wang, Alex Mallen, Steven Basart, Sanmi Koyejo, Dawn Song, Matt Fredrikson, J. Zico Kolter, Dan Hendrycks

Agent Instructs Large Language Models to be General Zero-Shot Reasoners

Oct 05, 2023
Nicholas Crispino, Kyle Montgomery, Fankun Zeng, Dawn Song, Chenguang Wang
