Dawn Song

University of California, Berkeley

Decoding Compressed Trust: Scrutinizing the Trustworthiness of Efficient LLMs Under Compression

Mar 18, 2024
Via arXiv

Benchmarking Zero-Shot Robustness of Multimodal Foundation Models: A Pilot Study

Mar 15, 2024
Via arXiv

On the Societal Impact of Open Foundation Models

Feb 27, 2024
Via arXiv

Evolving AI Collectives to Enhance Human Diversity and Enable Self-Regulation

Feb 19, 2024
Via arXiv

C-RAG: Certified Generation Risks for Retrieval-Augmented Language Models

Feb 12, 2024
Via arXiv

GRATH: Gradual Self-Truthifying for Large Language Models

Jan 31, 2024
Via arXiv

TextGuard: Provable Defense against Backdoor Attacks on Text Classification

Nov 25, 2023
Via arXiv

Managing AI Risks in an Era of Rapid Progress

Oct 26, 2023
Via arXiv

Effective and Efficient Federated Tree Learning on Hybrid Data

Oct 18, 2023
Via arXiv

Representation Engineering: A Top-Down Approach to AI Transparency

Oct 10, 2023
Via arXiv