Dawn Song

Effective and Efficient Federated Tree Learning on Hybrid Data

Oct 18, 2023
Qinbin Li, Chulin Xie, Xiaojun Xu, Xiaoyuan Liu, Ce Zhang, Bo Li, Bingsheng He, Dawn Song

Representation Engineering: A Top-Down Approach to AI Transparency

Oct 10, 2023
Andy Zou, Long Phan, Sarah Chen, James Campbell, Phillip Guo, Richard Ren, Alexander Pan, Xuwang Yin, Mantas Mazeika, Ann-Kathrin Dombrowski, Shashwat Goel, Nathaniel Li, Michael J. Byun, Zifan Wang, Alex Mallen, Steven Basart, Sanmi Koyejo, Dawn Song, Matt Fredrikson, J. Zico Kolter, Dan Hendrycks

Agent Instructs Large Language Models to be General Zero-Shot Reasoners

Oct 05, 2023
Nicholas Crispino, Kyle Montgomery, Fankun Zeng, Dawn Song, Chenguang Wang

Identifying and Mitigating the Security Risks of Generative AI

Aug 28, 2023
Clark Barrett, Brad Boyd, Ellie Burzstein, Nicholas Carlini, Brad Chen, Jihye Choi, Amrita Roy Chowdhury, Mihai Christodorescu, Anupam Datta, Soheil Feizi, Kathleen Fisher, Tatsunori Hashimoto, Dan Hendrycks, Somesh Jha, Daniel Kang, Florian Kerschbaum, Eric Mitchell, John Mitchell, Zulfikar Ramzan, Khawaja Shams, Dawn Song, Ankur Taly, Diyi Yang

SoK: Privacy-Preserving Data Synthesis

Jul 05, 2023
Yuzheng Hu, Fan Wu, Qinbin Li, Yunhui Long, Gonzalo Munilla Garrido, Chang Ge, Bolin Ding, David Forsyth, Bo Li, Dawn Song

DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models

Jun 20, 2023
Boxin Wang, Weixin Chen, Hengzhi Pei, Chulin Xie, Mintong Kang, Chenhui Zhang, Chejian Xu, Zidi Xiong, Ritik Dutta, Rylan Schaeffer, Sang T. Truong, Simran Arora, Mantas Mazeika, Dan Hendrycks, Zinan Lin, Yu Cheng, Sanmi Koyejo, Dawn Song, Bo Li

The False Promise of Imitating Proprietary LLMs

May 25, 2023
Arnav Gudibande, Eric Wallace, Charlie Snell, Xinyang Geng, Hao Liu, Pieter Abbeel, Sergey Levine, Dawn Song

Blockchain Large Language Models

Apr 29, 2023
Yu Gai, Liyi Zhou, Kaihua Qin, Dawn Song, Arthur Gervais

TrojDiff: Trojan Attacks on Diffusion Models with Diverse Targets

Mar 10, 2023
Weixin Chen, Dawn Song, Bo Li
