Zidi Xiong

RigorLLM: Resilient Guardrails for Large Language Models against Undesired Content

Mar 19, 2024
Zhuowen Yuan, Zidi Xiong, Yi Zeng, Ning Yu, Ruoxi Jia, Dawn Song, Bo Li

BadChain: Backdoor Chain-of-Thought Prompting for Large Language Models

Jan 20, 2024
Zhen Xiang, Fengqing Jiang, Zidi Xiong, Bhaskar Ramasubramanian, Radha Poovendran, Bo Li

CBD: A Certified Backdoor Detector Based on Local Dominant Probability

Oct 26, 2023
Zhen Xiang, Zidi Xiong, Bo Li

DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models

Jun 20, 2023
Boxin Wang, Weixin Chen, Hengzhi Pei, Chulin Xie, Mintong Kang, Chenhui Zhang, Chejian Xu, Zidi Xiong, Ritik Dutta, Rylan Schaeffer, Sang T. Truong, Simran Arora, Mantas Mazeika, Dan Hendrycks, Zinan Lin, Yu Cheng, Sanmi Koyejo, Dawn Song, Bo Li

UMD: Unsupervised Model Detection for X2X Backdoor Attacks

Jun 02, 2023
Zhen Xiang, Zidi Xiong, Bo Li

Label-Smoothed Backdoor Attack

Feb 19, 2022
Minlong Peng, Zidi Xiong, Mingming Sun, Ping Li
