Michael Backes

Rapid Adoption, Hidden Risks: The Dual Impact of Large Language Model Customization

Feb 15, 2024
Rui Zhang, Hongwei Li, Rui Wen, Wenbo Jiang, Yuan Zhang, Michael Backes, Yun Shen, Yang Zhang

Comprehensive Assessment of Jailbreak Attacks Against LLMs

Feb 08, 2024
Junjie Chu, Yugeng Liu, Ziqing Yang, Xinyue Shen, Michael Backes, Yang Zhang

Conversation Reconstruction Attack Against GPT Models

Feb 05, 2024
Junjie Chu, Zeyang Sha, Michael Backes, Yang Zhang

TrustLLM: Trustworthiness in Large Language Models

Jan 25, 2024
Lichao Sun, Yue Huang, Haoran Wang, Siyuan Wu, Qihui Zhang, Chujie Gao, Yixin Huang, Wenhan Lyu, Yixuan Zhang, Xiner Li, Zhengliang Liu, Yixin Liu, Yijue Wang, Zhikun Zhang, Bhavya Kailkhura, Caiming Xiong, Chaowei Xiao, Chunyuan Li, Eric Xing, Furong Huang, Hao Liu, Heng Ji, Hongyi Wang, Huan Zhang, Huaxiu Yao, Manolis Kellis, Marinka Zitnik, Meng Jiang, Mohit Bansal, James Zou, Jian Pei, Jian Liu, Jianfeng Gao, Jiawei Han, Jieyu Zhao, Jiliang Tang, Jindong Wang, John Mitchell, Kai Shu, Kaidi Xu, Kai-Wei Chang, Lifang He, Lifu Huang, Michael Backes, Neil Zhenqiang Gong, Philip S. Yu, Pin-Yu Chen, Quanquan Gu, Ran Xu, Rex Ying, Shuiwang Ji, Suman Jana, Tianlong Chen, Tianming Liu, Tianyi Zhou, William Wang, Xiang Li, Xiangliang Zhang, Xiao Wang, Xing Xie, Xun Chen, Xuyu Wang, Yan Liu, Yanfang Ye, Yinzhi Cao, Yong Chen, Yue Zhao

Memorization in Self-Supervised Learning Improves Downstream Generalization

Jan 24, 2024
Wenhao Wang, Muhammad Ahmad Kaleem, Adam Dziedzic, Michael Backes, Nicolas Papernot, Franziska Boenisch

FAKEPCD: Fake Point Cloud Detection via Source Attribution

Dec 18, 2023
Yiting Qu, Zhikun Zhang, Yun Shen, Michael Backes, Yang Zhang

Generated Distributions Are All You Need for Membership Inference Attacks Against Generative Models

Oct 30, 2023
Minxing Zhang, Ning Yu, Rui Wen, Michael Backes, Yang Zhang

SecurityNet: Assessing Machine Learning Vulnerabilities on Public Models

Oct 19, 2023
Boyang Zhang, Zheng Li, Ziqing Yang, Xinlei He, Michael Backes, Mario Fritz, Yang Zhang

Revisiting Transferable Adversarial Image Examples: Attack Categorization, Evaluation Guidelines, and New Insights

Oct 18, 2023
Zhengyu Zhao, Hanwei Zhang, Renjue Li, Ronan Sicre, Laurent Amsaleg, Michael Backes, Qi Li, Chao Shen
