Florian Tramèr

Membership Inference Attacks Cannot Prove that a Model Was Trained On Your Data

Sep 29, 2024

An Adversarial Perspective on Machine Unlearning for AI Safety

Sep 26, 2024

Extracting Training Data from Document-Based VQA Models

Jul 11, 2024

Adversarial Search Engine Optimization for Large Language Models

Jun 26, 2024

Blind Baselines Beat Membership Inference Attacks for Foundation Models

Jun 23, 2024

AgentDojo: A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents

Jun 19, 2024

Dataset and Lessons Learned from the 2024 SaTML LLM Capture-the-Flag Competition

Jun 12, 2024

Evaluations of Machine Learning Privacy Defenses are Misleading

Apr 26, 2024

Competition Report: Finding Universal Jailbreak Backdoors in Aligned LLMs

Apr 22, 2024

Privacy Backdoors: Stealing Data with Corrupted Pretrained Models

Mar 30, 2024