Florian Tramèr

Membership Inference Attacks Cannot Prove that a Model Was Trained On Your Data

Sep 29, 2024

An Adversarial Perspective on Machine Unlearning for AI Safety

Sep 26, 2024

Extracting Training Data from Document-Based VQA Models

Jul 11, 2024

Adversarial Search Engine Optimization for Large Language Models

Jun 26, 2024

Blind Baselines Beat Membership Inference Attacks for Foundation Models

Jun 23, 2024

AgentDojo: A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents

Jun 19, 2024

Dataset and Lessons Learned from the 2024 SaTML LLM Capture-the-Flag Competition

Jun 12, 2024

Evaluations of Machine Learning Privacy Defenses are Misleading

Apr 26, 2024

Competition Report: Finding Universal Jailbreak Backdoors in Aligned LLMs

Apr 22, 2024

Privacy Backdoors: Stealing Data with Corrupted Pretrained Models

Mar 30, 2024