Florian Tramèr

Extracting Training Data from Document-Based VQA Models

Jul 11, 2024

Adversarial Search Engine Optimization for Large Language Models

Jun 26, 2024

Blind Baselines Beat Membership Inference Attacks for Foundation Models

Jun 23, 2024

AgentDojo: A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents

Jun 19, 2024

Dataset and Lessons Learned from the 2024 SaTML LLM Capture-the-Flag Competition

Jun 12, 2024

Evaluations of Machine Learning Privacy Defenses are Misleading

Apr 26, 2024

Competition Report: Finding Universal Jailbreak Backdoors in Aligned LLMs

Apr 22, 2024

Privacy Backdoors: Stealing Data with Corrupted Pretrained Models

Mar 30, 2024

Query-Based Adversarial Prompt Generation

Feb 19, 2024

Scalable Extraction of Training Data from (Production) Language Models

Nov 28, 2023