Florian Tramèr

Query-Based Adversarial Prompt Generation

Feb 19, 2024

Scalable Extraction of Training Data from (Production) Language Models

Nov 28, 2023

Universal Jailbreak Backdoors from Poisoned Human Feedback

Nov 24, 2023

Privacy Side Channels in Machine Learning Systems

Sep 11, 2023

Evaluating Superhuman Models with Consistency Checks

Jun 19, 2023

Evading Black-box Classifiers Without Breaking Eggs

Jun 05, 2023

Randomness in ML Defenses Helps Persistent Attackers and Hinders Evaluators

Feb 27, 2023

Poisoning Web-Scale Training Datasets is Practical

Feb 20, 2023

Tight Auditing of Differentially Private Machine Learning

Feb 15, 2023

Extracting Training Data from Diffusion Models

Jan 30, 2023