Florian Tramèr

Dataset and Lessons Learned from the 2024 SaTML LLM Capture-the-Flag Competition

Jun 12, 2024
Via arXiv

Evaluations of Machine Learning Privacy Defenses are Misleading

Apr 26, 2024
Via arXiv

Competition Report: Finding Universal Jailbreak Backdoors in Aligned LLMs

Apr 22, 2024
Via arXiv

Privacy Backdoors: Stealing Data with Corrupted Pretrained Models

Mar 30, 2024
Via arXiv

Query-Based Adversarial Prompt Generation

Feb 19, 2024
Via arXiv

Scalable Extraction of Training Data from (Production) Language Models

Nov 28, 2023
Via arXiv

Universal Jailbreak Backdoors from Poisoned Human Feedback

Nov 24, 2023
Via arXiv

Privacy Side Channels in Machine Learning Systems

Sep 11, 2023
Via arXiv

Evaluating Superhuman Models with Consistency Checks

Jun 19, 2023
Via arXiv

Evading Black-box Classifiers Without Breaking Eggs

Jun 05, 2023
Via arXiv