
Florian Tramèr

Membership Inference Attacks on Sequence Models

Jun 05, 2025
Via arXiv

RealMath: A Continuous Benchmark for Evaluating Language Models on Research-Level Mathematics

May 18, 2025
Via arXiv

LLMs unlock new paths to monetizing exploits

May 16, 2025
Via arXiv

The Jailbreak Tax: How Useful are Your Jailbreak Outputs?

Apr 14, 2025
Via arXiv

Defeating Prompt Injections by Design

Mar 24, 2025
Via arXiv

AutoAdvExBench: Benchmarking autonomous exploitation of adversarial example defenses

Mar 03, 2025
Via arXiv

Adversarial ML Problems Are Getting Harder to Solve and to Evaluate

Feb 04, 2025
Via arXiv

International AI Safety Report

Jan 29, 2025
Via arXiv

Consistency Checks for Language Model Forecasters

Dec 24, 2024
Via arXiv

Gradient Masking All-at-Once: Ensemble Everything Everywhere Is Not Robust

Nov 22, 2024
Via arXiv