Milad Nasr

Cascading Adversarial Bias from Injection to Distillation in Language Models

May 30, 2025

Strong Membership Inference Attacks on Massive Datasets and (Moderately) Large Language Models

May 24, 2025

Lessons from Defending Gemini Against Indirect Prompt Injections

May 20, 2025

LLMs unlock new paths to monetizing exploits

May 16, 2025

Privacy Auditing of Large Language Models

Mar 09, 2025

AutoAdvExBench: Benchmarking autonomous exploitation of adversarial example defenses

Mar 03, 2025

Exploring and Mitigating Adversarial Manipulation of Voting-Based Leaderboards

Jan 13, 2025

On Evaluating the Durability of Safeguards for Open-Weight LLMs

Dec 10, 2024

SoK: Watermarking for AI-Generated Content

Nov 27, 2024

Remote Timing Attacks on Efficient Language Model Inference

Oct 22, 2024