Milad Nasr

Rebellion: Noise-Robust Reasoning Training for Audio Reasoning Models

Nov 12, 2025

Evaluating the Robustness of a Production Malware Detection System to Transferable Adversarial Attacks

Oct 02, 2025

Cascading Adversarial Bias from Injection to Distillation in Language Models

May 30, 2025

Strong Membership Inference Attacks on Massive Datasets and (Moderately) Large Language Models

May 24, 2025

Lessons from Defending Gemini Against Indirect Prompt Injections

May 20, 2025

LLMs unlock new paths to monetizing exploits

May 16, 2025

Privacy Auditing of Large Language Models

Mar 09, 2025

AutoAdvExBench: Benchmarking autonomous exploitation of adversarial example defenses

Mar 03, 2025

Exploring and Mitigating Adversarial Manipulation of Voting-Based Leaderboards

Jan 13, 2025

On Evaluating the Durability of Safeguards for Open-Weight LLMs

Dec 10, 2024