Matt Fredrikson

The Trojan in the Vocabulary: Stealthy Sabotage of LLM Composition

Dec 31, 2025

Comparing AI Agents to Cybersecurity Professionals in Real-World Penetration Testing

Dec 10, 2025

Evaluating Language Model Reasoning about Confidential Information

Aug 27, 2025

Security Challenges in AI Agent Deployment: Insights from a Large Scale Public Competition

Jul 28, 2025

Transferable Adversarial Attacks on Black-Box Vision-Language Models

May 02, 2025

Is Your Text-to-Image Model Robust to Caption Noise?

Dec 27, 2024

AgentHarm: A Benchmark for Measuring Harmfulness of LLM Agents

Oct 11, 2024

Improving Alignment and Robustness with Circuit Breakers

Jun 10, 2024

Sales Whisperer: A Human-Inconspicuous Attack on LLM Brand Recommendations

Jun 07, 2024

Improving Alignment and Robustness with Short Circuiting

Jun 06, 2024