Neil Perry

How Vulnerable Are AI Agents to Indirect Prompt Injections? Insights from a Large-Scale Public Competition

Mar 16, 2026
Via arXiv

Comparing AI Agents to Cybersecurity Professionals in Real-World Penetration Testing

Dec 10, 2025
Via arXiv

Cybench: A Framework for Evaluating Cybersecurity Capabilities and Risk of Language Models

Aug 15, 2024
Via arXiv