John Wilkinson

Measuring AI Agents' Progress on Multi-Step Cyber Attack Scenarios

Mar 11, 2026
Via arXiv

Quantifying Frontier LLM Capabilities for Container Sandbox Escape

Mar 01, 2026
Via arXiv