Jyoutir Raj

ZeroDayBench: Evaluating LLM Agents on Unseen Zero-Day Vulnerabilities for Cyberdefense

Mar 02, 2026
Via arXiv

When AI Benchmarks Plateau: A Systematic Study of Benchmark Saturation

Feb 18, 2026
Via arXiv