Claude Fachkha

A Benchmark for Evaluating Outcome-Driven Constraint Violations in Autonomous AI Agents

Dec 23, 2025
Via arXiv