Akshay Sivaraman

CRAB-Bench: Evaluating LLM Agents under Complex Task Dependencies and Human-aligned User Simulation

Jun 01, 2026
Via arXiv