Picture for Xueyuan Hao

Xueyuan Hao

VitaBench: Benchmarking LLM Agents with Versatile Interactive Tasks in Real-world Applications

Add code
Sep 30, 2025
Viaarxiv icon