Harsh Seth

Auto-Eval Judge: Towards a General Agentic Framework for Task Completion Evaluation

Aug 07, 2025
Via arXiv