Yi Ling Yu

DecisionBench: A Benchmark for Emergent Delegation in Long-Horizon Agentic Workflows

May 18, 2026
Via arXiv

AgentPulse: A Continuous Multi-Signal Framework for Evaluating AI Agents in Deployment

Apr 27, 2026
Via arXiv