Vincent Tu

$\texttt{YC-Bench}$: Benchmarking AI Agents for Long-Term Planning and Consistent Execution

Apr 01, 2026

The Unreasonable Effectiveness of Scaling Agents for Computer Use

Oct 02, 2025

Agent S2: A Compositional Generalist-Specialist Framework for Computer Use Agents

Apr 01, 2025