Yishuo Yuan

EcoGym: Evaluating LLMs for Long-Horizon Plan-and-Execute in Interactive Economies

Feb 11, 2026
Via arXiv

NL2Repo-Bench: Towards Long-Horizon Repository Generation Evaluation of Coding Agents

Dec 14, 2025
Via arXiv