Tue Le

SWE-EVO: Benchmarking Coding Agents in Long-Horizon Software Evolution Scenarios

Dec 23, 2025
Via arXiv