Nicolas Baldwin

Interactive Evaluation of Large Language Models for Multi-Requirement Software Engineering Tasks

Aug 26, 2025
Via arXiv

AI Research Agents for Machine Learning: Search, Exploration, and Generalization in MLE-bench

Jul 03, 2025
Via arXiv