Yoram Bachrach

Microsoft Research

AIRS-Bench: a Suite of Tasks for Frontier AI Research Science Agents

Feb 09, 2026
Via arXiv

Scaling Small Agents Through Strategy Auctions

Feb 02, 2026
Via arXiv

Training AI Co-Scientists Using Rubric Rewards

Dec 29, 2025
Via arXiv

What Does It Take to Be a Good AI Research Agent? Studying the Role of Ideation Diversity

Nov 19, 2025
Via arXiv

Souper-Model: How Simple Arithmetic Unlocks State-of-the-Art LLM Performance

Nov 17, 2025
Via arXiv

Bootstrapping Task Spaces for Self-Improvement

Sep 04, 2025
Via arXiv

AI Research Agents for Machine Learning: Search, Exploration, and Generalization in MLE-bench

Jul 03, 2025
Via arXiv

Modelling Mean-Field Games with Neural Ordinary Differential Equations

Apr 17, 2025
Via arXiv

MLGym: A New Framework and Benchmark for Advancing AI Research Agents

Feb 20, 2025
Via arXiv

Soft Condorcet Optimization for Ranking of General Agents

Nov 04, 2024
Via arXiv