Ameet Talwalkar

UC Berkeley

Comparing Developer and LLM Biases in Code Evaluation

Mar 25, 2026
Via arXiv

Learn Hard Problems During RL with Reference Guided Fine-tuning

Mar 05, 2026
Via arXiv

GameDevBench: Evaluating Agentic Capabilities Through Game Development

Feb 11, 2026
Via arXiv

Completion $\neq$ Collaboration: Scaling Collaborative Effort with Agents

Oct 30, 2025
Via arXiv

Towards Community-Driven Agents for Machine Learning Engineering

Jun 25, 2025
Via arXiv

Thinking vs. Doing: Agents that Reason by Scaling Test-Time Interaction

Jun 09, 2025
Via arXiv

Sample Complexity and Representation Ability of Test-time Scaling Paradigms

Jun 05, 2025
Via arXiv

A Comprehensive Evaluation of Contemporary ML-Based Solvers for Combinatorial Optimization

May 22, 2025
Via arXiv

This Time is Different: An Observability Perspective on Time Series Foundation Models

May 20, 2025
Via arXiv

CodePDE: An Inference Framework for LLM-driven PDE Solver Generation

May 13, 2025
Via arXiv