Picture for Malachi Hamada

Malachi Hamada

Language Models Don't Know What You Want: Evaluating Personalization in Deep Research Needs Real Users

Add code
Mar 17, 2026
Viaarxiv icon

AstaBench: Rigorous Benchmarking of AI Agents with a Scientific Research Suite

Add code
Oct 24, 2025
Viaarxiv icon