Haiyu Xu

AutoResearchBench: Benchmarking AI Agents on Complex Scientific Literature Discovery

Apr 28, 2026
Via arXiv

Do Vision-Language Models Measure Up? Benchmarking Visual Measurement Reading with MeasureBench

Oct 30, 2025
Via arXiv