Jiayao Liu

GeoAgentBench: A Dynamic Execution Benchmark for Tool-Augmented Agents in Spatial Analysis

Apr 15, 2026
Via arXiv

JADE: Expert-Grounded Dynamic Evaluation for Open-Ended Professional Tasks

Feb 06, 2026
Via arXiv