Jiaheng Liu

AlphaCrafter: A Full-Stack Multi-Agent Framework for Cross-Sectional Quantitative Trading

May 07, 2026
Via arXiv

When Agents Look the Same: Quantifying Distillation-Induced Similarity in Tool-Use Behaviors

Apr 23, 2026
Via arXiv

WebCompass: Towards Multimodal Web Coding Evaluation for Code Language Models

Apr 20, 2026
Via arXiv

DR$^{3}$-Eval: Towards Realistic and Reproducible Deep Research Evaluation

Apr 16, 2026
Via arXiv

CodeTracer: Towards Traceable Agent States

Apr 14, 2026
Via arXiv

Long-form RewardBench: Evaluating Reward Models for Long-form Generation

Mar 13, 2026
Via arXiv

Search More, Think Less: Rethinking Long-Horizon Agentic Search for Efficiency and Generalization

Feb 26, 2026
Via arXiv

EcoGym: Evaluating LLMs for Long-Horizon Plan-and-Execute in Interactive Economies

Feb 11, 2026
Via arXiv

WorldTravel: A Realistic Multimodal Travel-Planning Benchmark with Tightly Coupled Constraints

Feb 09, 2026
Via arXiv

Vibe AIGC: A New Paradigm for Content Generation via Agentic Orchestration

Feb 05, 2026
Via arXiv