Jiaheng Liu

P3D-Bench: Benchmarking MLLMs for Parametric 3D Generation and Structural Reasoning

Jun 09, 2026
Via arXiv

CoVEBench: Can Video Editing Models Handle Complex Instructions?

Jun 07, 2026
Via arXiv

OmniCap-IF: Benchmarking and Improving Instruction Following Abilities for Omni-Video Captioning

Jun 07, 2026
Via arXiv

Knowledge Index of Noah's Ark

Jun 04, 2026
Via arXiv

MMG2Skill: Can Agents Distill In-the-Wild Guides into Self-Evolving Skills?

Jun 01, 2026
Via arXiv

TVIR: Building Deep Research Agents Towards Text–Visual Interleaved Report Generation

Jun 01, 2026
Via arXiv

Where Do Deep-Research Agents Go Wrong? Span-Level Error Localization in Agent Trajectories

Jun 01, 2026
Via arXiv

AlphaCrafter: A Full-Stack Multi-Agent Framework for Cross-Sectional Quantitative Trading

May 07, 2026
Via arXiv

When Agents Look the Same: Quantifying Distillation-Induced Similarity in Tool-Use Behaviors

Apr 23, 2026
Via arXiv

WebCompass: Towards Multimodal Web Coding Evaluation for Code Language Models

Apr 20, 2026
Via arXiv