Haodong Duan

OPT-BENCH: Evaluating LLM Agent on Large-Scale Search Spaces Optimization Problems

Jun 12, 2025
Via arXiv

Towards Storage-Efficient Visual Document Retrieval: An Empirical Study on Reducing Patch-Level Embeddings

Jun 05, 2025
Via arXiv

MMSI-Bench: A Benchmark for Multi-Image Spatial Intelligence

May 29, 2025
Via arXiv

Visual Agentic Reinforcement Fine-Tuning

May 20, 2025
Via arXiv

GDI-Bench: A Benchmark for General Document Intelligence with Vision and Reasoning Decoupling

Apr 30, 2025
Via arXiv

MM-IFEngine: Towards Multimodal Instruction Following

Apr 10, 2025
Via arXiv

Envisioning Beyond the Pixels: Benchmarking Reasoning-Informed Visual Editing

Apr 03, 2025
Via arXiv

LEGO-Puzzles: How Good Are MLLMs at Multi-Step Spatial Reasoning?

Mar 25, 2025
Via arXiv

Creation-MMBench: Assessing Context-Aware Creative Intelligence in MLLM

Mar 19, 2025
Via arXiv

VisualPRM: An Effective Process Reward Model for Multimodal Reasoning

Mar 13, 2025
Via arXiv