Yuhao Dong

Video-MME-v2: Towards the Next Stage in Benchmarks for Comprehensive Video Understanding

Apr 06, 2026
Via arXiv

FileGram: Grounding Agent Personalization in File-System Behavioral Traces

Apr 06, 2026
Via arXiv

PerceptionComp: A Video Benchmark for Complex Perception-Centric Reasoning

Mar 27, 2026
Via arXiv

Insight-V++: Towards Advanced Long-Chain Visual Reasoning with Multimodal Large Language Models

Mar 18, 2026
Via arXiv

VTC-Bench: Evaluating Agentic Multimodal Models via Compositional Visual Tool Chaining

Mar 16, 2026
Via arXiv

Demo-ICL: In-Context Learning for Procedural Video Knowledge Acquisition

Feb 09, 2026
Via arXiv

Kimi K2.5: Visual Agentic Intelligence

Feb 02, 2026
Via arXiv

The RoboSense Challenge: Sense Anything, Navigate Anywhere, Adapt Across Platforms

Jan 08, 2026
Via arXiv

Visual Grounding from Event Cameras

Sep 11, 2025
Via arXiv

Talk2Event: Grounded Understanding of Dynamic Scenes from Event Cameras

Jul 23, 2025
Via arXiv