Xiaobin Hu

MedMASLab: A Unified Orchestration Framework for Benchmarking Multimodal Medical Multi-Agent Systems

Mar 10, 2026

The Trinity of Consistency as a Defining Principle for General World Models

Feb 26, 2026

Modality Gap-Driven Subspace Alignment Training Paradigm For Multimodal Large Language Models

Feb 02, 2026

Dual Latent Memory for Visual Multi-agent System

Jan 31, 2026

Large-Scale Multidimensional Knowledge Profiling of Scientific Literature

Jan 21, 2026

M3CoTBench: Benchmark Chain-of-Thought of MLLMs in Medical Image Understanding

Jan 13, 2026

Watching, Reasoning, and Searching: A Video Deep Research Benchmark on Open Web for Agentic Video Reasoning

Jan 11, 2026

FFP-300K: Scaling First-Frame Propagation for Generalizable Video Editing

Jan 06, 2026

Guiding a Diffusion Transformer with the Internal Dynamics of Itself

Dec 30, 2025

The devil is in the details: Enhancing Video Virtual Try-On via Keyframe-Driven Details Injection

Dec 23, 2025