Chen Gao

UniVLR: Unifying Text and Vision in Visual Latent Reasoning for Multimodal LLMs

May 12, 2026
Via arXiv

iWorld-Bench: A Benchmark for Interactive World Models with a Unified Action Generation Framework

May 06, 2026
Via arXiv

LoViF 2026 The First Challenge on Holistic Quality Assessment for 4D World Model (PhyScore)

May 06, 2026
Via arXiv

A Benchmark for Interactive World Models with a Unified Action Generation Framework

May 05, 2026
Via arXiv

WorldMAP: Bootstrapping Vision-Language Navigation Trajectory Prediction with Generative World Models

Apr 09, 2026
Via arXiv

How Far Are Large Multimodal Models from Human-Level Spatial Action? A Benchmark for Goal-Oriented Embodied Navigation in Urban Airspace

Apr 09, 2026
Via arXiv

BiCoord: A Bimanual Manipulation Benchmark towards Long-Horizon Spatial-Temporal Coordination

Apr 07, 2026
Via arXiv

The Latent Space: Foundation, Evolution, Mechanism, Ability, and Outlook

Apr 02, 2026
Via arXiv

Semantic Audio-Visual Navigation in Continuous Environments

Mar 20, 2026
Via arXiv

OmniVTA: Visuo-Tactile World Modeling for Contact-Rich Robotic Manipulation

Mar 19, 2026
Via arXiv