Yu-Gang Jiang

Fudan University

Unison: Benchmarking Unified Multimodal Models via Synergistic Understanding and Generation

Jun 25, 2026
Via arXiv

Advancing Omnimodal Embodied Agents from Isolated Skills to Everyday Physical Autonomy

Jun 25, 2026
Via arXiv

Event-Aware Instructed Assistant for Referring Video Segmentation

Jun 25, 2026
Via arXiv

ThinkingVLA: Interleaved Vision and Language Reasoning for Robotic Manipulation

Jun 16, 2026
Via arXiv

RepWAM: World Action Modeling with Representation Visual-Action Tokenizers

Jun 11, 2026
Via arXiv

ARM: An AutoRegressive Large Multimodal Model with Unified Discrete Representations

Jun 09, 2026
Via arXiv

IDEAL: In-DEpth ALignment Makes A Discrete Representation AutoEncoder

Jun 09, 2026
Via arXiv

UniDexTok: A Unified Dexterous Hand Tokenizer from Real Data

Jun 09, 2026
Via arXiv

OmniGen-AR: AutoRegressive Any-to-Image Generation

Jun 08, 2026
Via arXiv

Two Bridges, One Pathway: From VLMs to Generalizable VLAs with Embodied Trajectory-Coupled Data

Jun 07, 2026
Via arXiv