Nan Duan

AnchorEdit: Maintaining Temporal Consistency in Multi-turn Image Editing via Causal Memory

Jun 10, 2026
Via arXiv

JoyAI-VL-Interaction: Real-Time Vision-Language Interaction Intelligence

Jun 10, 2026
Via arXiv

SpatialWorld: Benchmarking Interactive Spatial Reasoning of Multimodal Agents in Real-World Tasks

Jun 08, 2026
Via arXiv

Echo-Memory: A Controlled Study of Memory in Action World Models

Jun 08, 2026
Via arXiv

Ultra Flash: Scaling Real-Time Streaming Video Generation to High Resolutions

Jun 08, 2026
Via arXiv

Harnessing Streaming Video in the Wild

Jun 07, 2026
Via arXiv

Echo-Infinity: Learning Evolving Memory for Real-Time Infinite Video Generation

Jun 03, 2026
Via arXiv

Right Makes Might: Aligning Verified Hidden States Empowers RL Reasoning

Jun 02, 2026
Via arXiv

AdaCodec: A Predictive Visual Code for Video MLLMs

Jun 01, 2026
Via arXiv

Embodied3DBench: Benchmarking Low-Level Embodied Spatial Intelligence of Vision Language Models

May 27, 2026
Via arXiv