Mike Zheng Shou

GameWorld: Towards Standardized and Verifiable Evaluation of Multimodal Game Agents

Apr 08, 2026, via arXiv

OpenWorldLib: A Unified Codebase and Definition of Advanced World Models

Apr 06, 2026, via arXiv

UENR-600K: A Large-Scale Physically Grounded Dataset for Nighttime Video Deraining

Apr 06, 2026, via arXiv

P-Flow: Prompting Visual Effects Generation

Mar 23, 2026, via arXiv

Kiwi-Edit: Versatile Video Editing via Instruction and Reference Guidance

Mar 05, 2026, via arXiv

Semantic-Contact Fields for Category-Level Generalizable Tactile Tool Manipulation

Feb 14, 2026, via arXiv

Olaf-World: Orienting Latent Actions for Video World Modeling

Feb 10, 2026, via arXiv

World-VLA-Loop: Closed-Loop Learning of Video World Model and VLA Policy

Feb 06, 2026, via arXiv

ShowUI-Aloha: Human-Taught GUI Agent

Jan 12, 2026, via arXiv

FocusUI: Efficient UI Grounding via Position-Preserving Visual Token Selection

Jan 07, 2026, via arXiv