Xiaowei Chi

WristWorld: Generating Wrist-Views via 4D World Models for Robotic Manipulation

Oct 08, 2025
Via arXiv

Can World Models Benefit VLMs for World Dynamics?

Oct 01, 2025
Via arXiv

WoW: Towards a World omniscient World model Through Embodied Interaction

Sep 26, 2025
Via arXiv

BEVUDA++: Geometric-aware Unsupervised Domain Adaptation for Multi-View 3D Object Detection

Sep 17, 2025
Via arXiv

MinD: Unified Visual Imagination and Control via Hierarchical World Models

Jun 23, 2025
Via arXiv

ManipDreamer: Boosting Robotic Manipulation World Model with Action Tree and Visual Guidance

Apr 23, 2025
Via arXiv

MoLe-VLA: Dynamic Layer-skipping Vision Language Action Model via Mixture-of-Layers for Efficient Robot Manipulation

Mar 26, 2025
Via arXiv

YuE: Scaling Open Foundation Models for Long-Form Music Generation

Mar 11, 2025
Via arXiv

RealVVT: Towards Photorealistic Video Virtual Try-on via Spatio-Temporal Consistency

Jan 15, 2025
Via arXiv

Large Motion Video Autoencoding with Cross-modal Video VAE

Dec 23, 2024
Via arXiv