Zuxuan Wu

Fudan University

ARM: An AutoRegressive Large Multimodal Model with Unified Discrete Representations

Jun 09, 2026
Via arXiv

IDEAL: In-DEpth ALignment Makes A Discrete Representation AutoEncoder

Jun 09, 2026
Via arXiv

OmniGen-AR: AutoRegressive Any-to-Image Generation

Jun 08, 2026
Via arXiv

DisCo: World Models with Discrete Camera Motion Control

Jun 06, 2026
Via arXiv

ActiveMimic: Egocentric Video Pretraining with Active Perception

Jun 04, 2026
Via arXiv

EvoMemNav: Efficient Self-Evolving Fine-Grained Memory for Zero-Shot Embodied Navigation

Jun 02, 2026
Via arXiv

CameraNoise: Enabling Faithful Camera Control in Video Diffusion through Geometry-Flow-Guided Noise Warping

May 29, 2026
Via arXiv

VLA-Pro: Cross-Task Procedural Memory Transfer for Vision-Language-Action Models

May 28, 2026
Via arXiv

Compositional Text-to-Image Generation Via Region-aware Bimodal Direct Preference Optimization

May 27, 2026
Via arXiv

Channel-wise Vector Quantization

May 25, 2026
Via arXiv