Wenzhao Zheng

Vega: Learning to Drive with Natural Language Instructions

Mar 26, 2026

UniQueR: Unified Query-based Feedforward 3D Reconstruction

Mar 24, 2026

DriveTok: 3D Driving Scene Tokenization for Unified Multi-View Reconstruction and Understanding

Mar 19, 2026

Measuring 3D Spatial Geometric Consistency in Dynamic Generated Videos

Mar 19, 2026

TwinRL-VLA: Digital Twin-Driven Reinforcement Learning for Real-World Robotic Manipulation

Feb 09, 2026

Moaw: Unleashing Motion Awareness for Video Diffusion Models

Jan 19, 2026

NeXT-IMDL: Build Benchmark for NeXT-Generation Image Manipulation Detection & Localization

Dec 29, 2025

SFTok: Bridging the Performance Gap in Discrete Tokenizers

Dec 18, 2025

DVGT: Driving Visual Geometry Transformer

Dec 18, 2025

Skyra: AI-Generated Video Detection via Grounded Artifact Reasoning

Dec 17, 2025