Wenzhao Zheng

SFTok: Bridging the Performance Gap in Discrete Tokenizers

Dec 18, 2025

DVGT: Driving Visual Geometry Transformer

Dec 18, 2025

Skyra: AI-Generated Video Detection via Grounded Artifact Reasoning

Dec 17, 2025

Astra: General Interactive World Model with Autoregressive Denoising

Dec 15, 2025

SVG-T2I: Scaling Up Text-to-Image Latent Diffusion Model Without Variational Autoencoder

Dec 12, 2025

Terra: Explorable Native 3D World Model with Point Latents

Oct 16, 2025

StereoCarla: A High-Fidelity Driving Dataset for Generalizable Stereo

Sep 16, 2025

Pseudo Depth Meets Gaussian: A Feed-forward RGB SLAM Baseline

Aug 06, 2025

Streaming 4D Visual Geometry Transformer

Jul 15, 2025

Point3R: Streaming 3D Reconstruction with Explicit Spatial Pointer Memory

Jul 03, 2025