Sicheng Zuo

DVGT-2: Vision-Geometry-Action Model for Autonomous Driving at Scale

Apr 01, 2026

Vega: Learning to Drive with Natural Language Instructions

Mar 26, 2026

DriveTok: 3D Driving Scene Tokenization for Unified Multi-View Reconstruction and Understanding

Mar 19, 2026

DVGT: Driving Visual Geometry Transformer

Dec 18, 2025

QuadricFormer: Scene as Superquadrics for 3D Semantic Occupancy Prediction

Jun 12, 2025

GaussianWorld: Gaussian World Model for Streaming 3D Occupancy Prediction

Dec 13, 2024

GaussianAD: Gaussian-Centric End-to-End Autonomous Driving

Dec 13, 2024

Doe-1: Closed-Loop Autonomous Driving with Large World Model

Dec 12, 2024

GPD-1: Generative Pre-training for Driving

Dec 11, 2024

EmbodiedOcc: Embodied 3D Occupancy Prediction for Vision-based Online Scene Understanding

Dec 05, 2024