Wei Yin

MGVQ: Could VQ-VAE Beat VAE? A Generalizable Tokenizer with Multi-group Quantization

Jul 10, 2025

A Survey: Learning Embodied Intelligence from Physical Simulators and World Models

Jul 01, 2025

TextAtari: 100K Frames Game Playing with Language Agents

Jun 04, 2025

Reinforced Reasoning for Embodied Planning

May 28, 2025

GoalFlow: Goal-Driven Flow Matching for Multimodal Trajectories Generation in End-to-End Autonomous Driving

Mar 07, 2025

DrivingWorld: Constructing World Model for Autonomous Driving via Video GPT

Dec 30, 2024

DrivingWorld: Constructing World Model for Autonomous Driving via Video GPT

Dec 27, 2024

RoMeO: Robust Metric Visual Odometry

Dec 16, 2024

Boost 3D Reconstruction using Diffusion-based Monocular Camera Calibration

Nov 26, 2024

Senna: Bridging Large Vision-Language Models and End-to-End Autonomous Driving

Oct 29, 2024