Hongwei Xie

OneVL: One-Step Latent Reasoning and Planning with Vision-Language Explanation

Apr 20, 2026

DriveVA: Video Action Models are Zero-Shot Drivers

Apr 05, 2026

UniDriveVLA: Unifying Understanding, Perception, and Action Planning for Autonomous Driving

Apr 02, 2026

MindDrive: A Vision-Language-Action Model for Autonomous Driving via Online Reinforcement Learning

Dec 16, 2025

Extending Large Vision-Language Model for Diverse Interactive Tasks in Autonomous Driving

May 13, 2025

ORION: A Holistic End-to-End Autonomous Driving Framework by Vision-Language Instructed Action Generation

Mar 25, 2025

MiLA: Multi-view Intensive-fidelity Long-term Video Generation World Model for Autonomous Driving

Mar 20, 2025

Learning A Zero-shot Occupancy Network from Vision Foundation Models via Self-supervised Adaptation

Mar 10, 2025

SurroundSDF: Implicit 3D Scene Understanding Based on Signed Distance Field

Mar 21, 2024

Benchmarking the Robustness of LiDAR-Camera Fusion for 3D Object Detection

May 30, 2022