Yuhang Zheng

UniBioTransfer: A Unified Framework for Multiple Biometrics Transfer

Mar 20, 2026
Via arXiv

OmniVTA: Visuo-Tactile World Modeling for Contact-Rich Robotic Manipulation

Mar 19, 2026
Via arXiv

World In Your Hands: A Large-Scale and Open-source Ecosystem for Learning Human-centric Manipulation in the Wild

Dec 30, 2025
Via arXiv

Diffusion Knows Transparency: Repurposing Video Diffusion for Transparent Object Depth and Normal Estimation

Dec 29, 2025
Via arXiv

Preliminary Investigation into Data Scaling Laws for Imitation Learning-Based End-to-End Autonomous Driving

Dec 03, 2024
Via arXiv

PlanAgent: A Multi-modal Large Language Agent for Closed-loop Vehicle Motion Planning

Jun 04, 2024
Via arXiv

TOD3Cap: Towards 3D Dense Captioning in Outdoor Scenes

Mar 28, 2024
Via arXiv

GaussianGrasper: 3D Language Gaussian Splatting for Open-vocabulary Robotic Grasping

Mar 14, 2024
Via arXiv

MonoOcc: Digging into Monocular Semantic Occupancy Prediction

Mar 13, 2024
Via arXiv

Adaptive Surface Normal Constraint for Geometric Estimation from Monocular Images

Feb 08, 2024
Via arXiv