Yuhang Zheng

VistaBot: View-Robust Robot Manipulation via Spatiotemporal-Aware View Synthesis

Apr 23, 2026
Via arXiv

PokeVLA: Empowering Pocket-Sized Vision-Language-Action Model with Comprehensive World Knowledge Guidance

Apr 22, 2026
Via arXiv

DeFakeQ: Enabling Real-Time Deepfake Detection on Edge Devices via Adaptive Bidirectional Quantization

Apr 10, 2026
Via arXiv

Adaptive Local Frequency Filtering for Fourier-Encoded Implicit Neural Representations

Apr 03, 2026
Via arXiv

UniBioTransfer: A Unified Framework for Multiple Biometrics Transfer

Mar 20, 2026
Via arXiv

OmniVTA: Visuo-Tactile World Modeling for Contact-Rich Robotic Manipulation

Mar 19, 2026
Via arXiv

World In Your Hands: A Large-Scale and Open-source Ecosystem for Learning Human-centric Manipulation in the Wild

Dec 30, 2025
Via arXiv

Diffusion Knows Transparency: Repurposing Video Diffusion for Transparent Object Depth and Normal Estimation

Dec 29, 2025
Via arXiv

Preliminary Investigation into Data Scaling Laws for Imitation Learning-Based End-to-End Autonomous Driving

Dec 03, 2024
Via arXiv

PlanAgent: A Multi-modal Large Language Agent for Closed-loop Vehicle Motion Planning

Jun 04, 2024
Via arXiv