Weihao Yuan

3DThinkVLA: Endowing Vision-Language-Action Models with Latent 3D Priors via 3D-Thinking-Guided Co-training

Jun 03, 2026

Touch-R1: Reinforcing Touch Reasoning in MLLMs

May 26, 2026

FocusVLA: Focused Visual Utilization for Vision-Language-Action Models

Mar 30, 2026

ViSA: 3D-Aware Video Shading for Real-Time Upper-Body Avatar Creation

Dec 09, 2025

OmniMotion: Multimodal Motion Generation with Continuous Masked Autoregression

Oct 16, 2025

PanoLAM: Large Avatar Model for Gaussian Full-Head Synthesis from One-shot Unposed Image

Sep 09, 2025

DicFace: Dirichlet-Constrained Variational Codebook Learning for Temporally Coherent Video Face Restoration

Jun 16, 2025

PF-LHM: 3D Animatable Avatar Reconstruction from Pose-free Articulated Human Images

Jun 16, 2025

LHM: Large Animatable Human Reconstruction Model from a Single Image in Seconds

Mar 13, 2025

LAM: Large Avatar Model for One-shot Animatable Gaussian Head

Feb 25, 2025