Haoyi Zhu

$π^3$: Scalable Permutation-Equivariant Visual Geometry Learning

Jul 17, 2025

VQ-VLA: Improving Vision-Language-Action Models via Scaling Vector-Quantized Action Tokenizers

Jul 01, 2025

CoMo: Learning Continuous Latent Motion from Internet Videos for Scalable Robot Learning

May 22, 2025

Aether: Geometric-Aware Unified World Modeling

Mar 25, 2025

Tra-MoE: Learning Trajectory Prediction Model from Multiple Domains for Adaptive Policy Conditioning

Nov 21, 2024

DATAP-SfM: Dynamic-Aware Tracking Any Point for Robust Structure from Motion in the Wild

Nov 20, 2024

SPA: 3D Spatial-Awareness Enables Effective Embodied Representation

Oct 10, 2024

Point Cloud Matters: Rethinking the Impact of Different Observation Spaces on Robot Learning

Feb 04, 2024

PonderV2: Pave the Way for 3D Foundation Model with A Universal Pre-training Paradigm

Oct 13, 2023

UniPAD: A Universal Pre-training Paradigm for Autonomous Driving

Oct 12, 2023