Li Yi

SoFar: Language-Grounded Orientation Bridges Spatial Reasoning and Object Manipulation

Feb 18, 2025
Via arXiv

DexTrack: Towards Generalizable Neural Tracking Control for Dexterous Manipulation from Human References

Feb 13, 2025
Via arXiv

GaussianAvatar-Editor: Photorealistic Animatable Gaussian Head Avatar Editor

Jan 17, 2025
Via arXiv

MobileH2R: Learning Generalizable Human to Mobile Robot Handover Exclusively from Scalable and Diverse Synthetic Data

Jan 08, 2025
Via arXiv

SyncDiff: Synchronized Motion Diffusion for Multi-Body Human-Object Interaction Synthesis

Dec 28, 2024
Via arXiv

Mimicking-Bench: A Benchmark for Generalizable Humanoid-Scene Interaction Learning via Human Mimicking

Dec 23, 2024
Via arXiv

UniHOI: Learning Fast, Dense and Generalizable 4D Reconstruction for Egocentric Hand Object Interaction Videos

Nov 14, 2024
Via arXiv

ImOV3D: Learning Open-Vocabulary Point Clouds 3D Object Detection from Only 2D Images

Oct 31, 2024
Via arXiv

MAP: Unleashing Hybrid Mamba-Transformer Vision Backbone's Potential with Masked Autoregressive Pretraining

Oct 01, 2024
Via arXiv

VILA-U: A Unified Foundation Model Integrating Visual Understanding and Generation

Sep 06, 2024
Via arXiv