Zeming Li

Step1X-3D: Towards High-Fidelity and Controllable Generation of Textured 3D Assets

May 12, 2025
Via arXiv

DiffSplat: Repurposing Image Diffusion Models for Scalable Gaussian Splat Generation

Jan 28, 2025
Via arXiv

ManipGPT: Is Affordance Segmentation by Large Vision Models Enough for Articulated Object Manipulation?

Dec 13, 2024
Via arXiv

Multi-modal Relation Distillation for Unified 3D Representation Learning

Jul 19, 2024
Via arXiv

4K4DGen: Panoramic 4D Generation at 4K Resolution

Jun 19, 2024
Via arXiv

HumanSplat: Generalizable Single-Image Human Gaussian Splatting with Structure Priors

Jun 18, 2024
Via arXiv

Observation, Analysis, and Solution: Exploring Strong Lightweight Vision Transformers via Masked Image Modeling Pre-Training

Apr 18, 2024
Via arXiv

HMD-Poser: On-Device Real-time Human Motion Tracking from Scalable Sparse Observations

Mar 06, 2024
Via arXiv

GMM: Delving into Gradient Aware and Model Perceive Depth Mining for Monocular 3D Detection

Jun 30, 2023
Via arXiv

Dynamic Grained Encoder for Vision Transformers

Jan 10, 2023
Via arXiv