Sida Peng

Zhejiang University

InfiniDepth: Arbitrary-Resolution and Fine-Grained Depth Estimation with Neural Implicit Fields

Jan 06, 2026
Via arXiv

UniSH: Unifying Scene and Human Reconstruction in a Feed-Forward Pass

Jan 03, 2026
Via arXiv

Split4D: Decomposed 4D Scene Reconstruction Without Video Segmentation

Dec 28, 2025
Via arXiv

SpatialTree: How Spatial Abilities Branch Out in MLLMs

Dec 23, 2025
Via arXiv

StreamingTalker: Audio-driven 3D Facial Animation with Autoregressive Diffusion Model

Nov 19, 2025
Via arXiv

Pixel-Perfect Depth with Semantics-Prompted Diffusion Transformers

Oct 08, 2025
Via arXiv

UniVerse: Unleashing the Scene Prior of Video Diffusion Models for Robust Radiance Field Reconstruction

Oct 02, 2025
Via arXiv

One View, Many Worlds: Single-Image to 3D Object Meets Generative Domain Randomization for One-Shot 6D Pose Estimation

Sep 09, 2025
Via arXiv

Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos with Spatio-Temporal Diffusion Models

Jul 17, 2025
Via arXiv

Towards Depth Foundation Model: Recent Trends in Vision-Based Depth Estimation

Jul 15, 2025
Via arXiv