
Ziyao Zeng

Coffee: Controllable Diffusion Fine-tuning

Nov 18, 2025

ProtoDepth: Unsupervised Continual Depth Completion with Prototypes

Mar 17, 2025

Efficient Interactive 3D Multi-Object Removal

Jan 30, 2025

PriorDiffusion: Leverage Language Prior in Diffusion Models for Monocular Depth Estimation

Nov 24, 2024

RSA: Resolving Scale Ambiguities in Monocular Depth Estimators through Language Descriptions

Oct 03, 2024

NeuroBind: Towards Unified Multimodal Representations for Neural Signals

Jul 19, 2024

WorDepth: Variational Language Prior for Monocular Depth Estimation

Apr 05, 2024

Binding Touch to Everything: Learning Unified Multimodal Tactile Representations

Jan 31, 2024

iQuery: Instruments as Queries for Audio-Visual Sound Separation

Dec 08, 2022

PointCLIP V2: Adapting CLIP for Powerful 3D Open-world Learning

Nov 21, 2022