Zhuowen Tu

OmniControlNet: Dual-stage Integration for Conditional Image Generation
Jun 09, 2024

Grounded Compositional and Diverse Text-to-3D with Pretrained Multi-View Diffusion Model
Apr 28, 2024

On the Scalability of Diffusion-based Text-to-Image Generation
Apr 03, 2024

HOIDiffusion: Generating Realistic 3D Hand-Object Interaction Data
Mar 18, 2024

Bayesian Diffusion Models for 3D Shape Reconstruction
Mar 11, 2024

Enhancing Vision-Language Pre-training with Rich Supervisions
Mar 05, 2024

Non-autoregressive Sequence-to-Sequence Vision-Language Models
Mar 04, 2024

AffordanceLLM: Grounding Affordance from Vision Language Models
Jan 12, 2024

Restoration by Generation with Constrained Priors
Dec 28, 2023

TokenCompose: Grounding Diffusion with Token-level Supervision
Dec 06, 2023