Zhuowen Tu

Grounded Compositional and Diverse Text-to-3D with Pretrained Multi-View Diffusion Model

Apr 28, 2024

On the Scalability of Diffusion-based Text-to-Image Generation

Apr 03, 2024

HOIDiffusion: Generating Realistic 3D Hand-Object Interaction Data

Mar 18, 2024

Bayesian Diffusion Models for 3D Shape Reconstruction

Mar 11, 2024

Enhancing Vision-Language Pre-training with Rich Supervisions

Mar 05, 2024

Non-autoregressive Sequence-to-Sequence Vision-Language Models

Mar 04, 2024

AffordanceLLM: Grounding Affordance from Vision Language Models

Jan 12, 2024

Restoration by Generation with Constrained Priors

Dec 28, 2023

TokenCompose: Grounding Diffusion with Token-level Supervision

Dec 06, 2023

When Is Multilinguality a Curse? Language Modeling for 250 High- and Low-Resource Languages

Nov 15, 2023