Shaochong Jia

An Intermediate Fusion ViT Enables Efficient Text-Image Alignment in Diffusion Models

Mar 25, 2024
Via arXiv

Efficient Multimodal Diffusion Models Using Joint Data Infilling with Partially Shared U-Net

Nov 28, 2023
Via arXiv