Guanglu Song

Exploring the Role of Large Language Models in Prompt Encoding for Diffusion Models

Jun 17, 2024
Via arXiv

Phased Consistency Model

May 28, 2024
Via arXiv

Deep Reward Supervisions for Tuning Text-to-Image Diffusion Models

May 01, 2024
Via arXiv

MoVA: Adapting Mixture of Vision Experts to Multimodal Context

Apr 19, 2024
Via arXiv

Rethinking the Spatial Inconsistency in Classifier-Free Diffusion Guidance

Apr 08, 2024
Via arXiv

CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching

Apr 04, 2024
Via arXiv

Visual CoT: Unleashing Chain-of-Thought Reasoning in Multi-Modal Language Models

Mar 25, 2024
Via arXiv

Be-Your-Outpainter: Mastering Video Outpainting through Input-Specific Adaptation

Mar 20, 2024
Via arXiv

FouriScale: A Frequency Perspective on Training-Free High-Resolution Image Synthesis

Mar 19, 2024
Via arXiv

AnimateLCM: Accelerating the Animation of Personalized Diffusion Models and Adapters with Decoupled Consistency Learning

Feb 01, 2024
Via arXiv