Guanglu Song

Phased Consistency Model

May 28, 2024

Deep Reward Supervisions for Tuning Text-to-Image Diffusion Models

May 01, 2024

MoVA: Adapting Mixture of Vision Experts to Multimodal Context

Apr 19, 2024

Rethinking the Spatial Inconsistency in Classifier-Free Diffusion Guidance

Apr 08, 2024

CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching

Apr 04, 2024

Visual CoT: Unleashing Chain-of-Thought Reasoning in Multi-Modal Language Models

Mar 25, 2024

Be-Your-Outpainter: Mastering Video Outpainting through Input-Specific Adaptation

Mar 20, 2024

FouriScale: A Frequency Perspective on Training-Free High-Resolution Image Synthesis

Mar 19, 2024

AnimateLCM: Accelerating the Animation of Personalized Diffusion Models and Adapters with Decoupled Consistency Learning

Feb 01, 2024

Towards Large-scale Masked Face Recognition

Oct 25, 2023