Guanglu Song

Improving Joint Audio-Video Generation with Cross-Modal Context Learning

Mar 19, 2026
Via arXiv

AR-CoPO: Align Autoregressive Video Generation with Contrastive Policy Optimization

Mar 18, 2026
Via arXiv

Towards Seamless Borders: A Method for Mitigating Inconsistencies in Image Inpainting and Outpainting

Jun 14, 2025
Via arXiv

ADT: Tuning Diffusion Models with Adversarial Supervision

Apr 15, 2025
Via arXiv

High-Fidelity Diffusion Face Swapping with ID-Constrained Facial Conditioning

Mar 28, 2025
Via arXiv

VividFace: A Diffusion-Based Hybrid Framework for High-Fidelity Video Face Swapping

Dec 15, 2024
Via arXiv

EasyRef: Omni-Generalized Group Image Reference for Diffusion Models via Multimodal LLM

Dec 12, 2024
Via arXiv

See Further When Clear: Curriculum Consistency Model

Dec 09, 2024
Via arXiv

Robo-MUTUAL: Robotic Multimodal Task Specification via Unimodal Learning

Oct 02, 2024
Via arXiv

Exploring the Role of Large Language Models in Prompt Encoding for Diffusion Models

Jun 17, 2024
Via arXiv