Audio-Visual Synchronization


JoyAvatar: Unlocking Highly Expressive Avatars via Harmonized Text-Audio Conditioning

Jan 31, 2026
Via arXiv

LPIPS-AttnWav2Lip: Generic Audio-Driven lip synchronization for Talking Head Generation in the Wild

Jan 30, 2026
Via arXiv

JUST-DUB-IT: Video Dubbing via Joint Audio-Visual Diffusion

Jan 29, 2026
Via arXiv

EditYourself: Audio-Driven Generation and Manipulation of Talking Head Videos with Diffusion Transformers

Jan 29, 2026
Via arXiv

Beyond Lips: Integrating Gesture and Lip Cues for Robust Audio-visual Speaker Extraction

Jan 27, 2026
Via arXiv

SkyReels-V3 Technique Report

Jan 24, 2026
Via arXiv

LTX-2: Efficient Joint Audio-Visual Foundation Model

Jan 06, 2026
Via arXiv

From Inpainting to Editing: A Self-Bootstrapping Framework for Context-Rich Visual Dubbing

Dec 31, 2025
Via arXiv

Seedance 1.5 pro: A Native Audio-Visual Joint Generation Foundation Model

Dec 23, 2025
Via arXiv

SyncAnyone: Implicit Disentanglement via Progressive Self-Correction for Lip-Syncing in the wild

Dec 25, 2025
Via arXiv