Talking Head Generation


Talking head generation is the task of synthesizing video of a person speaking, typically driven by an audio recording of their voice.

TalkingHeadBench: A Multi-Modal Benchmark & Analysis of Talking-Head DeepFake Detection

May 30, 2025

FaceEditTalker: Interactive Talking Head Generation with Facial Attribute Editing

May 28, 2025

Let Them Talk: Audio-Driven Multi-Person Conversational Video Generation

May 28, 2025

DualTalk: Dual-Speaker Interaction for 3D Talking Head Conversations

May 26, 2025

KeySync: A Robust Approach for Leakage-free Lip Synchronization in High Resolution

May 01, 2025

IM-Portrait: Learning 3D-aware Video Diffusion for Photorealistic Talking Heads from Monocular Videos

Apr 29, 2025

Disentangle Identity, Cooperate Emotion: Correlation-Aware Emotional Talking Portrait Generation

Apr 25, 2025

Supervising 3D Talking Head Avatars with Analysis-by-Audio-Synthesis

Apr 18, 2025

Audio-visual Controlled Video Diffusion with Masked Selective State Spaces Modeling for Natural Talking Head Generation

Apr 03, 2025

OmniTalker: Real-Time Text-Driven Talking Head Generation with In-Context Audio-Visual Style Replication

Apr 03, 2025