Speaker Separation


Speaker separation is the task of isolating each individual speaker's voice from an audio recording in which several people speak, often at the same time.

Plug-and-Steer: Decoupling Separation and Selection in Audio-Visual Target Speaker Extraction

Mar 20, 2026
Via arXiv

SocialOmni: Benchmarking Audio-Visual Social Interactivity in Omni Models

Mar 17, 2026
Via arXiv

When Contextual Inference Fails: Cancelability in Interactive Instruction Following

Mar 20, 2026
Via arXiv

VorTEX: Various overlap ratio for Target speech EXtraction

Mar 17, 2026
Via arXiv

HRTF-guided Binaural Target Speaker Extraction with Real-World Validation

Mar 17, 2026
Via arXiv

What Counts as Real? Speech Restoration and Voice Quality Conversion Pose New Challenges to Deepfake Detection

Mar 14, 2026
Via arXiv

Multi-View Based Audio Visual Target Speaker Extraction

Mar 11, 2026
Via arXiv

Causal Prosody Mediation for Text-to-Speech: Counterfactual Training of Duration, Pitch, and Energy in FastSpeech2

Mar 12, 2026
Via arXiv

Mask2Flow-TSE: Two-Stage Target Speaker Extraction with Masking and Flow Matching

Mar 13, 2026
Via arXiv

SommBench: Assessing Sommelier Expertise of Language Models

Mar 12, 2026
Via arXiv