Speaker Diarization


Speaker diarization is the task of partitioning an audio recording into speech segments and clustering those segments by speaker, which answers the question of who spoke when.
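The clustering half of that definition can be illustrated with a toy sketch. It assumes the audio has already been cut into segments and that each segment comes with a speaker embedding; here the embeddings are hand-written 2-D vectors. The function names (`cosine`, `diarize`), the greedy assignment rule, and the `threshold` value are illustrative assumptions, not the method of any paper listed on this page. Practical systems generally use learned embeddings and more robust clustering.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def diarize(segments, threshold=0.8):
    """Greedy online clustering of segment embeddings (hypothetical helper).

    segments: list of (start_sec, end_sec, embedding).
    Returns a list of (start_sec, end_sec, speaker_index).
    """
    centroids = []  # running sum of embeddings per speaker (scale does not affect cosine)
    labeled = []
    for start, end, emb in segments:
        best, best_sim = None, threshold
        for i, c in enumerate(centroids):
            sim = cosine(emb, c)
            if sim >= best_sim:
                best, best_sim = i, sim
        if best is None:
            # No existing speaker is similar enough: open a new cluster.
            centroids.append(list(emb))
            best = len(centroids) - 1
        else:
            # Fold the segment into the matched speaker's running sum.
            centroids[best] = [c + e for c, e in zip(centroids[best], emb)]
        labeled.append((start, end, best))
    return labeled

# Toy input: two alternating speakers with clearly separated embeddings.
segments = [
    (0.0, 1.5, [1.0, 0.1]),
    (1.5, 3.0, [0.1, 1.0]),
    (3.0, 4.5, [0.9, 0.2]),
    (4.5, 6.0, [0.2, 0.9]),
]
result = diarize(segments)
for start, end, speaker in result:
    print(f"{start:.1f}-{end:.1f}s speaker {speaker}")
```

With this toy input the segments alternate between speaker 0 and speaker 1. The output is a list of time spans with speaker indices, which is the "who spoke when" answer in its simplest form.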

From Who Said What to Who They Are: Modular Training-free Identity-Aware LLM Refinement of Speaker Diarization

Sep 18, 2025

Mitigating Intra-Speaker Variability in Diarization with Style-Controllable Speech Augmentation

Sep 18, 2025

Robust Target Speaker Diarization and Separation via Augmented Speaker Embedding Sampling

Aug 08, 2025

SpeakerLM: End-to-End Versatile Speaker Diarization and Recognition with Multimodal Large Language Models

Aug 08, 2025

MOVER: Combining Multiple Meeting Recognition Systems

Aug 07, 2025

The TEA-ASLP System for Multilingual Conversational Speech Recognition and Speech Diarization in MLC-SLM 2025 Challenge

Jul 24, 2025

M3SD: Multi-modal, Multi-scenario and Multi-language Speaker Diarization Dataset

Jun 17, 2025

Exploring Speaker Diarization with Mixture of Experts

Jun 17, 2025

SC-SOT: Conditioning the Decoder on Diarized Speaker Information for End-to-End Overlapped Speech Recognition

Jun 15, 2025

Do We Still Need Audio? Rethinking Speaker Diarization with a Text-Based Approach Using Multiple Prediction Models

Jun 12, 2025