Speaker Diarization


Speaker diarization is the process of segmenting and clustering speech signals to identify different speakers in an audio recording.
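As a rough illustration of the "segment, then cluster" idea in the definition above, here is a minimal sketch in plain Python. It assumes the audio has already been cut into segments and that each segment comes with a speaker embedding; the `diarize` function, the toy two-dimensional embeddings, and the 0.8 cosine threshold are all hypothetical and chosen only for illustration. The clustering is a simple greedy, online scheme, not the method of any paper listed below.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def diarize(segments, threshold=0.8):
    """Assign a speaker index to each (start, end, embedding) segment.

    Greedy online clustering (an illustrative assumption): a segment joins
    the most similar existing cluster if the similarity reaches `threshold`,
    otherwise it starts a new cluster. Contiguous segments with the same
    speaker are merged into one turn.
    """
    centroids = []  # running sum of embeddings per speaker cluster
    turns = []      # list of (start, end, speaker_index)
    for start, end, emb in segments:
        best, best_sim = None, threshold
        for idx, centroid in enumerate(centroids):
            sim = cosine(emb, centroid)
            if sim >= best_sim:
                best, best_sim = idx, sim
        if best is None:
            # No cluster is similar enough: treat this as a new speaker.
            centroids.append(list(emb))
            best = len(centroids) - 1
        else:
            # Update the matched cluster with the new embedding.
            centroids[best] = [c + e for c, e in zip(centroids[best], emb)]
        if turns and turns[-1][2] == best and turns[-1][1] == start:
            # Same speaker continues without a gap: extend the previous turn.
            turns[-1] = (turns[-1][0], end, best)
        else:
            turns.append((start, end, best))
    return turns

# Toy input: times in seconds, made-up 2-D embeddings.
segments = [
    (0.0, 1.5, [1.0, 0.1]),
    (1.5, 3.0, [0.9, 0.2]),
    (3.0, 4.0, [0.1, 1.0]),
    (4.0, 5.5, [0.95, 0.05]),
]
result = diarize(segments)
print(result)  # [(0.0, 3.0, 0), (3.0, 4.0, 1), (4.0, 5.5, 0)]
```

The output reads as "who spoke when": speaker 0 from 0.0 to 3.0 s, speaker 1 from 3.0 to 4.0 s, then speaker 0 again. A real system would also have to handle overlapping speech, which this sketch ignores.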

Diarization-Aware Multi-Speaker Automatic Speech Recognition via Large Language Models

Jun 06, 2025
Via arXiv

Improving Neural Diarization through Speaker Attribute Attractors and Local Dependency Modeling

Jun 05, 2025
Via arXiv

Pretraining Multi-Speaker Identification for Neural Speaker Diarization

May 30, 2025
Via arXiv

Fine-tune Before Structured Pruning: Towards Compact and Accurate Self-Supervised Models for Speaker Diarization

May 30, 2025
Via arXiv

Overlap-Adaptive Hybrid Speaker Diarization and ASR-Aware Observation Addition for MISP 2025 Challenge

May 28, 2025
Via arXiv

AISHELL-5: The First Open-Source In-Car Multi-Channel Multi-Speaker Speech Dataset for Automatic Speech Diarization and Recognition

May 29, 2025
Via arXiv

Multi-Channel Sequence-to-Sequence Neural Diarization: Experimental Results for The MISP 2025 Challenge

May 22, 2025
Via arXiv

VoxRAG: A Step Toward Transcription-Free RAG Systems in Spoken Question Answering

May 22, 2025
Via arXiv

HPP-Voice: A Large-Scale Evaluation of Speech Embeddings for Multi-Phenotypic Classification

May 22, 2025
Via arXiv

Multi-Stage Speaker Diarization for Noisy Classrooms

May 16, 2025
Via arXiv