Speech Recognition


Speech recognition is the task of identifying words spoken aloud by analyzing the speaker's voice and language, and transcribing those words accurately as text.

Diarization-Aware Multi-Speaker Automatic Speech Recognition via Large Language Models

Jun 06, 2025

LLM-based phoneme-to-grapheme for phoneme-based speech recognition

Jun 05, 2025

Technical Report: A Practical Guide to Kaldi ASR Optimization

Jun 08, 2025

EMO-Debias: Benchmarking Gender Debiasing Techniques in Multi-Label Speech Emotion Recognition

Jun 05, 2025

Joint ASR and Speaker Role Tagging with Serialized Output Training

Jun 12, 2025

Improving Named Entity Transcription with Contextual LLM-based Revision

Jun 12, 2025

Hybrid Deep Learning and Signal Processing for Arabic Dialect Recognition in Low-Resource Settings

Jun 26, 2025

CO-VADA: A Confidence-Oriented Voice Augmentation Debiasing Approach for Fair Speech Emotion Recognition

Jun 06, 2025

Advances in Small-Footprint Keyword Spotting: A Comprehensive Review of Efficient Models and Algorithms

Jun 12, 2025

(SimPhon Speech Test): A Data-Driven Method for In Silico Design and Validation of a Phonetically Balanced Speech Test

Jun 13, 2025