Speech Recognition


Speech recognition is the task of identifying words spoken aloud by analyzing the speaker's voice and language, then accurately transcribing those words as text.

The NaijaVoices Dataset: Cultivating Large-Scale, High-Quality, Culturally-Rich Speech Data for African Languages

May 26, 2025
Via arXiv

Towards Inclusive ASR: Investigating Voice Conversion for Dysarthric Speech Recognition in Low-Resource Languages

May 20, 2025
Via arXiv

ABHINAYA -- A System for Speech Emotion Recognition In Naturalistic Conditions Challenge

May 23, 2025
Via arXiv

Languages in Multilingual Speech Foundation Models Align Both Phonetically and Semantically

May 26, 2025
Via arXiv

KIT's Low-resource Speech Translation Systems for IWSLT2025: System Enhancement with Synthetic Data and Model Regularization

May 26, 2025
Via arXiv

Transfer Learning from Visual Speech Recognition to Mouthing Recognition in German Sign Language

May 20, 2025
Via arXiv

CHSER: A Dataset and Case Study on Generative Speech Error Correction for Child ASR

May 24, 2025
Via arXiv

CosyVoice 3: Towards In-the-wild Speech Generation via Scaling-up and Post-training

May 23, 2025
Via arXiv

Word Level Timestamp Generation for Automatic Speech Recognition and Translation

May 21, 2025
Via arXiv

Building a Functional Machine Translation Corpus for Kpelle

May 24, 2025
Via arXiv