Speech Recognition


Speech recognition is the task of identifying the words in spoken audio, analyzing the speaker's voice and language, and transcribing those words accurately.

EmoTale: An Enacted Speech-emotion Dataset in Danish

Aug 20, 2025

Revealing the Role of Audio Channels in ASR Performance Degradation

Aug 12, 2025

CarelessWhisper: Turning Whisper into a Causal Streaming Model

Aug 17, 2025

Out of the Box, into the Clinic? Evaluating State-of-the-Art ASR for Clinical Applications for Older Adults

Aug 12, 2025

FlexCTC: GPU-powered CTC Beam Decoding With Advanced Contextual Abilities

Aug 13, 2025

HARNESS: Lightweight Distilled Arabic Speech Foundation Models

Sep 18, 2025

Large Language Model Data Generation for Enhanced Intent Recognition in German Speech

Aug 08, 2025

TurboBias: Universal ASR Context-Biasing powered by GPU-accelerated Phrase-Boosting Tree

Aug 12, 2025

A Small-footprint Acoustic Echo Cancellation Solution for Mobile Full-Duplex Speech Interactions

Aug 11, 2025

Munsit at NADI 2025 Shared Task 2: Pushing the Boundaries of Multidialectal Arabic ASR with Weakly Supervised Pretraining and Continual Supervised Fine-tuning

Aug 12, 2025