Speech recognition


Speech recognition is the task of identifying words spoken aloud by analyzing the speaker's voice and language, and transcribing those words accurately as text.

Revealing the Role of Audio Channels in ASR Performance Degradation

Aug 12, 2025

Lessons Learnt: Revisit Key Training Strategies for Effective Speech Emotion Recognition in the Wild

Aug 10, 2025

Continuous Saudi Sign Language Recognition: A Vision Transformer Approach

Sep 03, 2025

Depression diagnosis from patient interviews using multimodal machine learning

Aug 26, 2025

FlexCTC: GPU-powered CTC Beam Decoding With Advanced Contextual Abilities

Aug 13, 2025

Out of the Box, into the Clinic? Evaluating State-of-the-Art ASR for Clinical Applications for Older Adults

Aug 12, 2025

TurboBias: Universal ASR Context-Biasing powered by GPU-accelerated Phrase-Boosting Tree

Aug 12, 2025

Large Language Model Data Generation for Enhanced Intent Recognition in German Speech

Aug 08, 2025

A Small-footprint Acoustic Echo Cancellation Solution for Mobile Full-Duplex Speech Interactions

Aug 11, 2025

Munsit at NADI 2025 Shared Task 2: Pushing the Boundaries of Multidialectal Arabic ASR with Weakly Supervised Pretraining and Continual Supervised Fine-tuning

Aug 12, 2025