Speech Recognition


Speech recognition is the task of identifying words spoken aloud by analyzing the speaker's voice and language, then transcribing those words accurately as text.

AfriVox-v2: A Domain-Verticalized Benchmark for In-the-Wild African Speech Recognition

May 05, 2026
Via arXiv

When Audio-Language Models Fail to Leverage Multimodal Context for Dysarthric Speech Recognition

May 04, 2026
Via arXiv

Assessing the Impact of Noise and Speech Enhancement on the Intelligibility of Speech Codecs

May 05, 2026
Via arXiv

RAS: a Reliability Oriented Metric for Automatic Speech Recognition

Apr 28, 2026
Via arXiv

Dimensionality-Aware Anomaly Detection in Learned Representations of Self-Supervised Speech Models

May 04, 2026
Via arXiv

Audio-Visual Intelligence in Large Foundation Models

May 05, 2026
Via arXiv

MultiSense-Pneumo: A Multimodal Learning Framework for Pneumonia Screening in Resource-Constrained Settings

May 04, 2026
Via arXiv

WhisperPipe: A Resource-Efficient Streaming Architecture for Real-Time Automatic Speech Recognition

Apr 28, 2026
Via arXiv

Unrequited Emotions: Investigating the Gaps in Motivation and Practice in Speech Emotion Recognition Research

Apr 28, 2026
Via arXiv

Evaluation of Automatic Speech Recognition Using Generative Large Language Models

Apr 23, 2026
Via arXiv