Speech Recognition


Speech recognition is the task of identifying spoken words by analyzing the speaker's voice and language, then transcribing those words accurately as text.

A Comprehensive Analysis of Tokenization and Self-Supervised Learning in End-to-End Automatic Speech Recognition applied on French Language

May 05, 2026
Via arXiv

A Paradigm for Interpreting Metrics and Identifying Critical Errors in Automatic Speech Recognition

May 05, 2026
Via arXiv

AfriVox-v2: A Domain-Verticalized Benchmark for In-the-Wild African Speech Recognition

May 05, 2026
Via arXiv

Assessing the Impact of Noise and Speech Enhancement on the Intelligibility of Speech Codecs

May 05, 2026
Via arXiv

When Audio-Language Models Fail to Leverage Multimodal Context for Dysarthric Speech Recognition

May 04, 2026
Via arXiv

Audio-Visual Intelligence in Large Foundation Models

May 05, 2026
Via arXiv

Dimensionality-Aware Anomaly Detection in Learned Representations of Self-Supervised Speech Models

May 04, 2026
Via arXiv

MultiSense-Pneumo: A Multimodal Learning Framework for Pneumonia Screening in Resource-Constrained Settings

May 04, 2026
Via arXiv

RAS: a Reliability Oriented Metric for Automatic Speech Recognition

Apr 28, 2026
Via arXiv

WhisperPipe: A Resource-Efficient Streaming Architecture for Real-Time Automatic Speech Recognition

Apr 28, 2026
Via arXiv