Speech recognition


Speech recognition is the task of identifying words spoken aloud by analyzing the speaker's voice and language, and transcribing those words accurately as text.

Talking to Robots: A Practical Examination of Speech Foundation Models for HRI Applications

Aug 25, 2025
Via arXiv

Zero-shot Context Biasing with Trie-based Decoding using Synthetic Multi-Pronunciation

Aug 25, 2025
Via arXiv

Speech-Based Depressive Mood Detection in the Presence of Multiple Sclerosis: A Cross-Corpus and Cross-Lingual Study

Aug 25, 2025
Via arXiv

Depression diagnosis from patient interviews using multimodal machine learning

Aug 26, 2025
Via arXiv

UniCoM: A Universal Code-Switching Speech Generator

Aug 21, 2025
Via arXiv

HuBERT-VIC: Improving Noise-Robust Automatic Speech Recognition of Speech Foundation Model via Variance-Invariance-Covariance Regularization

Aug 17, 2025
Via arXiv

EmoTale: An Enacted Speech-emotion Dataset in Danish

Aug 20, 2025
Via arXiv

Scene-Aware Vectorized Memory Multi-Agent Framework with Cross-Modal Differentiated Quantization VLMs for Visually Impaired Assistance

Aug 25, 2025
Via arXiv

What do Speech Foundation Models Learn? Analysis and Applications

Aug 17, 2025
Via arXiv

CarelessWhisper: Turning Whisper into a Causal Streaming Model

Aug 17, 2025
Via arXiv