Speech Recognition


Speech recognition is the task of identifying words spoken aloud by analyzing the voice and language, and transcribing them accurately as text.

Reducing the Offline-Streaming Gap for Unified ASR Transducer with Consistency Regularization

Apr 21, 2026
Via arXiv

Detecting Hallucinations in SpeechLLMs at Inference Time Using Attention Maps

Apr 21, 2026
Via arXiv

UAF: A Unified Audio Front-end LLM for Full-Duplex Speech Interaction

Apr 21, 2026
Via arXiv

Where Do Self-Supervised Speech Models Become Unfair?

Apr 20, 2026
Via arXiv

Hard to Be Heard: Phoneme-Level ASR Analysis of Phonologically Complex, Low-Resource Endangered Languages

Apr 20, 2026
Via arXiv

NIM4-ASR: Towards Efficient, Robust, and Customizable Real-Time LLM-Based ASR

Apr 20, 2026
Via arXiv

Prosody as Supervision: Bridging the Non-Verbal–Verbal for Multilingual Speech Emotion Recognition

Apr 19, 2026
Via arXiv

FreezeEmpath: Efficient Training for Empathetic Spoken Chatbots with Frozen LLMs

Apr 20, 2026
Via arXiv

Diffusion Language Models for Speech Recognition

Apr 15, 2026
Via arXiv

Speaker Attributed Automatic Speech Recognition Using Speech Aware LLMS

Apr 13, 2026
Via arXiv