Speech Recognition


Speech recognition is the task of identifying words spoken aloud by analyzing the speaker's voice and language, and transcribing those words accurately as text.

Diffusion Language Models for Speech Recognition

Apr 15, 2026
Via arXiv

Reducing the Offline-Streaming Gap for Unified ASR Transducer with Consistency Regularization

Apr 21, 2026
Via arXiv

Detecting Hallucinations in SpeechLLMs at Inference Time Using Attention Maps

Apr 21, 2026
Via arXiv

ATIR: Towards Audio-Text Interleaved Contextual Retrieval

Apr 22, 2026
Via arXiv

Speaker Attributed Automatic Speech Recognition Using Speech Aware LLMS

Apr 13, 2026
Via arXiv

Where Do Self-Supervised Speech Models Become Unfair?

Apr 20, 2026
Via arXiv

Prosody as Supervision: Bridging the Non-Verbal–Verbal for Multilingual Speech Emotion Recognition

Apr 19, 2026
Via arXiv

"This Wasn't Made for Me": Recentering User Experience and Emotional Impact in the Evaluation of ASR Bias

Apr 22, 2026
Via arXiv

Teaching the Teachers: Boosting unsupervised domain adaptation in speech recognition by ensemble update

Apr 13, 2026
Via arXiv

UAF: A Unified Audio Front-end LLM for Full-Duplex Speech Interaction

Apr 21, 2026
Via arXiv