Speech recognition


Speech recognition is the task of identifying words spoken aloud by analyzing the speaker's voice and language, then transcribing those words accurately.

NIM4-ASR: Towards Efficient, Robust, and Customizable Real-Time LLM-Based ASR

Apr 20, 2026

FreezeEmpath: Efficient Training for Empathetic Spoken Chatbots with Frozen LLMs

Apr 20, 2026

Prosody as Supervision: Bridging the Non-Verbal–Verbal for Multilingual Speech Emotion Recognition

Apr 19, 2026

Diffusion Language Models for Speech Recognition

Apr 15, 2026

Pushing the Limits of On-Device Streaming ASR: A Compact, High-Accuracy English Model for Low-Latency Inference

Apr 16, 2026

Interactive ASR: Towards Human-Like Interaction and Semantic Coherence Evaluation for Agentic Speech Recognition

Apr 14, 2026

Speaker Attributed Automatic Speech Recognition Using Speech Aware LLMs

Apr 13, 2026

Teaching the Teachers: Boosting unsupervised domain adaptation in speech recognition by ensemble update

Apr 13, 2026

BlasBench: An Open Benchmark for Irish Speech Recognition

Apr 12, 2026

Empowering Video Translation using Multimodal Large Language Models

Apr 13, 2026