Speech Recognition


Speech recognition is the task of identifying words spoken aloud by analyzing the speaker's voice and language, and transcribing those words accurately as text.

Prosody as Supervision: Bridging the Non-Verbal--Verbal for Multilingual Speech Emotion Recognition

Apr 19, 2026, via arXiv

UAF: A Unified Audio Front-end LLM for Full-Duplex Speech Interaction

Apr 21, 2026, via arXiv

Speaker Attributed Automatic Speech Recognition Using Speech Aware LLMs

Apr 13, 2026, via arXiv

Hard to Be Heard: Phoneme-Level ASR Analysis of Phonologically Complex, Low-Resource Endangered Languages

Apr 20, 2026, via arXiv

Teaching the Teachers: Boosting unsupervised domain adaptation in speech recognition by ensemble update

Apr 13, 2026, via arXiv

Interactive ASR: Towards Human-Like Interaction and Semantic Coherence Evaluation for Agentic Speech Recognition

Apr 14, 2026, via arXiv

NIM4-ASR: Towards Efficient, Robust, and Customizable Real-Time LLM-Based ASR

Apr 20, 2026, via arXiv

FreezeEmpath: Efficient Training for Empathetic Spoken Chatbots with Frozen LLMs

Apr 20, 2026, via arXiv

BlasBench: An Open Benchmark for Irish Speech Recognition

Apr 12, 2026, via arXiv

Pushing the Limits of On-Device Streaming ASR: A Compact, High-Accuracy English Model for Low-Latency Inference

Apr 16, 2026, via arXiv