speech recognition


Speech recognition is the task of identifying words spoken aloud, analyzing the voice and language, and accurately transcribing the words.

asr_eval: Algorithms and tools for multi-reference and streaming speech recognition evaluation

Add code
Jan 28, 2026
Viaarxiv icon

Streaming Speech Recognition with Decoder-Only Large Language Models and Latency Optimization

Add code
Jan 30, 2026
Viaarxiv icon

VocalNet-MDM: Accelerating Streaming Speech LLM via Self-Distilled Masked Diffusion Modeling

Add code
Feb 09, 2026
Viaarxiv icon

Mići Princ -- A Little Boy Teaching Speech Technologies the Chakavian Dialect

Add code
Feb 03, 2026
Viaarxiv icon

ADEPT: RL-Aligned Agentic Decoding of Emotion via Evidence Probing Tools -- From Consensus Learning to Ambiguity-Driven Emotion Reasoning

Add code
Feb 13, 2026
Viaarxiv icon

Multilingual Extraction and Recognition of Implicit Discourse Relations in Speech and Text

Add code
Feb 04, 2026
Viaarxiv icon

WAXAL: A Large-Scale Multilingual African Language Speech Corpus

Add code
Feb 02, 2026
Viaarxiv icon

MedSpeak: A Knowledge Graph-Aided ASR Error Correction Framework for Spoken Medical QA

Add code
Feb 01, 2026
Viaarxiv icon

Towards Robust Dysarthric Speech Recognition: LLM-Agent Post-ASR Correction Beyond WER

Add code
Jan 29, 2026
Viaarxiv icon

Reducing Prompt Sensitivity in LLM-based Speech Recognition Through Learnable Projection

Add code
Jan 28, 2026
Viaarxiv icon