speech recognition


Speech recognition is the task of identifying words spoken aloud, analyzing the voice and language, and accurately transcribing the words.

Speaker-Aware Simulation Improves Conversational Speech Recognition

Add code
Feb 04, 2026
Viaarxiv icon

Evaluating Kubernetes Performance for GenAI Inference: From Automatic Speech Recognition to LLM Summarization

Add code
Feb 03, 2026
Viaarxiv icon

Mixture-of-Experts with Intermediate CTC Supervision for Accented Speech Recognition

Add code
Feb 02, 2026
Viaarxiv icon

VocalNet-MDM: Accelerating Streaming Speech LLM via Self-Distilled Masked Diffusion Modeling

Add code
Feb 09, 2026
Viaarxiv icon

BBPE16: UTF-16-based byte-level byte-pair encoding for improved multilingual speech recognition

Add code
Feb 02, 2026
Viaarxiv icon

From Hallucination to Articulation: Language Model-Driven Losses for Ultra Low-Bitrate Neural Speech Coding

Add code
Feb 05, 2026
Viaarxiv icon

VowelPrompt: Hearing Speech Emotions from Text via Vowel-level Prosodic Augmentation

Add code
Feb 06, 2026
Viaarxiv icon

Uncertainty-Aware Multimodal Emotion Recognition through Dirichlet Parameterization

Add code
Feb 09, 2026
Viaarxiv icon

Attention-weighted Centered Kernel Alignment for Knowledge Distillation in Large Audio-Language Models Applied to Speech Emotion Recognition

Add code
Feb 02, 2026
Viaarxiv icon

Adapting Where It Matters: Depth-Aware Adaptation for Efficient Multilingual Speech Recognition in Low-Resource Languages

Add code
Feb 01, 2026
Viaarxiv icon