Speech Recognition


Speech recognition is the task of identifying words spoken aloud by analyzing the speaker's voice and language, and transcribing those words accurately as text.

Assessing the Impact of Speaker Identity in Speech Spoofing Detection

Feb 24, 2026

Speech Emotion Recognition Leveraging OpenAI's Whisper Representations and Attentive Pooling Methods

Feb 05, 2026

BBPE16: UTF-16-based byte-level byte-pair encoding for improved multilingual speech recognition

Feb 02, 2026

Evaluating Kubernetes Performance for GenAI Inference: From Automatic Speech Recognition to LLM Summarization

Feb 03, 2026

Equipping LLM with Directional Multi-Talker Speech Understanding Capabilities

Feb 06, 2026

RE-LLM: Refining Empathetic Speech-LLM Responses by Integrating Emotion Nuance

Feb 11, 2026

D-ORCA: Dialogue-Centric Optimization for Robust Audio-Visual Captioning

Feb 08, 2026

Adapting Where It Matters: Depth-Aware Adaptation for Efficient Multilingual Speech Recognition in Low-Resource Languages

Feb 01, 2026

Attention-weighted Centered Kernel Alignment for Knowledge Distillation in Large Audio-Language Models Applied to Speech Emotion Recognition

Feb 02, 2026

From Hallucination to Articulation: Language Model-Driven Losses for Ultra Low-Bitrate Neural Speech Coding

Feb 05, 2026