Speech recognition


Speech recognition is the task of identifying the words in spoken audio, using both acoustic and linguistic cues, and transcribing them accurately as text.

Prototype-Based Disentanglement for Controllable Dysarthric Speech Synthesis

Feb 09, 2026
Via arXiv

Enabling Automatic Disordered Speech Recognition: An Impaired Speech Dataset in the Akan Language

Feb 05, 2026
Via arXiv

D-ORCA: Dialogue-Centric Optimization for Robust Audio-Visual Captioning

Feb 08, 2026
Via arXiv

B-GRPO: Unsupervised Speech Emotion Recognition based on Batched-Group Relative Policy Optimization

Feb 06, 2026
Via arXiv

ADEPT: RL-Aligned Agentic Decoding of Emotion via Evidence Probing Tools -- From Consensus Learning to Ambiguity-Driven Emotion Reasoning

Feb 13, 2026
Via arXiv

Frontend Token Enhancement for Token-Based Speech Recognition

Feb 04, 2026
Via arXiv

Equipping LLM with Directional Multi-Talker Speech Understanding Capabilities

Feb 06, 2026
Via arXiv

Speech Emotion Recognition Leveraging OpenAI's Whisper Representations and Attentive Pooling Methods

Feb 05, 2026
Via arXiv

Beyond the Utterance: An Empirical Study of Very Long Context Speech Recognition

Feb 04, 2026
Via arXiv

Universal Robust Speech Adaptation for Cross-Domain Speech Recognition and Enhancement

Feb 04, 2026
Via arXiv