Speech recognition


Speech recognition is the task of identifying words spoken aloud by analyzing the speaker's voice and language, then transcribing those words accurately as text.

Whisper: Courtside Edition Enhancing ASR Performance Through LLM-Driven Context Generation

Feb 21, 2026
Via arXiv

The Patrologia Graeca Corpus: OCR, Annotation, and Open Release of Noisy Nineteenth-Century Polytonic Greek Editions

Mar 10, 2026
Via arXiv

A Mixture-of-Experts Model for Multimodal Emotion Recognition in Conversations

Feb 26, 2026
Via arXiv

Speech to Speech Synthesis for Voice Impersonation

Feb 13, 2026
Via arXiv

Voice-Driven Semantic Perception for UAV-Assisted Emergency Networks

Feb 19, 2026
Via arXiv

Joint Enhancement and Classification using Coupled Diffusion Models of Signals and Logits

Feb 17, 2026
Via arXiv

Enroll-on-Wakeup: A First Comparative Study of Target Speech Extraction for Seamless Interaction in Real Noisy Human-Machine Dialogue Scenarios

Feb 17, 2026
Via arXiv

"Sorry, I Didn't Catch That": How Speech Models Miss What Matters Most

Feb 16, 2026
Via arXiv

Team RAS in 10th ABAW Competition: Multimodal Valence and Arousal Estimation Approach

Mar 13, 2026
Via arXiv

CLAP-Based Automatic Word Naming Recognition in Post-Stroke Aphasia

Feb 16, 2026
Via arXiv