speech recognition


Speech recognition is the task of identifying words spoken aloud by analyzing the voice and the language, and transcribing those words accurately as text.

Color-based Emotion Representation for Speech Emotion Recognition

Feb 18, 2026
Via arXiv

A Mixture-of-Experts Model for Multimodal Emotion Recognition in Conversations

Feb 26, 2026
Via arXiv

Whisper: Courtside Edition Enhancing ASR Performance Through LLM-Driven Context Generation

Feb 21, 2026
Via arXiv

Voice-Driven Semantic Perception for UAV-Assisted Emergency Networks

Feb 19, 2026
Via arXiv

Joint Enhancement and Classification using Coupled Diffusion Models of Signals and Logits

Feb 17, 2026
Via arXiv

Enroll-on-Wakeup: A First Comparative Study of Target Speech Extraction for Seamless Interaction in Real Noisy Human-Machine Dialogue Scenarios

Feb 17, 2026
Via arXiv

TurkicNLP: An NLP Toolkit for Turkic Languages

Feb 22, 2026
Via arXiv

ViSpeechFormer: A Phonemic Approach for Vietnamese Automatic Speech Recognition

Feb 10, 2026
Via arXiv

Speech to Speech Synthesis for Voice Impersonation

Feb 13, 2026
Via arXiv

"Sorry, I Didn't Catch That": How Speech Models Miss What Matters Most

Feb 16, 2026
Via arXiv