Speech Recognition


Speech recognition is the task of identifying words spoken aloud by analyzing the speaker's voice and language, and transcribing those words accurately as text.

Vietnamese Automatic Speech Recognition: A Revisit

Mar 16, 2026
Via arXiv

How Attention Shapes Emotion: A Comparative Study of Attention Mechanisms for Speech Emotion Recognition

Mar 16, 2026
Via arXiv

RECOVER: Robust Entity Correction via agentic Orchestration of hypothesis Variants for Evidence-based Recovery

Mar 17, 2026
Via arXiv

Is Semi-Automatic Transcription Useful in Corpus Creation? Preliminary Considerations on the KIParla Corpus

Mar 17, 2026
Via arXiv

On the Emotion Understanding of Synthesized Speech

Mar 17, 2026
Via arXiv

Polyglot-Lion: Efficient Multilingual ASR for Singapore via Balanced Fine-Tuning of Qwen3-ASR

Mar 17, 2026
Via arXiv

LLMs and Speech: Integration vs. Combination

Mar 16, 2026
Via arXiv

Tagarela - A Portuguese speech dataset from podcasts

Mar 16, 2026
Via arXiv

DiscoPhon: Benchmarking the Unsupervised Discovery of Phoneme Inventories With Discrete Speech Units

Mar 19, 2026
Via arXiv

Dr. SHAP-AV: Decoding Relative Modality Contributions via Shapley Attribution in Audio-Visual Speech Recognition

Mar 12, 2026
Via arXiv