Speech recognition


Speech recognition is the task of identifying the words spoken in an audio signal and transcribing them accurately as text.

Tagarela - A Portuguese speech dataset from podcasts

Mar 16, 2026, via arXiv

On the Emotion Understanding of Synthesized Speech

Mar 17, 2026, via arXiv

Polyglot-Lion: Efficient Multilingual ASR for Singapore via Balanced Fine-Tuning of Qwen3-ASR

Mar 17, 2026, via arXiv

Trade-offs Between Capacity and Robustness in Neural Audio Codecs for Adversarially Robust Speech Recognition

Mar 10, 2026, via arXiv

A Semi-spontaneous Dutch Speech Dataset for Speech Enhancement and Speech Recognition

Mar 10, 2026, via arXiv

Sequence-Level Unsupervised Training in Speech Recognition: A Theoretical Study

Mar 02, 2026, via arXiv

Bootstrapping Audiovisual Speech Recognition in Zero-AV-Resource Scenarios with Synthetic Visual Data

Mar 09, 2026, via arXiv

Learnable Pulse Accumulation for On-Device Speech Recognition: How Much Attention Do You Need?

Mar 11, 2026, via arXiv

VoxEmo: Benchmarking Speech Emotion Recognition with Speech LLMs

Mar 09, 2026, via arXiv

FireRedASR2S: A State-of-the-Art Industrial-Grade All-in-One Automatic Speech Recognition System

Mar 11, 2026, via arXiv