Speech recognition


Speech recognition is the task of identifying words spoken aloud by analyzing the voice signal and the language, and transcribing those words accurately as text.

NüshuVoice: Reviving the Voice of Endangered Nüshu with Pitch-Aware Text-to-Speech

Jun 08, 2026
Via arXiv

Hearing the Unspoken: Language Model Priors for Acoustic Adversarial Attacks

Jun 05, 2026
Via arXiv

Beyond Waveform Robustness: Robust Feature-Vocoder Adversarial Attacks on Automatic Speech Recognition

Jun 04, 2026
Via arXiv

Assessing True Generalisability of Audio-Visual Speech Recognisers

Jun 05, 2026
Via arXiv

The Lipreading Gap: Do VSR Models Perceive Visual Speech Like Human Lipreaders?

Jun 05, 2026
Via arXiv

Learning Emotion-discriminative Representations for Zero-Shot Cross-lingual Speech Emotion Recognition

Jun 04, 2026
Via arXiv

Geometric Second-Order Feature Correlation Learning for Self-Supervised Speech Emotion Recognition

Jun 04, 2026
Via arXiv

Age-Aware Adapter Tuning for Children's Speech Recognition

Jun 03, 2026
Via arXiv

Beyond WER: A Paired Acoustic Stress Test for Ambient Clinical Scribes

Jun 04, 2026
Via arXiv

Towards Truly Multilingual ASR: Generalizing Code-Switching ASR to Unseen Language Pairs

Jun 04, 2026
Via arXiv