Speech Recognition


Speech recognition is the task of identifying words spoken aloud by analyzing the speaker's voice and language, and transcribing those words accurately as text.

Beyond Waveform Robustness: Robust Feature-Vocoder Adversarial Attacks on Automatic Speech Recognition

Jun 04, 2026
Via arXiv

Learning Emotion-discriminative Representations for Zero-Shot Cross-lingual Speech Emotion Recognition

Jun 04, 2026
Via arXiv

Geometric Second-Order Feature Correlation Learning for Self-Supervised Speech Emotion Recognition

Jun 04, 2026
Via arXiv

Beyond WER: A Paired Acoustic Stress Test for Ambient Clinical Scribes

Jun 04, 2026
Via arXiv

Towards Truly Multilingual ASR: Generalizing Code-Switching ASR to Unseen Language Pairs

Jun 04, 2026
Via arXiv

Age-Aware Adapter Tuning for Children's Speech Recognition

Jun 03, 2026
Via arXiv

CoSTA: Cognitive-State-Conditioned TTS Data Augmentation Using ASR Transcripts for Alzheimer's Disease Detection

Jun 04, 2026
Via arXiv

Speaker-Invariant Representation Learning for Spoofing Detection via Gradient Reversal and A Variational Information Bottleneck

Jun 07, 2026
Via arXiv

Read What You Hear: Reference-Free Hypotheses Evaluation with Acoustic Discrepancy

Jun 03, 2026
Via arXiv

Test-Time Compute Scaling for ASR with Depth-Conditioned Looped Transformers

Jun 03, 2026
Via arXiv