speech recognition


Speech recognition is the task of identifying the words in spoken audio by analyzing the speaker's voice and language, then transcribing those words accurately as text.
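Transcription accuracy is commonly reported as word error rate (WER): the word-level edit distance between a reference transcript and the system's output, divided by the number of reference words. The sketch below is illustrative only and is not taken from any paper listed on this page. The function name `word_error_rate` and the simple lowercase, whitespace-based tokenization are assumptions; real evaluations usually apply more careful text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: words are obtained by lowercasing and splitting on whitespace.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,                      # deletion
                    curr[j - 1] + 1,                  # insertion
                    prev[j - 1] + substitution_cost,  # match or substitution
                )
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One reference word ("the") is missing from the hypothesis.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.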

Hearing the Unspoken: Language Model Priors for Acoustic Adversarial Attacks

Jun 05, 2026
Via arXiv

TRADE: Transducer-Augmented Decoder for Speech LLM

Jun 07, 2026
Via arXiv

FiLM-Based Speaker Conditioning of a SpeechLLM for Pathological Speech Recognition

Jun 04, 2026
Via arXiv

Multi-task Learning is Not Enough: Representational Entanglement in Dual-output Second Language Speech Recognition

Jun 04, 2026
Via arXiv

Assessing True Generalisability of Audio-Visual Speech Recognisers

Jun 05, 2026
Via arXiv

M2S-AVSR: Modality-aware Multi-view Self-supervised Representation for Robust Audio-Visual Speech Recognition

Jun 04, 2026
Via arXiv

The Lipreading Gap: Do VSR Models Perceive Visual Speech Like Human Lipreaders?

Jun 05, 2026
Via arXiv

Beyond Waveform Robustness: Robust Feature-Vocoder Adversarial Attacks on Automatic Speech Recognition

Jun 04, 2026
Via arXiv

Learning Emotion-discriminative Representations for Zero-Shot Cross-lingual Speech Emotion Recognition

Jun 04, 2026
Via arXiv

Geometric Second-Order Feature Correlation Learning for Self-Supervised Speech Emotion Recognition

Jun 04, 2026
Via arXiv