speech recognition


Speech recognition is the task of identifying words spoken aloud, analyzing the voice and language, and accurately transcribing the words.

Structured Sparsity and Weight-adaptive Pruning for Memory and Compute efficient Whisper models

Add code
Oct 14, 2025
Viaarxiv icon

Accent-Invariant Automatic Speech Recognition via Saliency-Driven Spectrogram Masking

Add code
Oct 10, 2025
Viaarxiv icon

Pseudo2Real: Task Arithmetic for Pseudo-Label Correction in Automatic Speech Recognition

Add code
Oct 09, 2025
Viaarxiv icon

Bloodroot: When Watermarking Turns Poisonous For Stealthy Backdoor

Add code
Oct 09, 2025
Viaarxiv icon

Open ASR Leaderboard: Towards Reproducible and Transparent Multilingual and Long-Form Speech Recognition Evaluation

Add code
Oct 08, 2025
Viaarxiv icon

Enhancing Speech Emotion Recognition via Fine-Tuning Pre-Trained Models and Hyper-Parameter Optimisation

Add code
Oct 08, 2025
Viaarxiv icon

How much speech data is necessary for ASR in African languages? An evaluation of data scaling in Kinyarwanda and Kikuyu

Add code
Oct 08, 2025
Viaarxiv icon

A Study of the Removability of Speaker-Adversarial Perturbations

Add code
Oct 10, 2025
Viaarxiv icon

UniVoice: Unifying Autoregressive ASR and Flow-Matching based TTS with Large Language Models

Add code
Oct 06, 2025
Viaarxiv icon

CS3-Bench: Evaluating and Enhancing Speech-to-Speech LLMs for Mandarin-English Code-Switching

Add code
Oct 09, 2025
Viaarxiv icon