Speech Recognition


Speech recognition is the task of identifying words spoken aloud by analyzing the speaker's voice and language, and transcribing those words accurately as text.

StutterZero and StutterFormer: End-to-End Speech Conversion for Stuttering Transcription and Correction

Oct 21, 2025

Decoding the Ear: A Framework for Objectifying Expressiveness from Human Preference Through Efficient Alignment

Oct 23, 2025

Structured Sparsity and Weight-adaptive Pruning for Memory and Compute efficient Whisper models

Oct 14, 2025

Accent-Invariant Automatic Speech Recognition via Saliency-Driven Spectrogram Masking

Oct 10, 2025

Pseudo2Real: Task Arithmetic for Pseudo-Label Correction in Automatic Speech Recognition

Oct 09, 2025

Open ASR Leaderboard: Towards Reproducible and Transparent Multilingual and Long-Form Speech Recognition Evaluation

Oct 08, 2025

Bloodroot: When Watermarking Turns Poisonous For Stealthy Backdoor

Oct 09, 2025

Enhancing Speech Emotion Recognition via Fine-Tuning Pre-Trained Models and Hyper-Parameter Optimisation

Oct 08, 2025

UniVoice: Unifying Autoregressive ASR and Flow-Matching based TTS with Large Language Models

Oct 06, 2025

Machine Unlearning in Speech Emotion Recognition via Forget Set Alone

Oct 05, 2025