speech recognition


Speech recognition is the task of identifying words spoken aloud, analyzing the voice and language, and accurately transcribing the words.

A Sociophonetic Analysis of Racial Bias in Commercial ASR Systems Using the Pacific Northwest English Corpus

Add code
Oct 26, 2025
Figure 1 for A Sociophonetic Analysis of Racial Bias in Commercial ASR Systems Using the Pacific Northwest English Corpus
Figure 2 for A Sociophonetic Analysis of Racial Bias in Commercial ASR Systems Using the Pacific Northwest English Corpus
Figure 3 for A Sociophonetic Analysis of Racial Bias in Commercial ASR Systems Using the Pacific Northwest English Corpus
Figure 4 for A Sociophonetic Analysis of Racial Bias in Commercial ASR Systems Using the Pacific Northwest English Corpus
Viaarxiv icon

Tibetan Language and AI: A Comprehensive Survey of Resources, Methods and Challenges

Add code
Oct 22, 2025
Viaarxiv icon

StutterZero and StutterFormer: End-to-End Speech Conversion for Stuttering Transcription and Correction

Add code
Oct 21, 2025
Viaarxiv icon

MedVoiceBias: A Controlled Study of Audio LLM Behavior in Clinical Decision-Making

Add code
Nov 10, 2025
Viaarxiv icon

Accent-Invariant Automatic Speech Recognition via Saliency-Driven Spectrogram Masking

Add code
Oct 10, 2025
Figure 1 for Accent-Invariant Automatic Speech Recognition via Saliency-Driven Spectrogram Masking
Figure 2 for Accent-Invariant Automatic Speech Recognition via Saliency-Driven Spectrogram Masking
Figure 3 for Accent-Invariant Automatic Speech Recognition via Saliency-Driven Spectrogram Masking
Figure 4 for Accent-Invariant Automatic Speech Recognition via Saliency-Driven Spectrogram Masking
Viaarxiv icon

EchoMind: An Interrelated Multi-level Benchmark for Evaluating Empathetic Speech Language Models

Add code
Oct 26, 2025
Viaarxiv icon

Pseudo2Real: Task Arithmetic for Pseudo-Label Correction in Automatic Speech Recognition

Add code
Oct 09, 2025
Viaarxiv icon

Structured Sparsity and Weight-adaptive Pruning for Memory and Compute efficient Whisper models

Add code
Oct 14, 2025
Viaarxiv icon

Open ASR Leaderboard: Towards Reproducible and Transparent Multilingual and Long-Form Speech Recognition Evaluation

Add code
Oct 08, 2025
Viaarxiv icon

UniVoice: Unifying Autoregressive ASR and Flow-Matching based TTS with Large Language Models

Add code
Oct 06, 2025
Figure 1 for UniVoice: Unifying Autoregressive ASR and Flow-Matching based TTS with Large Language Models
Figure 2 for UniVoice: Unifying Autoregressive ASR and Flow-Matching based TTS with Large Language Models
Figure 3 for UniVoice: Unifying Autoregressive ASR and Flow-Matching based TTS with Large Language Models
Figure 4 for UniVoice: Unifying Autoregressive ASR and Flow-Matching based TTS with Large Language Models
Viaarxiv icon