Speech Recognition


Speech recognition is the task of identifying words spoken aloud by analyzing the speaker's voice and language, and transcribing those words accurately as text.

AISHELL-5: The First Open-Source In-Car Multi-Channel Multi-Speaker Speech Dataset for Automatic Speech Diarization and Recognition

May 29, 2025

Loquacious Set: 25,000 Hours of Transcribed and Diverse English Speech Recognition Data for Research and Commercial Use

May 27, 2025

UniCUE: Unified Recognition and Generation Framework for Chinese Cued Speech Video-to-Speech Generation

Jun 04, 2025

Leveraging LLM and Self-Supervised Training Models for Speech Recognition in Chinese Dialects: A Comparative Analysis

May 27, 2025

Dynamic Context-Aware Streaming Pretrained Language Model For Inverse Text Normalization

May 30, 2025

MSDA: Combining Pseudo-labeling and Self-Supervision for Unsupervised Domain Adaptation in ASR

May 30, 2025

Mixture of LoRA Experts for Low-Resourced Multi-Accent Automatic Speech Recognition

May 26, 2025

Robust fine-tuning of speech recognition models via model merging: application to disordered speech

May 26, 2025

Leveraging LLM for Stuttering Speech: A Unified Architecture Bridging Recognition and Event Detection

May 28, 2025

Fewer Hallucinations, More Verification: A Three-Stage LLM-Based Framework for ASR Error Correction

May 30, 2025