Speech recognition


Speech recognition is the task of identifying words spoken aloud by analyzing the speaker's voice and language, and transcribing those words accurately as text.

DeRAGEC: Denoising Named Entity Candidates with Synthetic Rationale for ASR Error Correction

Jun 09, 2025
Via arXiv

Transcript-Prompted Whisper with Dictionary-Enhanced Decoding for Japanese Speech Annotation

Jun 09, 2025
Via arXiv

Bridging the Modality Gap: Softly Discretizing Audio Representation for LLM-based Automatic Speech Recognition

Jun 06, 2025
Via arXiv

ViCocktail: Automated Multi-Modal Data Collection for Vietnamese Audio-Visual Speech Recognition

Jun 05, 2025
Via arXiv

Diarization-Aware Multi-Speaker Automatic Speech Recognition via Large Language Models

Jun 06, 2025
Via arXiv

Uncovering the Functional Roles of Nonlinearity in Memory

Jun 09, 2025
Via arXiv

LLM-based phoneme-to-grapheme for phoneme-based speech recognition

Jun 05, 2025
Via arXiv

A Comparative Evaluation of Deep Learning Models for Speech Enhancement in Real-World Noisy Environments

Jun 17, 2025
Via arXiv

EMO-Debias: Benchmarking Gender Debiasing Techniques in Multi-Label Speech Emotion Recognition

Jun 05, 2025
Via arXiv

CO-VADA: A Confidence-Oriented Voice Augmentation Debiasing Approach for Fair Speech Emotion Recognition

Jun 06, 2025
Via arXiv