Speech recognition


Speech recognition is the task of identifying words spoken aloud by analyzing the speaker's voice and language, then accurately transcribing those words as text.

Improving Multilingual Speech Models on ML-SUPERB 2.0: Fine-tuning with Data Augmentation and LID-Aware CTC

May 30, 2025

Beyond Manual Transcripts: The Potential of Automated Speech Recognition Errors in Improving Alzheimer's Disease Detection

May 26, 2025

Pseudo Labels-based Neural Speech Enhancement for the AVSR Task in the MISP-Meeting Challenge

May 30, 2025

Towards LLM-Empowered Fine-Grained Speech Descriptors for Explainable Emotion Recognition

May 29, 2025

MELT: Towards Automated Multimodal Emotion Data Annotation by Leveraging LLM Embedded Knowledge

May 30, 2025

BeaverTalk: Oregon State University's IWSLT 2025 Simultaneous Speech Translation System

May 29, 2025

EmoSphere-SER: Enhancing Speech Emotion Recognition Through Spherical Representation with Auxiliary Classification

May 26, 2025

PSRB: A Comprehensive Benchmark for Evaluating Persian ASR Systems

May 27, 2025

WhisperD: Dementia Speech Recognition and Filler Word Detection with Whisper

May 25, 2025

Topological Deep Learning for Speech Data

May 27, 2025