speech recognition


Speech recognition is the task of identifying words spoken aloud, analyzing the voice and language, and accurately transcribing the words.

BeaverTalk: Oregon State University's IWSLT 2025 Simultaneous Speech Translation System

Add code
May 29, 2025
Viaarxiv icon

MELT: Towards Automated Multimodal Emotion Data Annotation by Leveraging LLM Embedded Knowledge

Add code
May 30, 2025
Viaarxiv icon

Towards Pretraining Robust ASR Foundation Model with Acoustic-Aware Data Augmentation

Add code
May 27, 2025
Viaarxiv icon

NGPU-LM: GPU-Accelerated N-Gram Language Model for Context-Biasing in Greedy ASR Decoding

Add code
May 28, 2025
Viaarxiv icon

Overlap-Adaptive Hybrid Speaker Diarization and ASR-Aware Observation Addition for MISP 2025 Challenge

Add code
May 28, 2025
Viaarxiv icon

Weakly Supervised Data Refinement and Flexible Sequence Compression for Efficient Thai LLM-based ASR

Add code
May 28, 2025
Viaarxiv icon

Advancing Hearing Assessment: An ASR-Based Frequency-Specific Speech Test for Diagnosing Presbycusis

Add code
May 28, 2025
Viaarxiv icon

Rhythm Features for Speaker Identification

Add code
Jun 07, 2025
Viaarxiv icon

UniCUE: Unified Recognition and Generation Framework for Chinese Cued Speech Video-to-Speech Generation

Add code
Jun 04, 2025
Viaarxiv icon

Developing a Top-tier Framework in Naturalistic Conditions Challenge for Categorized Emotion Prediction: From Speech Foundation Models and Learning Objective to Data Augmentation and Engineering Choices

Add code
May 28, 2025
Viaarxiv icon