Speech Recognition


Speech recognition is the task of identifying words spoken aloud by analyzing the speaker's voice and language, and transcribing those words accurately as text.

CAMEO: Collection of Multilingual Emotional Speech Corpora

May 16, 2025, via arXiv

Cross-modal Knowledge Transfer Learning as Graph Matching Based on Optimal Transport for ASR

May 19, 2025, via arXiv

Inclusivity of AI Speech in Healthcare: A Decade Look Back

May 15, 2025, via arXiv

LipDiffuser: Lip-to-Speech Generation with Conditional Diffusion Models

May 16, 2025, via arXiv

On Multilingual Encoder Language Model Compression for Low-Resource Languages

May 22, 2025, via arXiv

LegoSLM: Connecting LLM with Speech Encoder using CTC Posteriors

May 16, 2025, via arXiv

FeatureSense: Protecting Speaker Attributes in Always-On Audio Sensing System

May 30, 2025, via arXiv

Multi-Stage Speaker Diarization for Noisy Classrooms

May 16, 2025, via arXiv

Robust Speech Recognition with Schrödinger Bridge-Based Speech Enhancement

May 07, 2025, via arXiv

Pretraining Multi-Speaker Identification for Neural Speaker Diarization

May 30, 2025, via arXiv