Speech recognition


Speech recognition is the task of identifying the words in spoken audio by analyzing the speaker's voice and language, and transcribing them accurately as text.

SpeakerCard-1M: An Evidence-Grounded Speaker Card Corpus for In-the-Wild Speaker Verification

Jun 03, 2026
Via arXiv

Bandwidth-Efficient and Privacy-Preserving Edge-Cloud Many-to-Many Speech Translation

May 27, 2026
Via arXiv

Beyond the Mouth: Upper-Face Affective Cues in Audiovisual Sentence Recognition under Acoustic Uncertainty

May 30, 2026
Via arXiv

FalAR: A Large-scale Speaker-Annotated European Portuguese Speech Corpus of Parliamentary Sessions

May 26, 2026
Via arXiv

HoliTok: A Continuous Holistic Tokenization with Robust Dual Capabilities of Speech Generation and Understanding

May 28, 2026
Via arXiv

UNIQUE: Universal Top-k Sparse Attention for Training-free Inference and Sparsity-aware Training

May 26, 2026
Via arXiv

Proactive for Uncertainty: Cause-Aware Error Diagnosis and Interactive Clarification for Spoken Dialogue Systems

May 25, 2026
Via arXiv

Convex Low-resource Accent-Robust Language Detection in Speech Recognition

May 22, 2026
Via arXiv

Hardware-Aware Federated Learning for Speech Emotion Recognition

May 23, 2026
Via arXiv

PolySpeech-100: A Large-Scale Benchmark for Speech Understanding Across 100+ Languages and Dialects

May 31, 2026
Via arXiv