Speech Recognition


Speech recognition is the task of identifying spoken words by analyzing the speaker's voice and language, then transcribing those words accurately as text.

Recognizing Co-Speech Gestures in-the-Wild

May 29, 2026

HoliTok: A Continuous Holistic Tokenization with Robust Dual Capabilities of Speech Generation and Understanding

May 28, 2026

Breaking the Script Barrier: Enabling Automatic Alignment for PoS-based ASR Error Analysis in Non-Latin Scripts

May 27, 2026

Bandwidth-Efficient and Privacy-Preserving Edge-Cloud Many-to-Many Speech Translation

May 27, 2026

FalAR: A Large-scale Speaker-Annotated European Portuguese Speech Corpus of Parliamentary Sessions

May 26, 2026

UNIQUE: Universal Top-k Sparse Attention for Training-free Inference and Sparsity-aware Training

May 26, 2026

Proactive for Uncertainty: Cause-Aware Error Diagnosis and Interactive Clarification for Spoken Dialogue Systems

May 25, 2026

Evaluation of Conversational Agents: Understanding Culture, Context and Environment in Emotion Detection

May 28, 2026

Multilingual Phonological Feature Recognition with Self-Supervised Speech Models

May 25, 2026

Hardware-Aware Federated Learning for Speech Emotion Recognition

May 23, 2026