speech recognition


Speech recognition is the task of identifying words spoken aloud, analyzing the voice and language, and accurately transcribing the words.

Sequence-Level Unsupervised Training in Speech Recognition: A Theoretical Study

Add code
Mar 02, 2026
Viaarxiv icon

Beyond Word Error Rate: Auditing the Diversity Tax in Speech Recognition through Dataset Cartography

Add code
Mar 05, 2026
Viaarxiv icon

Robust LLM-based Audio-Visual Speech Recognition with Sparse Modality Alignment and Visual Unit-Guided Refinement

Add code
Mar 04, 2026
Viaarxiv icon

Multimodal Emotion Recognition via Bi-directional Cross-Attention and Temporal Modeling

Add code
Mar 12, 2026
Viaarxiv icon

G-STAR: End-to-End Global Speaker-Tracking Attributed Recognition

Add code
Mar 11, 2026
Viaarxiv icon

Learning Multiple Utterance-Level Attribute Representations with a Unified Speech Encoder

Add code
Mar 09, 2026
Viaarxiv icon

SilentWear: an Ultra-Low Power Wearable System for EMG-based Silent Speech Recognition

Add code
Mar 04, 2026
Viaarxiv icon

Acoustic and Semantic Modeling of Emotion in Spoken Language

Add code
Mar 10, 2026
Viaarxiv icon

Beyond Deep Learning: Speech Segmentation and Phone Classification with Neural Assemblies

Add code
Mar 11, 2026
Viaarxiv icon

NasoVoce: A Nose-Mounted Low-Audibility Speech Interface for Always-Available Speech Interaction

Add code
Mar 11, 2026
Viaarxiv icon