Speech Recognition


Speech recognition is the task of identifying words spoken aloud by analyzing the speaker's voice and language, and transcribing those words accurately as text.

NasoVoce: A Nose-Mounted Low-Audibility Speech Interface for Always-Available Speech Interaction

Mar 11, 2026

Nwāchā Munā: A Devanagari Speech Corpus and Proximal Transfer Benchmark for Nepal Bhasha ASR

Mar 08, 2026

Team RAS in 10th ABAW Competition: Multimodal Valence and Arousal Estimation Approach

Mar 13, 2026

Acoustic and Semantic Modeling of Emotion in Spoken Language

Mar 10, 2026

Learning Multiple Utterance-Level Attribute Representations with a Unified Speech Encoder

Mar 09, 2026

Federated Heterogeneous Language Model Optimization for Hybrid Automatic Speech Recognition

Mar 05, 2026

Beyond Word Error Rate: Auditing the Diversity Tax in Speech Recognition through Dataset Cartography

Mar 05, 2026

Robust LLM-based Audio-Visual Speech Recognition with Sparse Modality Alignment and Visual Unit-Guided Refinement

Mar 04, 2026

Sequence-Level Unsupervised Training in Speech Recognition: A Theoretical Study

Mar 02, 2026

The Patrologia Graeca Corpus: OCR, Annotation, and Open Release of Noisy Nineteenth-Century Polytonic Greek Editions

Mar 10, 2026