Speech recognition


Speech recognition is the task of identifying spoken words by analyzing the speaker's voice and language, and transcribing them accurately as text.

WAXAL: A Large-Scale Multilingual African Language Speech Corpus

Feb 02, 2026

Mići Princ -- A Little Boy Teaching Speech Technologies the Chakavian Dialect

Feb 03, 2026

VowelPrompt: Hearing Speech Emotions from Text via Vowel-level Prosodic Augmentation

Feb 06, 2026

Multilingual Extraction and Recognition of Implicit Discourse Relations in Speech and Text

Feb 04, 2026

Decoding Ambiguous Emotions with Test-Time Scaling in Audio-Language Models

Feb 01, 2026

Semantics-Aware Generative Latent Data Augmentation for Learning in Low-Resource Domains

Feb 02, 2026

VocalNet-MDM: Accelerating Streaming Speech LLM via Self-Distilled Masked Diffusion Modeling

Feb 09, 2026

Uncertainty-Aware Multimodal Emotion Recognition through Dirichlet Parameterization

Feb 09, 2026

ADEPT: RL-Aligned Agentic Decoding of Emotion via Evidence Probing Tools -- From Consensus Learning to Ambiguity-Driven Emotion Reasoning

Feb 13, 2026

DementiaBank-Emotion: A Multi-Rater Emotion Annotation Corpus for Alzheimer's Disease Speech (Version 1.0)

Feb 04, 2026