Speech recognition


Speech recognition is the task of identifying the words in spoken audio, by analyzing both the voice signal and the language, and transcribing them accurately as text.

A Holistic Framework for Robust Bangla ASR and Speaker Diarization with Optimized VAD and CTC Alignment

Feb 26, 2026
Via arXiv

Efficient Dialect-Aware Modeling and Conditioning for Low-Resource Taiwanese Hakka Speech Processing

Feb 26, 2026
Via arXiv

Make It Hard to Hear, Easy to Learn: Long-Form Bengali ASR and Speaker Diarization via Extreme Augmentation and Perfect Alignment

Feb 26, 2026
Via arXiv

A Mixture-of-Experts Model for Multimodal Emotion Recognition in Conversations

Feb 26, 2026
Via arXiv

TG-ASR: Translation-Guided Learning with Parallel Gated Cross Attention for Low-Resource Automatic Speech Recognition

Feb 25, 2026
Via arXiv

Robust Long-Form Bangla Speech Processing: Automatic Speech Recognition and Speaker Diarization

Feb 25, 2026
Via arXiv

Mitigating Structural Noise in Low-Resource S2TT: An Optimized Cascaded Nepali-English Pipeline with Punctuation Restoration

Feb 25, 2026
Via arXiv

iMiGUE-Speech: A Spontaneous Speech Dataset for Affective Analysis

Feb 25, 2026
Via arXiv

Training-Free Intelligibility-Guided Observation Addition for Noisy ASR

Feb 24, 2026
Via arXiv

Continuous Telemonitoring of Heart Failure using Personalised Speech Dynamics

Feb 25, 2026
Via arXiv