Speech Recognition


Speech recognition is the task of identifying spoken words by analyzing the speaker's voice and language, then transcribing those words accurately as text.

TG-ASR: Translation-Guided Learning with Parallel Gated Cross Attention for Low-Resource Automatic Speech Recognition

Feb 25, 2026

Efficient Dialect-Aware Modeling and Conditioning for Low-Resource Taiwanese Hakka Speech Processing

Feb 26, 2026

Make It Hard to Hear, Easy to Learn: Long-Form Bengali ASR and Speaker Diarization via Extreme Augmentation and Perfect Alignment

Feb 26, 2026

Robust Long-Form Bangla Speech Processing: Automatic Speech Recognition and Speaker Diarization

Feb 25, 2026

A Mixture-of-Experts Model for Multimodal Emotion Recognition in Conversations

Feb 26, 2026

Mitigating Structural Noise in Low-Resource S2TT: An Optimized Cascaded Nepali-English Pipeline with Punctuation Restoration

Feb 25, 2026

Training-Free Intelligibility-Guided Observation Addition for Noisy ASR

Feb 24, 2026

iMiGUE-Speech: A Spontaneous Speech Dataset for Affective Analysis

Feb 25, 2026

Pay Attention to CTC: Fast and Robust Pseudo-Labelling for Unified Speech Recognition

Feb 22, 2026

An Approach to Combining Video and Speech with Large Language Models in Human-Robot Interaction

Feb 23, 2026