speech recognition


Speech recognition is the task of identifying words spoken aloud, analyzing the voice and language, and accurately transcribing the words.

More Data, Fewer Diacritics: Scaling Arabic TTS

Add code
Mar 02, 2026
Viaarxiv icon

DARS: Dysarthria-Aware Rhythm-Style Synthesis for ASR Enhancement

Add code
Mar 02, 2026
Viaarxiv icon

RO-N3WS: Enhancing Generalization in Low-Resource ASR with Diverse Romanian Speech Benchmarks

Add code
Mar 02, 2026
Viaarxiv icon

Robust Long-Form Bangla Speech Processing: Automatic Speech Recognition and Speaker Diarization

Add code
Feb 25, 2026
Viaarxiv icon

GLoRIA: Gated Low-Rank Interpretable Adaptation for Dialectal ASR

Add code
Mar 02, 2026
Viaarxiv icon

Acoustic and Semantic Modeling of Emotion in Spoken Language

Add code
Mar 10, 2026
Viaarxiv icon

SpectroFusion-ViT: A Lightweight Transformer for Speech Emotion Recognition Using Harmonic Mel-Chroma Fusion

Add code
Feb 28, 2026
Viaarxiv icon

The USTC-NERCSLIP Systems for the CHiME-9 MCoRec Challenge

Add code
Mar 02, 2026
Viaarxiv icon

G-STAR: End-to-End Global Speaker-Tracking Attributed Recognition

Add code
Mar 11, 2026
Viaarxiv icon

Pay Attention to CTC: Fast and Robust Pseudo-Labelling for Unified Speech Recognition

Add code
Feb 22, 2026
Viaarxiv icon