speech recognition


Speech recognition is the task of identifying words spoken aloud, analyzing the voice and language, and accurately transcribing the words.

RO-N3WS: Enhancing Generalization in Low-Resource ASR with Diverse Romanian Speech Benchmarks

Add code
Mar 02, 2026
Viaarxiv icon

GLoRIA: Gated Low-Rank Interpretable Adaptation for Dialectal ASR

Add code
Mar 02, 2026
Viaarxiv icon

Learning Multiple Utterance-Level Attribute Representations with a Unified Speech Encoder

Add code
Mar 09, 2026
Viaarxiv icon

DiscoPhon: Benchmarking the Unsupervised Discovery of Phoneme Inventories With Discrete Speech Units

Add code
Mar 19, 2026
Viaarxiv icon

SpectroFusion-ViT: A Lightweight Transformer for Speech Emotion Recognition Using Harmonic Mel-Chroma Fusion

Add code
Feb 28, 2026
Viaarxiv icon

The USTC-NERCSLIP Systems for the CHiME-9 MCoRec Challenge

Add code
Mar 02, 2026
Viaarxiv icon

Polynomial Mixing for Efficient Self-supervised Speech Encoders

Add code
Feb 28, 2026
Viaarxiv icon

Dialect and Gender Bias in YouTube's Spanish Captioning System

Add code
Feb 27, 2026
Viaarxiv icon

End-to-End Simultaneous Dysarthric Speech Reconstruction with Frame-Level Adaptor and Multiple Wait-k Knowledge Distillation

Add code
Mar 02, 2026
Viaarxiv icon

Whisper-RIR-Mega: A Paired Clean-Reverberant Speech Benchmark for ASR Robustness to Room Acoustics

Add code
Feb 27, 2026
Viaarxiv icon