speech


CodecFlow: Efficient Bandwidth Extension via Conditional Flow Matching in Neural Codec Latent Space

Mar 03, 2026
Via arXiv

Using Songs to Improve Kazakh Automatic Speech Recognition

Mar 03, 2026
Via arXiv

GLoRIA: Gated Low-Rank Interpretable Adaptation for Dialectal ASR

Mar 02, 2026
Via arXiv

RO-N3WS: Enhancing Generalization in Low-Resource ASR with Diverse Romanian Speech Benchmarks

Mar 02, 2026
Via arXiv

When Spoof Detectors Travel: Evaluation Across 66 Languages in the Low-Resource Language Spoofing Corpus

Mar 02, 2026
Via arXiv

What Exactly do Children Receive in Language Acquisition? A Case Study on CHILDES with Automated Detection of Filler-Gap Dependencies

Mar 02, 2026
Via arXiv

Cognitive Prosthetic: An AI-Enabled Multimodal System for Episodic Recall in Knowledge Work

Mar 02, 2026
Via arXiv

Sequence-Level Unsupervised Training in Speech Recognition: A Theoretical Study

Mar 02, 2026
Via arXiv

More Data, Fewer Diacritics: Scaling Arabic TTS

Mar 02, 2026
Via arXiv

Anatomy of the Modality Gap: Dissecting the Internal States of End-to-End Speech LLMs

Mar 02, 2026
Via arXiv