speech


Sequence-Level Unsupervised Training in Speech Recognition: A Theoretical Study

Mar 02, 2026
Via arXiv

End-to-End Simultaneous Dysarthric Speech Reconstruction with Frame-Level Adaptor and Multiple Wait-k Knowledge Distillation

Mar 02, 2026
Via arXiv

DARS: Dysarthria-Aware Rhythm-Style Synthesis for ASR Enhancement

Mar 02, 2026
Via arXiv

Entropy-Guided GRVQ for Ultra-Low Bitrate Neural Speech Codec

Mar 02, 2026
Via arXiv

The USTC-NERCSLIP Systems for the CHiME-9 MCoRec Challenge

Mar 02, 2026
Via arXiv

A SUPERB-Style Benchmark of Self-Supervised Speech Models for Audio Deepfake Detection

Mar 02, 2026
Via arXiv

Conversational Speech Naturalness Predictor

Mar 02, 2026
Via arXiv

Anatomy of the Modality Gap: Dissecting the Internal States of End-to-End Speech LLMs

Mar 02, 2026
Via arXiv

What Exactly do Children Receive in Language Acquisition? A Case Study on CHILDES with Automated Detection of Filler-Gap Dependencies

Mar 02, 2026
Via arXiv

More Data, Fewer Diacritics: Scaling Arabic TTS

Mar 02, 2026
Via arXiv