speech


Comparison of sEMG Encoding Accuracy Across Speech Modes Using Articulatory and Phoneme Features

Apr 20, 2026
Via arXiv

Where Do Self-Supervised Speech Models Become Unfair?

Apr 20, 2026
Via arXiv

NIM4-ASR: Towards Efficient, Robust, and Customizable Real-Time LLM-Based ASR

Apr 20, 2026
Via arXiv

FreezeEmpath: Efficient Training for Empathetic Spoken Chatbots with Frozen LLMs

Apr 20, 2026
Via arXiv

BhashaSutra: A Task-Centric Unified Survey of Indian NLP Datasets, Corpora, and Resources

Apr 20, 2026
Via arXiv

Streaming Structured Inference with Flash-SemiCRF

Apr 20, 2026
Via arXiv

MINT-Bench: A Comprehensive Multilingual Benchmark for Instruction-Following Text-to-Speech

Apr 20, 2026
Via arXiv

Hard to Be Heard: Phoneme-Level ASR Analysis of Phonologically Complex, Low-Resource Endangered Languages

Apr 20, 2026
Via arXiv

MoVE: Translating Laughter and Tears via Mixture of Vocalization Experts in Speech-to-Speech Translation

Apr 19, 2026
Via arXiv

Prosody as Supervision: Bridging the Non-Verbal--Verbal for Multilingual Speech Emotion Recognition

Apr 19, 2026
Via arXiv