speech


Enhancing Noise Robustness of Parkinson's Disease Telemonitoring via Contrastive Feature Augmentation

Oct 02, 2025

Emotional Text-To-Speech Based on Mutual-Information-Guided Emotion-Timbre Disentanglement

Oct 02, 2025

Tenyidie Syllabification corpus creation and deep learning applications

Oct 02, 2025

MOSS-Speech: Towards True Speech-to-Speech Models Without Text Guidance

Oct 02, 2025

EvolveCaptions: Empowering DHH Users Through Real-Time Collaborative Captioning

Oct 02, 2025

Stream RAG: Instant and Accurate Spoken Dialogue Systems with Streaming Tool Usage

Oct 02, 2025

LLM-Based Multi-Task Bangla Hate Speech Detection: Type, Severity, and Target

Oct 02, 2025

SLAP: Learning Speaker and Health-Related Representations from Natural Language Supervision

Oct 02, 2025

High-Fidelity Speech Enhancement via Discrete Audio Tokens

Oct 02, 2025

MelCap: A Unified Single-Codebook Neural Codec for High-Fidelity Audio Compression

Oct 02, 2025