speech


LLM-Based Multi-Task Bangla Hate Speech Detection: Type, Severity, and Target

Oct 02, 2025
Via arXiv

SLAP: Learning Speaker and Health-Related Representations from Natural Language Supervision

Oct 02, 2025
Via arXiv

High-Fidelity Speech Enhancement via Discrete Audio Tokens

Oct 02, 2025
Via arXiv

MelCap: A Unified Single-Codebook Neural Codec for High-Fidelity Audio Compression

Oct 02, 2025
Via arXiv

Exploring Resolution-Wise Shared Attention in Hybrid Mamba-U-Nets for Improved Cross-Corpus Speech Enhancement

Oct 02, 2025
Via arXiv

Enhancing Noise Robustness of Parkinson's Disease Telemonitoring via Contrastive Feature Augmentation

Oct 02, 2025
Via arXiv

Tenyidie Syllabification corpus creation and deep learning applications

Oct 02, 2025
Via arXiv

MOSS-Speech: Towards True Speech-to-Speech Models Without Text Guidance

Oct 02, 2025
Via arXiv

Stream RAG: Instant and Accurate Spoken Dialogue Systems with Streaming Tool Usage

Oct 02, 2025
Via arXiv

EuroSpeech: A Multilingual Speech Corpus

Oct 01, 2025
Via arXiv