speech


Joint Enhancement and Classification using Coupled Diffusion Models of Signals and Logits

Feb 17, 2026
Via arXiv

What Do Neurons Listen To? A Neuron-level Dissection of a General-purpose Audio Model

Feb 17, 2026
Via arXiv

Clinically Inspired Symptom-Guided Depression Detection from Emotion-Aware Speech Representations

Feb 17, 2026
Via arXiv

Under-resourced studies of under-resourced languages: lemmatization and POS-tagging with LLM annotators for historical Armenian, Georgian, Greek and Syriac

Feb 17, 2026
Via arXiv

ZeroSyl: Simple Zero-Resource Syllable Tokenization for Spoken Language Modeling

Feb 17, 2026
Via arXiv

MAEB: Massive Audio Embedding Benchmark

Feb 17, 2026
Via arXiv

Disentangling Pitch and Creak for Speaker Identity Preservation in Speech Synthesis

Feb 16, 2026
Via arXiv

Data Augmentation for Pathological Speech Enhancement

Feb 16, 2026
Via arXiv

Breaking Data Efficiency Dilemma: A Federated and Augmented Learning Framework For Alzheimer's Disease Detection via Speech

Feb 16, 2026
Via arXiv

SA-SSL-MOS: Self-supervised Learning MOS Prediction with Spectral Augmentation for Generalized Multi-Rate Speech Assessment

Feb 16, 2026
Via arXiv