Speech


WavCube: Unifying Speech Representation for Understanding and Generation via Semantic-Acoustic Joint Modeling

May 07, 2026
Via arXiv

Automated Clinical Report Generation for Remote Cognitive Remediation: Comparing Knowledge-Engineered Templates and LLMs in Low-Resource Settings

May 07, 2026
Via arXiv

PairAlign: A Framework for Sequence Tokenization via Self-Alignment with Applications to Audio Tokenization

May 07, 2026
Via arXiv

COVID-19 Infodemic. Understanding content features in detecting fake news using a machine learning approach

May 07, 2026
Via arXiv

Linear Semantic Segmentation for Low-Resource Spoken Dialects

May 07, 2026
Via arXiv

Predictive-Generative Drift Decomposition for Speech Enhancement and Separation

May 07, 2026
Via arXiv

PersonaGesture: Single-Reference Co-Speech Gesture Personalization for Unseen Speakers

May 07, 2026
Via arXiv

PersonaKit (PK): A Plug-and-Play Platform for User Testing Diverse Roles in Full-Duplex Dialogue

May 07, 2026
Via arXiv

DiBA: Diagonal and Binary Matrix Approximation for Neural Network Weight Compression

May 07, 2026
Via arXiv

Minimizing Modality Gap from the Input Side: Your Speech LLM Can Be a Prosody-Aware Text LLM

May 07, 2026
Via arXiv