Voice


JAL-Turn: Joint Acoustic-Linguistic Modeling for Real-Time and Robust Turn-Taking Detection in Full-Duplex Spoken Dialogue Systems

Mar 27, 2026
Via arXiv

Voxtral TTS

Mar 26, 2026
Via arXiv

Back to Basics: Revisiting ASR in the Age of Voice Agents

Mar 26, 2026
Via arXiv

Bridging Biological Hearing and Neuromorphic Computing: End-to-End Time-Domain Audio Signal Processing with Reservoir Computing

Mar 25, 2026
Via arXiv

YingMusic-Singer: Controllable Singing Voice Synthesis with Flexible Lyric Manipulation and Annotation-free Melody Guidance

Mar 25, 2026
Via arXiv

Evaluating a Multi-Agent Voice-Enabled Smart Speaker for Care Homes: A Safety-Focused Framework

Mar 24, 2026
Via arXiv

Vision-based Deep Learning Analysis of Unordered Biomedical Tabular Datasets via Optimal Spatial Cartography

Mar 24, 2026
Via arXiv

Dyadic: A Scalable Platform for Human-Human and Human-AI Conversation Research

Mar 23, 2026
Via arXiv

SelfTTS: cross-speaker style transfer through explicit embedding disentanglement and self-refinement using self-augmentation

Mar 23, 2026
Via arXiv

Politics of Questions in News: A Mixed-Methods Study of Interrogative Stances as Markers of Voice and Power

Mar 23, 2026
Via arXiv