speech


Evaluating a Multi-Agent Voice-Enabled Smart Speaker for Care Homes: A Safety-Focused Framework

Mar 24, 2026
Via arXiv

RelayS2S: A Dual-Path Speculative Generation for Real-Time Dialogue

Mar 24, 2026
Via arXiv

Prompt Amplification and Zero-Shot Late Fusion in Audio-Language Models for Speech Emotion Recognition

Mar 24, 2026
Via arXiv

MSR-HuBERT: Self-supervised Pre-training for Adaptation to Multiple Sampling Rates

Mar 24, 2026
Via arXiv

Beyond Hate: Differentiating Uncivil and Intolerant Speech in Multimodal Content Moderation

Mar 24, 2026
Via arXiv

When AVSR Meets Video Conferencing: Dataset, Degradation, and the Hidden Mechanism Behind Performance Collapse

Mar 24, 2026
Via arXiv

Ara-Best-RQ: Multi Dialectal Arabic SSL

Mar 23, 2026
Via arXiv

MSP-Conversation: A Corpus for Naturalistic, Time-Continuous Emotion Recognition

Mar 23, 2026
Via arXiv

Politics of Questions in News: A Mixed-Methods Study of Interrogative Stances as Markers of Voice and Power

Mar 23, 2026
Via arXiv

LipsAM: Lipschitz-Continuous Amplitude Modifier for Audio Signal Processing and its Application to Plug-and-Play Dereverberation

Mar 23, 2026
Via arXiv