speech


ProAct: A Dual-System Framework for Proactive Embodied Social Agents

Feb 15, 2026
Via arXiv

Eureka-Audio: Triggering Audio Intelligence in Compact Language Models

Feb 15, 2026
Via arXiv

Investigation for Relative Voice Impression Estimation

Feb 15, 2026
Via arXiv

Bengali-Loop: Community Benchmarks for Long-Form Bangla ASR and Speaker Diarization

Feb 15, 2026
Via arXiv

GSRM: Generative Speech Reward Model for Speech RLHF

Feb 14, 2026
Via arXiv

ELEAT-SAGA: Early & Late Integration with Evading Alternating Training for Spoof-Robust Speaker Verification

Feb 14, 2026
Via arXiv

voice2mode: Phonation Mode Classification in Singing using Self-Supervised Speech Models

Feb 14, 2026
Via arXiv

Enhancing spatial hearing with cochlear implants: exploring the role of AI, multimodal interaction and perceptual training

Feb 14, 2026
Via arXiv

Benchmarking Video Foundation Models for Remote Parkinson's Disease Screening

Feb 13, 2026
Via arXiv

Can we trust AI to detect healthy multilingual English speakers among the cognitively impaired cohort in the UK? An investigation using real-world conversational speech

Feb 13, 2026
Via arXiv