speech


TokenSE: a Mamba-based discrete token speech enhancement framework for cochlear implants

Apr 14, 2026
Via arXiv

PS-TTS: Phonetic Synchronization in Text-to-Speech for Achieving Natural Automated Dubbing

Apr 14, 2026
Via arXiv

Interactive ASR: Towards Human-Like Interaction and Semantic Coherence Evaluation for Agentic Speech Recognition

Apr 14, 2026
Via arXiv

X-VC: Zero-shot Streaming Voice Conversion in Codec Space

Apr 14, 2026
Via arXiv

The Enforcement and Feasibility of Hate Speech Moderation on Twitter

Apr 14, 2026
Via arXiv

VoxEffects: A Speech-Oriented Audio Effects Dataset and Benchmark

Apr 14, 2026
Via arXiv

When Does Data Augmentation Help? Evaluating LLM and Back-Translation Methods for Hausa and Fongbe NLP

Apr 14, 2026
Via arXiv

MoshiRAG: Asynchronous Knowledge Retrieval for Full-Duplex Speech Language Models

Apr 14, 2026
Via arXiv

Listening Alone, Understanding Together: Collaborative Context Recovery for Privacy-Aware AI

Apr 14, 2026
Via arXiv

SEDTalker: Emotion-Aware 3D Facial Animation Using Frame-Level Speech Emotion Diarization

Apr 14, 2026
Via arXiv