speech


Pushing the Performance of Synthetic Speech Detection with Kolmogorov-Arnold Networks and Self-Supervised Learning Models

Jun 17, 2025
Via arXiv

SyncTalk++: High-Fidelity and Efficient Synchronized Talking Heads Synthesis Using Gaussian Splatting

Jun 17, 2025
Via arXiv

A Variational Framework for Improving Naturalness in Generative Spoken Language Models

Jun 17, 2025
Via arXiv

Stream-Omni: Simultaneous Multimodal Interactions with Large Language-Vision-Speech Model

Jun 16, 2025
Via arXiv

From Flat to Feeling: A Feasibility and Impact Study on Dynamic Facial Emotions in AI-Generated Avatars

Jun 16, 2025
Via arXiv

CMU's IWSLT 2025 Simultaneous Speech Translation System

Jun 16, 2025
Via arXiv

Instance-Specific Test-Time Training for Speech Editing in the Wild

Jun 16, 2025
Via arXiv

S2ST-Omni: An Efficient and Scalable Multilingual Speech-to-Speech Translation Framework via Seamlessly Speech-Text Alignment and Streaming Speech Decoder

Jun 16, 2025
Via arXiv

Qwen vs. Gemma Integration with Whisper: A Comparative Study in Multilingual SpeechLLM Systems

Jun 16, 2025
Via arXiv

A Neural Model for Word Repetition

Jun 16, 2025
Via arXiv