speech


From Flat to Feeling: A Feasibility and Impact Study on Dynamic Facial Emotions in AI-Generated Avatars

Jun 16, 2025
Via arXiv

CMU's IWSLT 2025 Simultaneous Speech Translation System

Jun 16, 2025
Via arXiv

Instance-Specific Test-Time Training for Speech Editing in the Wild

Jun 16, 2025
Via arXiv

SC-SOT: Conditioning the Decoder on Diarized Speaker Information for End-to-End Overlapped Speech Recognition

Jun 15, 2025
Via arXiv

Magnetoencephalography (MEG) Based Non-Invasive Chinese Speech Decoding

Jun 15, 2025
Via arXiv

Rethinking Hate Speech Detection on Social Media: Can LLMs Replace Traditional Models?

Jun 15, 2025
Via arXiv

Towards Neural Audio Codec Source Parsing

Jun 14, 2025
Via arXiv

An Exploration of Mamba for Speech Self-Supervised Models

Jun 14, 2025
Via arXiv

StreamMel: Real-Time Zero-shot Text-to-Speech via Interleaved Continuous Autoregressive Modeling

Jun 14, 2025
Via arXiv

Mitigating Non-Target Speaker Bias in Guided Speaker Embedding

Jun 14, 2025
Via arXiv