Speech Synthesis


Speech synthesis is the process of generating artificial speech from text using computer algorithms.

BatonVoice: An Operationalist Framework for Enhancing Controllable Speech Synthesis with Linguistic Intelligence from LLMs

Sep 30, 2025
Via arXiv

Semantic-VAE: Semantic-Alignment Latent Representation for Better Speech Synthesis

Sep 26, 2025
Via arXiv

Comprehend and Talk: Text to Speech Synthesis via Dual Language Modeling

Sep 26, 2025
Via arXiv

Speaker Anonymisation for Speech-based Suicide Risk Detection

Sep 26, 2025
Via arXiv

Beyond Video-to-SFX: Video to Audio Synthesis with Environmentally Aware Speech

Sep 19, 2025
Via arXiv

Deep Dubbing: End-to-End Auto-Audiobook System with Text-to-Timbre and Context-Aware Instruct-TTS

Sep 19, 2025
Via arXiv

MELA-TTS: Joint transformer-diffusion model with representation alignment for speech synthesis

Sep 18, 2025
Via arXiv

DAIEN-TTS: Disentangled Audio Infilling for Environment-Aware Text-to-Speech Synthesis

Sep 18, 2025
Via arXiv

SynParaSpeech: Automated Synthesis of Paralinguistic Datasets for Speech Generation and Understanding

Sep 18, 2025
Via arXiv

MSR-Codec: A Low-Bitrate Multi-Stream Residual Codec for High-Fidelity Speech Generation with Information Disentanglement

Sep 16, 2025
Via arXiv