Speaking Style Synthesis


VoiceSculptor: Your Voice, Designed By You

Jan 15, 2026

ReStyle-TTS: Relative and Continuous Style Control for Zero-Shot Speech Synthesis

Jan 07, 2026

MemoryTalker: Personalized Speech-Driven 3D Facial Animation via Audio-Guided Stylization

Jul 28, 2025

Voice Conversion for Likability Control via Automated Rating of Speech Synthesis Corpora

Jul 02, 2025

GSA-TTS: Toward Zero-Shot Speech Synthesis based on Gradual Style Adaptor

May 26, 2025

LLM-Synth4KWS: Scalable Automatic Generation and Synthesis of Confusable Data for Custom Keyword Spotting

May 29, 2025

Spark-TTS: An Efficient LLM-Based Text-to-Speech Model with Single-Stream Decoupled Speech Tokens

Mar 03, 2025

NeRF-3DTalker: Neural Radiance Field with 3D Prior Aided Audio Disentanglement for Talking Head Synthesis

Feb 20, 2025

Generative Expressive Conversational Speech Synthesis

Aug 01, 2024

Evaluating Text-to-Speech Synthesis from a Large Discrete Token-based Speech Language Model

May 16, 2024