Shun Lei

The THU-HCSI Multi-Speaker Multi-Lingual Few-Shot Voice Cloning System for LIMMITS'24 Challenge

Apr 25, 2024
Via arXiv

SimCalib: Graph Neural Network Calibration based on Similarity between Nodes

Dec 19, 2023
Via arXiv

AdaMesh: Personalized Facial Expressions and Head Poses for Speech-Driven 3D Facial Animation

Oct 11, 2023
Via arXiv

Improving Language Model-Based Zero-Shot Text-to-Speech Synthesis with Multi-Scale Acoustic Prompts

Sep 22, 2023
Via arXiv

Towards Improving the Expressiveness of Singing Voice Synthesis with BERT Derived Semantic Information

Aug 31, 2023
Via arXiv

Towards Spontaneous Style Modeling with Semi-supervised Pre-training for Conversational Text-to-Speech Synthesis

Aug 31, 2023
Via arXiv

MSStyleTTS: Multi-Scale Style Modeling with Hierarchical Context Information for Expressive Speech Synthesis

Jul 29, 2023
Via arXiv

GTN-Bailando: Genre Consistent Long-Term 3D Dance Generation based on Pre-trained Genre Token Network

Apr 25, 2023
Via arXiv

Context-aware Coherent Speaking Style Prediction with Hierarchical Transformers for Audiobook Speech Synthesis

Apr 13, 2023
Via arXiv

Towards Expressive Speaking Style Modelling with Hierarchical Context Information for Mandarin Speech Synthesis

Apr 06, 2022
Via arXiv