Jingbei Li

DiCLET-TTS: Diffusion Model based Cross-lingual Emotion Transfer for Text-to-Speech -- A Study between English and Mandarin

Sep 02, 2023
Via arXiv

Joint Multi-scale Cross-lingual Speaking Style Transfer with Bidirectional Attention Mechanism for Automatic Dubbing

May 09, 2023
Via arXiv

NeuFA: Neural Network Based End-to-End Forced Alignment with Bidirectional Attention Mechanism

Mar 31, 2022
Via arXiv

Spoken Style Learning with Multi-modal Hierarchical Context Encoding for Conversational Text-to-Speech Synthesis

Jun 11, 2021
Via arXiv

Dependency Parsing based Semantic Representation Learning with Graph Neural Network for Enhancing Expressiveness of Text-to-Speech

Apr 20, 2021
Via arXiv

Towards Multi-Scale Style Control for Expressive Speech Synthesis

Apr 08, 2021
Via arXiv

Adversarially learning disentangled speech representations for robust multi-factor voice conversion

Jan 30, 2021
Via arXiv

Syntactic representation learning for neural network based TTS with syntactic parse tree traversal

Dec 13, 2020
Via arXiv