Yi Meng

CALM: Contrastive Cross-modal Speaking Style Modeling for Expressive Text-to-Speech Synthesis

Aug 30, 2023

Joint Multi-scale Cross-lingual Speaking Style Transfer with Bidirectional Attention Mechanism for Automatic Dubbing

May 09, 2023

NeuFA: Neural Network Based End-to-End Forced Alignment with Bidirectional Attention Mechanism

Mar 31, 2022

Spoken Style Learning with Multi-modal Hierarchical Context Encoding for Conversational Text-to-Speech Synthesis

Jun 11, 2021