Picture for Helen Meng

Helen Meng

Towards Improving the Expressiveness of Singing Voice Synthesis with BERT Derived Semantic Information

Add code
Aug 31, 2023
Viaarxiv icon

Improving Mandarin Prosodic Structure Prediction with Multi-level Contextual Information

Add code
Aug 31, 2023
Figure 1 for Improving Mandarin Prosodic Structure Prediction with Multi-level Contextual Information
Figure 2 for Improving Mandarin Prosodic Structure Prediction with Multi-level Contextual Information
Figure 3 for Improving Mandarin Prosodic Structure Prediction with Multi-level Contextual Information
Figure 4 for Improving Mandarin Prosodic Structure Prediction with Multi-level Contextual Information
Viaarxiv icon

CALM: Contrastive Cross-modal Speaking Style Modeling for Expressive Text-to-Speech Synthesis

Add code
Aug 30, 2023
Figure 1 for CALM: Contrastive Cross-modal Speaking Style Modeling for Expressive Text-to-Speech Synthesis
Figure 2 for CALM: Contrastive Cross-modal Speaking Style Modeling for Expressive Text-to-Speech Synthesis
Figure 3 for CALM: Contrastive Cross-modal Speaking Style Modeling for Expressive Text-to-Speech Synthesis
Figure 4 for CALM: Contrastive Cross-modal Speaking Style Modeling for Expressive Text-to-Speech Synthesis
Viaarxiv icon

Rethinking Machine Ethics -- Can LLMs Perform Moral Reasoning through the Lens of Moral Theories?

Add code
Aug 29, 2023
Figure 1 for Rethinking Machine Ethics -- Can LLMs Perform Moral Reasoning through the Lens of Moral Theories?
Figure 2 for Rethinking Machine Ethics -- Can LLMs Perform Moral Reasoning through the Lens of Moral Theories?
Figure 3 for Rethinking Machine Ethics -- Can LLMs Perform Moral Reasoning through the Lens of Moral Theories?
Figure 4 for Rethinking Machine Ethics -- Can LLMs Perform Moral Reasoning through the Lens of Moral Theories?
Viaarxiv icon

MSStyleTTS: Multi-Scale Style Modeling with Hierarchical Context Information for Expressive Speech Synthesis

Add code
Jul 29, 2023
Viaarxiv icon

Audio-visual End-to-end Multi-channel Speech Separation, Dereverberation and Recognition

Add code
Jul 06, 2023
Figure 1 for Audio-visual End-to-end Multi-channel Speech Separation, Dereverberation and Recognition
Figure 2 for Audio-visual End-to-end Multi-channel Speech Separation, Dereverberation and Recognition
Figure 3 for Audio-visual End-to-end Multi-channel Speech Separation, Dereverberation and Recognition
Figure 4 for Audio-visual End-to-end Multi-channel Speech Separation, Dereverberation and Recognition
Viaarxiv icon

Hyper-parameter Adaptation of Conformer ASR Systems for Elderly and Dysarthric Speech Recognition

Add code
Jun 27, 2023
Viaarxiv icon

AV-SepFormer: Cross-Attention SepFormer for Audio-Visual Target Speaker Extraction

Add code
Jun 25, 2023
Figure 1 for AV-SepFormer: Cross-Attention SepFormer for Audio-Visual Target Speaker Extraction
Figure 2 for AV-SepFormer: Cross-Attention SepFormer for Audio-Visual Target Speaker Extraction
Figure 3 for AV-SepFormer: Cross-Attention SepFormer for Audio-Visual Target Speaker Extraction
Figure 4 for AV-SepFormer: Cross-Attention SepFormer for Audio-Visual Target Speaker Extraction
Viaarxiv icon

Diverse and Expressive Speech Prosody Prediction with Denoising Diffusion Probabilistic Model

Add code
May 26, 2023
Viaarxiv icon

Unified Modeling of Multi-Talker Overlapped Speech Recognition and Diarization with a Sidecar Separator

Add code
May 25, 2023
Viaarxiv icon