Yongmao Zhang

Accent-VITS: accent transfer for end-to-end TTS

Dec 29, 2023
Linhan Ma, Yongmao Zhang, Xinfa Zhu, Yi Lei, Ziqian Ning, Pengcheng Zhu, Lei Xie

PromptSpeaker: Speaker Generation Based on Text Descriptions

Oct 08, 2023
Yongmao Zhang, Guanghou Liu, Yi Lei, Yunlin Chen, Hao Yin, Lei Xie, Zhifei Li

METTS: Multilingual Emotional Text-to-Speech by Cross-speaker and Cross-lingual Emotion Transfer

Jul 29, 2023
Xinfa Zhu, Yi Lei, Tao Li, Yongmao Zhang, Hongbin Zhou, Heng Lu, Lei Xie

The NPU-MSXF Speech-to-Speech Translation System for IWSLT 2023 Speech-to-Speech Translation Task

Jul 10, 2023
Kun Song, Yi Lei, Peikun Chen, Yiqing Cao, Kun Wei, Yongmao Zhang, Lei Xie, Ning Jiang, Guoqing Zhao

PromptStyle: Controllable Style Transfer for Text-to-Speech with Natural Language Descriptions

Jun 01, 2023
Guanghou Liu, Yongmao Zhang, Yi Lei, Yunlin Chen, Rui Wang, Zhifei Li, Lei Xie

Multi-Speaker Expressive Speech Synthesis via Multiple Factors Decoupling

Nov 19, 2022
Xinfa Zhu, Yi Lei, Kun Song, Yongmao Zhang, Tao Li, Lei Xie

VISinger 2: High-Fidelity End-to-End Singing Voice Synthesis Enhanced by Digital Signal Processing Synthesizer

Nov 05, 2022
Yongmao Zhang, Heyang Xue, Hanzhao Li, Lei Xie, Tingwei Guo, Ruixiong Zhang, Caixia Gong

Robust MelGAN: A robust universal neural vocoder for high-fidelity TTS

Nov 02, 2022
Kun Song, Jian Cong, Xinsheng Wang, Yongmao Zhang, Lei Xie, Ning Jiang, Haiying Wu

DSPGAN: a GAN-based universal vocoder for high-fidelity TTS by time-frequency domain supervision from DSP

Nov 02, 2022
Kun Song, Yongmao Zhang, Yi Lei, Jian Cong, Hanzhao Li, Lei Xie, Gang He, Jinfeng Bai

AccentSpeech: Learning Accent from Crowd-sourced Data for Target Speaker TTS with Accents

Oct 31, 2022
Yongmao Zhang, Zhichao Wang, Peiji Yang, Hongshen Sun, Zhisheng Wang, Lei Xie
