Sheng Zhao

PromptTTS: Controllable Text-to-Speech with Text Descriptions

Nov 22, 2022
Zhifang Guo, Yichong Leng, Yihan Wu, Sheng Zhao, Xu Tan

MeloForm: Generating Melody with Musical Form based on Expert Systems and Neural Networks

Aug 30, 2022
Peiling Lu, Xu Tan, Botao Yu, Tao Qin, Sheng Zhao, Tie-Yan Liu

StableFace: Analyzing and Improving Motion Stability for Talking Face Generation

Aug 29, 2022
Jun Ling, Xu Tan, Liyang Chen, Runnan Li, Yuchao Zhang, Sheng Zhao, Li Song

DelightfulTTS 2: End-to-End Speech Synthesis with Adversarial Vector-Quantized Auto-Encoders

Jul 11, 2022
Yanqing Liu, Ruiqing Xue, Lei He, Xu Tan, Sheng Zhao

RetrieverTTS: Modeling Decomposed Factors for Text-Based Speech Insertion

Jun 28, 2022
Dacheng Yin, Chuanxin Tang, Yanqing Liu, Xiaoqiang Wang, Zhiyuan Zhao, Yucheng Zhao, Zhiwei Xiong, Sheng Zhao, Chong Luo

BinauralGrad: A Two-Stage Conditional Diffusion Probabilistic Model for Binaural Audio Synthesis

May 30, 2022
Yichong Leng, Zehua Chen, Junliang Guo, Haohe Liu, Jiawei Chen, Xu Tan, Danilo Mandic, Lei He, Xiang-Yang Li, Tao Qin, Sheng Zhao, Tie-Yan Liu

NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality

May 10, 2022
Xu Tan, Jiawei Chen, Haohe Liu, Jian Cong, Chen Zhang, Yanqing Liu, Xi Wang, Yichong Leng, Yuanhao Yi, Lei He, Frank Soong, Tao Qin, Sheng Zhao, Tie-Yan Liu

AdaSpeech 4: Adaptive Text to Speech in Zero-Shot Scenarios

Apr 01, 2022
Yihan Wu, Xu Tan, Bohan Li, Lei He, Sheng Zhao, Ruihua Song, Tao Qin, Tie-Yan Liu

Mixed-Phoneme BERT: Improving BERT with Mixed Phoneme and Sup-Phoneme Representations for Text to Speech

Mar 31, 2022
Guangyan Zhang, Kaitao Song, Xu Tan, Daxin Tan, Yuzi Yan, Yanqing Liu, Gang Wang, Wei Zhou, Tao Qin, Tan Lee, Sheng Zhao

Towards Contextual Spelling Correction for Customization of End-to-end Speech Recognition Systems

Mar 02, 2022
Xiaoqiang Wang, Yanqing Liu, Jinyu Li, Veljko Miljanic, Sheng Zhao, Hosam Khalil
