Frank Soong

Ordinal Regression via Binary Preference vs Simple Regression: Statistical and Experimental Perspectives

Jul 06, 2022
Bin Su, Shaoguang Mao, Frank Soong, Zhiyong Wu

Via arXiv

NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality

May 10, 2022
Xu Tan, Jiawei Chen, Haohe Liu, Jian Cong, Chen Zhang, Yanqing Liu, Xi Wang, Yichong Leng, Yuanhao Yi, Lei He, Frank Soong, Tao Qin, Sheng Zhao, Tie-Yan Liu

Via arXiv

An Approach to Mispronunciation Detection and Diagnosis with Acoustic, Phonetic and Linguistic (APL) Embeddings

Oct 14, 2021
Wenxuan Ye, Shaoguang Mao, Frank Soong, Wenshan Wu, Yan Xia, Jonathan Tien, Zhiyong Wu

Via arXiv

A Survey on Neural Speech Synthesis

Jul 23, 2021
Xu Tan, Tao Qin, Frank Soong, Tie-Yan Liu

Via arXiv

MBNet: MOS Prediction for Synthesized Speech with Mean-Bias Network

Feb 27, 2021
Yichong Leng, Xu Tan, Sheng Zhao, Frank Soong, Xiang-Yang Li, Tao Qin

Via arXiv

Improving pronunciation assessment via ordinal regression with anchored reference samples

Oct 26, 2020
Bin Su, Shaoguang Mao, Frank Soong, Yan Xia, Jonathan Tien, Zhiyong Wu

Via arXiv

Feature reinforcement with word embedding and parsing information in neural TTS

Jan 03, 2019
Huaiping Ming, Lei He, Haohan Guo, Frank Soong

Via arXiv

Modeling Multi-speaker Latent Space to Improve Neural TTS: Quick Enrolling New Speaker and Enhancing Premium Voice

Dec 18, 2018
Yan Deng, Lei He, Frank Soong

Via arXiv