Title: Unified Mandarin TTS Front-end Based on Distilled BERT Model

Dec 31, 2020

Yang Zhang, Liqun Deng, Yasheng Wang

Figures 1 to 4 for Unified Mandarin TTS Front-end Based on Distilled BERT Model (images not included)

Abstract: The front-end module in a typical Mandarin text-to-speech (TTS) system is composed of a long pipeline of text processing components, which requires extensive effort to build and is prone to a large cumulative model size and cascading errors. In this paper, a model based on a pre-trained language model (PLM) is proposed to simultaneously tackle the two most important tasks in the TTS front-end, i.e., prosodic structure prediction (PSP) and grapheme-to-phoneme (G2P) conversion. We use a pre-trained Chinese BERT[1] as the text encoder and employ a multi-task learning technique to adapt it to the two TTS front-end tasks. Then, the BERT encoder is distilled into a smaller model by employing a knowledge distillation technique called TinyBERT[2], making the whole model size 25% of that of benchmark pipeline models while maintaining competitive performance on both tasks. With the proposed methods, we are able to run the whole TTS front-end module in a light and unified manner, which is friendlier to deployment on mobile devices.

* 5 pages
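
The two training ideas named in the abstract, multi-task learning over a shared encoder and distillation of a teacher into a smaller student, can be illustrated with a minimal sketch of their loss terms. This is not the authors' implementation: the function names, the task weights `psp_weight` and `g2p_weight`, the temperature, and the toy label counts are all assumptions made for illustration. Only a soft-target loss on output logits is shown, whereas TinyBERT also matches intermediate representations, which is omitted here.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Hard-label cross-entropy for one token (label is a class index)."""
    return -math.log(softmax(logits)[label])


def multitask_loss(psp_logits, psp_label, g2p_logits, g2p_label,
                   psp_weight=1.0, g2p_weight=1.0):
    """Weighted sum of the PSP and G2P losses for one token.

    Both heads are assumed to read the same shared encoder output;
    the weights are hypothetical hyperparameters.
    """
    return (psp_weight * cross_entropy(psp_logits, psp_label)
            + g2p_weight * cross_entropy(g2p_logits, g2p_label))


def soft_cross_entropy(student_logits, teacher_logits, temperature=1.0):
    """Distillation loss: the student matches the teacher's soft targets."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return -sum(pt * math.log(ps) for pt, ps in zip(p_teacher, p_student))


if __name__ == "__main__":
    # Toy logits for one character: 4 hypothetical prosodic-boundary classes
    # and 3 hypothetical candidate pronunciations of a polyphonic character.
    psp_logits, g2p_logits = [2.0, 0.5, -1.0, 0.0], [0.1, 1.5, -0.3]
    print("multi-task loss:", multitask_loss(psp_logits, 0, g2p_logits, 1))
    print("distillation loss:",
          soft_cross_entropy([1.0, 0.2, -0.5, 0.0], psp_logits, temperature=2.0))
```

Summing the two task losses lets one encoder serve both heads, which is what removes the need for separate pipeline components; the soft-target term is what would allow a smaller student encoder to be trained against the larger one.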
