Shyamal Kumar Das Mandal

F0 Modeling In Hmm-Based Speech Synthesis System Using Deep Belief Network

Feb 18, 2015

Sankar Mukherjee, Shyamal Kumar Das Mandal

Abstract: In recent years, multilayer perceptrons (MLPs) with many hidden layers, known as deep neural networks (DNNs), have performed surprisingly well in many speech tasks, e.g. speech recognition, speaker verification, and speech synthesis. In the context of F0 modeling, however, these techniques have not been properly exploited. In this paper, a Deep Belief Network (DBN), a member of the DNN family, is applied to model the F0 contour of synthesized speech generated by an HMM-based speech synthesis system. The experiments were carried out on the Bengali language. Several DBN-DNN architectures, ranging from four to seven hidden layers with up to 200 hidden units per hidden layer, are presented and evaluated. The results are compared against the clustering tree techniques commonly found in statistical parametric speech synthesis. We show that, from textual inputs, the DBN-DNN learns a high-level structure which in turn improves the F0 contour in both objective and subjective tests.

* O-COCOSDA 2014
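To make the architecture range in the abstract above concrete, here is a minimal sketch of the feedforward structure only: a stack of four to seven hidden layers of 200 units mapping one frame of text-derived features to a single F0 value. It is not the authors' implementation. The input size of 50, the sigmoid activations, the log-F0 output, the random initialization, and all function names are assumptions made for illustration; DBN pre-training and any training procedure are not shown.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) for one fully connected layer (random init, assumed)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def build_network(n_inputs, hidden_layers, hidden_units, rng):
    """Stack `hidden_layers` hidden layers plus one linear output unit."""
    sizes = [n_inputs] + [hidden_units] * hidden_layers + [1]
    return [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(network, features):
    """Propagate one feature vector; hidden layers use sigmoid, the output is linear."""
    activations = features
    last = len(network) - 1
    for i, (weights, biases) in enumerate(network):
        pre = [sum(w * a for w, a in zip(row, activations)) + b
               for row, b in zip(weights, biases)]
        activations = pre if i == last else [sigmoid(p) for p in pre]
    return activations[0]


rng = random.Random(0)
# Hypothetical setup: 50 linguistic input features, 4 hidden layers of 200 units.
net = build_network(n_inputs=50, hidden_layers=4, hidden_units=200, rng=rng)
frame = [rng.random() for _ in range(50)]   # stand-in for one frame of text features
log_f0 = forward(net, frame)                # untrained, so the value is arbitrary
```

Changing `hidden_layers` from 4 to 7 covers the depth range the abstract mentions; the output of this untrained network carries no meaning until weights are learned.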


A Bengali HMM Based Speech Synthesis System

Jun 16, 2014

Sankar Mukherjee, Shyamal Kumar Das Mandal


Abstract: This paper presents the capability of an HMM-based TTS system to produce Bengali speech. In this synthesis method, trajectories of speech parameters are generated from trained Hidden Markov Models, and the final speech waveform is synthesized from those parameters. In our experiments, spectral properties were represented by mel-cepstral coefficients. Both training and synthesis issues are investigated using an annotated Bengali speech database. Experimental evaluation shows that the developed text-to-speech system is capable of producing adequately natural Bengali speech in terms of intelligibility and intonation.

* Oriental COCOSDA 2012, pp.225 259
