Heiga Zen

Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling

Apr 13, 2021
Via arXiv

PnG BERT: Augmented BERT on Phonemes and Graphemes for Neural TTS

Apr 02, 2021
Via arXiv

Non-Attentive Tacotron: Robust and Controllable Neural TTS Synthesis Including Unsupervised Duration Modeling

Oct 08, 2020
Via arXiv

WaveGrad: Estimating Gradients for Waveform Generation

Sep 02, 2020
Via arXiv

Fully-hierarchical fine-grained prosody modeling for interpretable speech synthesis

Feb 06, 2020
Via arXiv

Generating diverse and natural text-to-speech samples using a quantized fine-grained VAE and auto-regressive prosody prior

Feb 06, 2020
Via arXiv

Learning to Speak Fluently in a Foreign Language: Multilingual Speech Synthesis and Cross-Language Voice Cloning

Jul 24, 2019
Via arXiv

Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling

Feb 21, 2019
Via arXiv

Hierarchical Generative Modeling for Controllable Speech Synthesis

Oct 16, 2018
Via arXiv

Sample Efficient Adaptive Text-to-Speech

Sep 27, 2018
Via arXiv