Heiga Zen

MAESTRO: Matched Speech Text Representations through Modality Matching

Apr 07, 2022
Zhehuai Chen, Yu Zhang, Andrew Rosenberg, Bhuvana Ramabhadran, Pedro Moreno, Ankur Bapna, Heiga Zen

SpecGrad: Diffusion Probabilistic Model based Neural Vocoder with Adaptive Noise Spectral Shaping

Mar 31, 2022
Yuma Koizumi, Heiga Zen, Kohei Yatabe, Nanxin Chen, Michiel Bacchiani

CVSS Corpus and Massively Multilingual Speech-to-Speech Translation

Jan 16, 2022
Ye Jia, Michelle Tadmor Ramanovich, Quan Wang, Heiga Zen

WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis

Jun 19, 2021
Nanxin Chen, Yu Zhang, Heiga Zen, Ron J. Weiss, Mohammad Norouzi, Najim Dehak, William Chan

Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling

Apr 13, 2021
Isaac Elias, Heiga Zen, Jonathan Shen, Yu Zhang, Ye Jia, RJ Skerry-Ryan, Yonghui Wu

PnG BERT: Augmented BERT on Phonemes and Graphemes for Neural TTS

Apr 02, 2021
Ye Jia, Heiga Zen, Jonathan Shen, Yu Zhang, Yonghui Wu
