Ron J. Weiss

Learning to Speak Fluently in a Foreign Language: Multilingual Speech Synthesis and Cross-Language Voice Cloning

Jul 09, 2019
Yu Zhang, Ron J. Weiss, Heiga Zen, Yonghui Wu, Zhifeng Chen, RJ Skerry-Ryan, Ye Jia, Andrew Rosenberg, Bhuvana Ramabhadran

Direct speech-to-speech translation with a sequence-to-sequence model

Apr 12, 2019
Ye Jia, Ron J. Weiss, Fadi Biadsy, Wolfgang Macherey, Melvin Johnson, Zhifeng Chen, Yonghui Wu

Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling

Feb 21, 2019
Jonathan Shen, Patrick Nguyen, Yonghui Wu, Zhifeng Chen, Mia X. Chen, Ye Jia, Anjuli Kannan, Tara Sainath, Yuan Cao, Chung-Cheng Chiu, Yanzhang He, Jan Chorowski, Smit Hinsu, Stella Laurenzo, James Qin, Orhan Firat, Wolfgang Macherey, Suyog Gupta, Ankur Bapna, Shuyuan Zhang, Ruoming Pang, Ron J. Weiss, Rohit Prabhavalkar, Qiao Liang, Benoit Jacob, Bowen Liang, HyoukJoong Lee, Ciprian Chelba, Sébastien Jean, Bo Li, Melvin Johnson, Rohan Anil, Rajat Tibrewal, Xiaobing Liu, Akiko Eriguchi, Navdeep Jaitly, Naveen Ari, Colin Cherry, Parisa Haghani, Otavio Good, Youlong Cheng, Raziel Alvarez, Isaac Caswell, Wei-Ning Hsu, Zongheng Yang, Kuan-Chieh Wang, Ekaterina Gonina, Katrin Tomanek, Ben Vanik, Zelin Wu, Llion Jones, Mike Schuster, Yanping Huang, Dehao Chen, Kazuki Irie, George Foster, John Richardson, Klaus Macherey, Antoine Bruguier, Heiga Zen, Colin Raffel, Shankar Kumar, Kanishka Rao, David Rybach, Matthew Murray, Vijayaditya Peddinti, Maxim Krikun, Michiel A. U. Bacchiani, Thomas B. Jablin, Rob Suderman, Ian Williams, Benjamin Lee, Deepti Bhatia, Justin Carlson, Semih Yavuz, Yu Zhang, Ian McGraw, Max Galkin, Qi Ge, Golan Pundak, Chad Whipkey, Todd Wang, Uri Alon, Dmitry Lepikhin, Ye Tian, Sara Sabour, William Chan, Shubham Toshniwal, Baohua Liao, Michael Nirschl, Pat Rondon

A spelling correction model for end-to-end speech recognition

Feb 19, 2019
Jinxi Guo, Tara N. Sainath, Ron J. Weiss

Unsupervised speech representation learning using WaveNet autoencoders

Jan 25, 2019
Jan Chorowski, Ron J. Weiss, Samy Bengio, Aäron van den Oord

Leveraging Weakly Supervised Data to Improve End-to-End Speech-to-Text Translation

Nov 05, 2018
Ye Jia, Melvin Johnson, Wolfgang Macherey, Ron J. Weiss, Yuan Cao, Chung-Cheng Chiu, Naveen Ari, Stella Laurenzo, Yonghui Wu

Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis

Nov 05, 2018
Ye Jia, Yu Zhang, Ron J. Weiss, Quan Wang, Jonathan Shen, Fei Ren, Zhifeng Chen, Patrick Nguyen, Ruoming Pang, Ignacio Lopez Moreno, Yonghui Wu

VoiceFilter: Targeted Voice Separation by Speaker-Conditioned Spectrogram Masking

Oct 27, 2018
Quan Wang, Hannah Muckenhirn, Kevin Wilson, Prashant Sridhar, Zelin Wu, John Hershey, Rif A. Saurous, Ron J. Weiss, Ye Jia, Ignacio Lopez Moreno

Hierarchical Generative Modeling for Controllable Speech Synthesis

Oct 16, 2018
Wei-Ning Hsu, Yu Zhang, Ron J. Weiss, Heiga Zen, Yonghui Wu, Yuxuan Wang, Yuan Cao, Ye Jia, Zhifeng Chen, Jonathan Shen, Patrick Nguyen, Ruoming Pang

Towards End-to-End Prosody Transfer for Expressive Speech Synthesis with Tacotron

Mar 24, 2018
RJ Skerry-Ryan, Eric Battenberg, Ying Xiao, Yuxuan Wang, Daisy Stanton, Joel Shor, Ron J. Weiss, Rob Clark, Rif A. Saurous
