Yonghui Wu

Direct speech-to-speech translation with a sequence-to-sequence model

Apr 12, 2019
Ye Jia, Ron J. Weiss, Fadi Biadsy, Wolfgang Macherey, Melvin Johnson, Zhifeng Chen, Yonghui Wu

Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling

Feb 21, 2019
Jonathan Shen, Patrick Nguyen, Yonghui Wu, Zhifeng Chen, Mia X. Chen, Ye Jia, Anjuli Kannan, Tara Sainath, Yuan Cao, Chung-Cheng Chiu, Yanzhang He, Jan Chorowski, Smit Hinsu, Stella Laurenzo, James Qin, Orhan Firat, Wolfgang Macherey, Suyog Gupta, Ankur Bapna, Shuyuan Zhang, Ruoming Pang, Ron J. Weiss, Rohit Prabhavalkar, Qiao Liang, Benoit Jacob, Bowen Liang, HyoukJoong Lee, Ciprian Chelba, Sébastien Jean, Bo Li, Melvin Johnson, Rohan Anil, Rajat Tibrewal, Xiaobing Liu, Akiko Eriguchi, Navdeep Jaitly, Naveen Ari, Colin Cherry, Parisa Haghani, Otavio Good, Youlong Cheng, Raziel Alvarez, Isaac Caswell, Wei-Ning Hsu, Zongheng Yang, Kuan-Chieh Wang, Ekaterina Gonina, Katrin Tomanek, Ben Vanik, Zelin Wu, Llion Jones, Mike Schuster, Yanping Huang, Dehao Chen, Kazuki Irie, George Foster, John Richardson, Klaus Macherey, Antoine Bruguier, Heiga Zen, Colin Raffel, Shankar Kumar, Kanishka Rao, David Rybach, Matthew Murray, Vijayaditya Peddinti, Maxim Krikun, Michiel A. U. Bacchiani, Thomas B. Jablin, Rob Suderman, Ian Williams, Benjamin Lee, Deepti Bhatia, Justin Carlson, Semih Yavuz, Yu Zhang, Ian McGraw, Max Galkin, Qi Ge, Golan Pundak, Chad Whipkey, Todd Wang, Uri Alon, Dmitry Lepikhin, Ye Tian, Sara Sabour, William Chan, Shubham Toshniwal, Baohua Liao, Michael Nirschl, Pat Rondon

Bytes are All You Need: End-to-End Multilingual Speech Recognition and Synthesis with Bytes

Nov 22, 2018
Bo Li, Yu Zhang, Tara Sainath, Yonghui Wu, William Chan

Streaming End-to-end Speech Recognition For Mobile Devices

Nov 15, 2018
Yanzhang He, Tara N. Sainath, Rohit Prabhavalkar, Ian McGraw, Raziel Alvarez, Ding Zhao, David Rybach, Anjuli Kannan, Yonghui Wu, Ruoming Pang, Qiao Liang, Deepti Bhatia, Yuan Shangguan, Bo Li, Golan Pundak, Khe Chai Sim, Tom Bagby, Shuo-yiin Chang, Kanishka Rao, Alexander Gruenstein

Leveraging Weakly Supervised Data to Improve End-to-End Speech-to-Text Translation

Nov 05, 2018
Ye Jia, Melvin Johnson, Wolfgang Macherey, Ron J. Weiss, Yuan Cao, Chung-Cheng Chiu, Naveen Ari, Stella Laurenzo, Yonghui Wu

Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis

Nov 05, 2018
Ye Jia, Yu Zhang, Ron J. Weiss, Quan Wang, Jonathan Shen, Fei Ren, Zhifeng Chen, Patrick Nguyen, Ruoming Pang, Ignacio Lopez Moreno, Yonghui Wu

Hierarchical Generative Modeling for Controllable Speech Synthesis

Oct 16, 2018
Wei-Ning Hsu, Yu Zhang, Ron J. Weiss, Heiga Zen, Yonghui Wu, Yuxuan Wang, Yuan Cao, Ye Jia, Zhifeng Chen, Jonathan Shen, Patrick Nguyen, Ruoming Pang

Training Deeper Neural Machine Translation Models with Transparent Attention

Sep 04, 2018
Ankur Bapna, Mia Xu Chen, Orhan Firat, Yuan Cao, Yonghui Wu

A Comparison of Techniques for Language Model Integration in Encoder-Decoder Speech Recognition

Jul 27, 2018
Shubham Toshniwal, Anjuli Kannan, Chung-Cheng Chiu, Yonghui Wu, Tara N. Sainath, Karen Livescu

Speech recognition for medical conversations

Jun 20, 2018
Chung-Cheng Chiu, Anshuman Tripathi, Katherine Chou, Chris Co, Navdeep Jaitly, Diana Jaunzeikare, Anjuli Kannan, Patrick Nguyen, Hasim Sak, Ananth Sankar, Justin Tansuwan, Nathan Wan, Yonghui Wu, Xuedong Zhang
