FastEmit: Low-latency Streaming ASR with Sequence-level Emission Regularization

Oct 21, 2020
Jiahui Yu, Chung-Cheng Chiu, Bo Li, Shuo-yiin Chang, Tara N. Sainath, Yanzhang He, Arun Narayanan, Wei Han, Anmol Gulati, Yonghui Wu, Ruoming Pang

* tech report 

Pushing the Limits of Semi-Supervised Learning for Automatic Speech Recognition

Oct 20, 2020
Yu Zhang, James Qin, Daniel S. Park, Wei Han, Chung-Cheng Chiu, Ruoming Pang, Quoc V. Le, Yonghui Wu

* 11 pages, 3 figures, 5 tables. Submitted to NeurIPS SAS 2020 Workshop 

Universal ASR: Unify and Improve Streaming ASR with Full-context Modeling

Oct 12, 2020
Jiahui Yu, Wei Han, Anmol Gulati, Chung-Cheng Chiu, Bo Li, Tara N. Sainath, Yonghui Wu, Ruoming Pang

* tech report 

Non-Attentive Tacotron: Robust and Controllable Neural TTS Synthesis Including Unsupervised Duration Modeling

Oct 08, 2020
Jonathan Shen, Ye Jia, Mike Chrzanowski, Yu Zhang, Isaac Elias, Heiga Zen, Yonghui Wu

* Under review as a conference paper at ICLR 2021 

Improved Noisy Student Training for Automatic Speech Recognition

May 19, 2020
Daniel S. Park, Yu Zhang, Ye Jia, Wei Han, Chung-Cheng Chiu, Bo Li, Yonghui Wu, Quoc V. Le

* 5 pages, 5 figures, 4 tables. Submitted to Interspeech 2020 

RNN-T Models Fail to Generalize to Out-of-Domain Audio: Causes and Solutions

May 17, 2020
Chung-Cheng Chiu, Arun Narayanan, Wei Han, Rohit Prabhavalkar, Yu Zhang, Navdeep Jaitly, Ruoming Pang, Tara N. Sainath, Patrick Nguyen, Liangliang Cao, Yonghui Wu

* Submitted to Interspeech 2020 

Conformer: Convolution-augmented Transformer for Speech Recognition

May 16, 2020
Anmol Gulati, James Qin, Chung-Cheng Chiu, Niki Parmar, Yu Zhang, Jiahui Yu, Wei Han, Shibo Wang, Zhengdong Zhang, Yonghui Wu, Ruoming Pang

* Submitted to Interspeech 2020 

ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context

May 16, 2020
Wei Han, Zhengdong Zhang, Yu Zhang, Jiahui Yu, Chung-Cheng Chiu, James Qin, Anmol Gulati, Ruoming Pang, Yonghui Wu

* Submitted to Interspeech 2020 

Interpretable Learning-to-Rank with Generalized Additive Models

May 14, 2020
Honglei Zhuang, Xuanhui Wang, Michael Bendersky, Alexander Grushetsky, Yonghui Wu, Petr Mitrichev, Ethan Sterling, Nathan Bell, Walker Ravina, Hai Qian

* 10 pages 

Leveraging Monolingual Data with Self-Supervision for Multilingual Neural Machine Translation

May 11, 2020
Aditya Siddhant, Ankur Bapna, Yuan Cao, Orhan Firat, Mia Chen, Sneha Kudugunta, Naveen Arivazhagan, Yonghui Wu

* ACL 2020 

A Streaming On-Device End-to-End Model Surpassing Server-Side Conventional Model Quality and Latency

Mar 28, 2020
Tara N. Sainath, Yanzhang He, Bo Li, Arun Narayanan, Ruoming Pang, Antoine Bruguier, Shuo-yiin Chang, Wei Li, Raziel Alvarez, Zhifeng Chen, Chung-Cheng Chiu, David Garcia, Alex Gruenstein, Ke Hu, Minho Jin, Anjuli Kannan, Qiao Liang, Ian McGraw, Cal Peyser, Rohit Prabhavalkar, Golan Pundak, David Rybach, Yuan Shangguan, Yash Sheth, Trevor Strohman, Mirko Visontai, Yonghui Wu, Yu Zhang, Ding Zhao

* In Proceedings of IEEE ICASSP 2020 

Fully-hierarchical fine-grained prosody modeling for interpretable speech synthesis

Feb 06, 2020
Guangzhi Sun, Yu Zhang, Ron J. Weiss, Yuan Cao, Heiga Zen, Yonghui Wu

* to appear in ICASSP 2020 

Generating diverse and natural text-to-speech samples using a quantized fine-grained VAE and auto-regressive prosody prior

Feb 06, 2020
Guangzhi Sun, Yu Zhang, Ron J. Weiss, Yuan Cao, Heiga Zen, Andrew Rosenberg, Bhuvana Ramabhadran, Yonghui Wu

* To appear in ICASSP 2020 

SpecAugment on Large Scale Datasets

Dec 11, 2019
Daniel S. Park, Yu Zhang, Chung-Cheng Chiu, Youzheng Chen, Bo Li, William Chan, Quoc V. Le, Yonghui Wu

* 5 pages, 3 tables; submitted to ICASSP 2020 

A comparison of end-to-end models for long-form speech recognition

Nov 06, 2019
Chung-Cheng Chiu, Wei Han, Yu Zhang, Ruoming Pang, Sergey Kishchenko, Patrick Nguyen, Arun Narayanan, Hank Liao, Shuyuan Zhang, Anjuli Kannan, Rohit Prabhavalkar, Zhifeng Chen, Tara Sainath, Yonghui Wu

* ASRU camera-ready version 

Identifying Cancer Patients at Risk for Heart Failure Using Machine Learning Methods

Oct 01, 2019
Xi Yang, Yan Gong, Nida Waheed, Keith March, Jiang Bian, William R. Hogan, Yonghui Wu

* 6 pages, 1 figure, 3 tables, accepted by AMIA 2019 

Speech Recognition with Augmented Synthesized Speech

Sep 25, 2019
Andrew Rosenberg, Yu Zhang, Bhuvana Ramabhadran, Ye Jia, Pedro Moreno, Yonghui Wu, Zelin Wu

* Accepted for publication at ASRU 2020 

Large-Scale Multilingual Speech Recognition with a Streaming End-to-End Model

Sep 11, 2019
Anjuli Kannan, Arindrima Datta, Tara N. Sainath, Eugene Weinstein, Bhuvana Ramabhadran, Yonghui Wu, Ankur Bapna, Zhifeng Chen, Seungji Lee

* Accepted in Interspeech 2019 

Two-Pass End-to-End Speech Recognition

Aug 29, 2019
Tara N. Sainath, Ruoming Pang, David Rybach, Yanzhang He, Rohit Prabhavalkar, Wei Li, Mirkó Visontai, Qiao Liang, Trevor Strohman, Yonghui Wu, Ian McGraw, Chung-Cheng Chiu


Learning to Speak Fluently in a Foreign Language: Multilingual Speech Synthesis and Cross-Language Voice Cloning

Jul 24, 2019
Yu Zhang, Ron J. Weiss, Heiga Zen, Yonghui Wu, Zhifeng Chen, RJ Skerry-Ryan, Ye Jia, Andrew Rosenberg, Bhuvana Ramabhadran

* 5 pages, submitted to Interspeech 2019 

Massively Multilingual Neural Machine Translation in the Wild: Findings and Challenges

Jul 11, 2019
Naveen Arivazhagan, Ankur Bapna, Orhan Firat, Dmitry Lepikhin, Melvin Johnson, Maxim Krikun, Mia Xu Chen, Yuan Cao, George Foster, Colin Cherry, Wolfgang Macherey, Zhifeng Chen, Yonghui Wu


Gmail Smart Compose: Real-Time Assisted Writing

May 17, 2019
Mia Xu Chen, Benjamin N Lee, Gagan Bansal, Yuan Cao, Shuyuan Zhang, Justin Lu, Jackie Tsay, Yinan Wang, Andrew M. Dai, Zhifeng Chen, Timothy Sohn, Yonghui Wu


Direct speech-to-speech translation with a sequence-to-sequence model

Apr 12, 2019
Ye Jia, Ron J. Weiss, Fadi Biadsy, Wolfgang Macherey, Melvin Johnson, Zhifeng Chen, Yonghui Wu

* Submitted to Interspeech 2019 

Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling

Feb 21, 2019
Jonathan Shen, Patrick Nguyen, Yonghui Wu, Zhifeng Chen, Mia X. Chen, Ye Jia, Anjuli Kannan, Tara Sainath, Yuan Cao, Chung-Cheng Chiu, Yanzhang He, Jan Chorowski, Smit Hinsu, Stella Laurenzo, James Qin, Orhan Firat, Wolfgang Macherey, Suyog Gupta, Ankur Bapna, Shuyuan Zhang, Ruoming Pang, Ron J. Weiss, Rohit Prabhavalkar, Qiao Liang, Benoit Jacob, Bowen Liang, HyoukJoong Lee, Ciprian Chelba, Sébastien Jean, Bo Li, Melvin Johnson, Rohan Anil, Rajat Tibrewal, Xiaobing Liu, Akiko Eriguchi, Navdeep Jaitly, Naveen Ari, Colin Cherry, Parisa Haghani, Otavio Good, Youlong Cheng, Raziel Alvarez, Isaac Caswell, Wei-Ning Hsu, Zongheng Yang, Kuan-Chieh Wang, Ekaterina Gonina, Katrin Tomanek, Ben Vanik, Zelin Wu, Llion Jones, Mike Schuster, Yanping Huang, Dehao Chen, Kazuki Irie, George Foster, John Richardson, Klaus Macherey, Antoine Bruguier, Heiga Zen, Colin Raffel, Shankar Kumar, Kanishka Rao, David Rybach, Matthew Murray, Vijayaditya Peddinti, Maxim Krikun, Michiel A. U. Bacchiani, Thomas B. Jablin, Rob Suderman, Ian Williams, Benjamin Lee, Deepti Bhatia, Justin Carlson, Semih Yavuz, Yu Zhang, Ian McGraw, Max Galkin, Qi Ge, Golan Pundak, Chad Whipkey, Todd Wang, Uri Alon, Dmitry Lepikhin, Ye Tian, Sara Sabour, William Chan, Shubham Toshniwal, Baohua Liao, Michael Nirschl, Pat Rondon


Bytes are All You Need: End-to-End Multilingual Speech Recognition and Synthesis with Bytes

Nov 22, 2018
Bo Li, Yu Zhang, Tara Sainath, Yonghui Wu, William Chan

* submitted to ICASSP 2019 
