Yonghui Wu

Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling


Apr 13, 2021
Isaac Elias, Heiga Zen, Jonathan Shen, Yu Zhang, Ye Jia, RJ Skerry-Ryan, Yonghui Wu

* Submitted to INTERSPEECH 2021 


PnG BERT: Augmented BERT on Phonemes and Graphemes for Neural TTS


Apr 02, 2021
Ye Jia, Heiga Zen, Jonathan Shen, Yu Zhang, Yonghui Wu

* Submitted to Interspeech 2021 


Improving Longer-range Dialogue State Tracking


Feb 27, 2021
Ye Zhang, Yuan Cao, Mahdis Mahdieh, Jeffrey Zhao, Yonghui Wu

* 10 pages 


Distilling Interpretable Models into Human-Readable Code


Feb 09, 2021
Walker Ravina, Ethan Sterling, Olexiy Oryeshko, Nathan Bell, Honglei Zhuang, Xuanhui Wang, Yonghui Wu, Alexander Grushetsky

* 13 pages, LaTeX; Updated the introduction and preliminaries sections; Updated some figures for greater clarity and brevity; Added a new dataset to the experiments; Added a more detailed table of experiment results; Added a discussion of distillation failures to the experiments relating to the new dataset


FastEmit: Low-latency Streaming ASR with Sequence-level Emission Regularization


Oct 21, 2020
Jiahui Yu, Chung-Cheng Chiu, Bo Li, Shuo-yiin Chang, Tara N. Sainath, Yanzhang He, Arun Narayanan, Wei Han, Anmol Gulati, Yonghui Wu, Ruoming Pang

* tech report 


Pushing the Limits of Semi-Supervised Learning for Automatic Speech Recognition


Oct 20, 2020
Yu Zhang, James Qin, Daniel S. Park, Wei Han, Chung-Cheng Chiu, Ruoming Pang, Quoc V. Le, Yonghui Wu

* 11 pages, 3 figures, 5 tables. Submitted to NeurIPS SAS 2020 Workshop 


Universal ASR: Unify and Improve Streaming ASR with Full-context Modeling


Oct 12, 2020
Jiahui Yu, Wei Han, Anmol Gulati, Chung-Cheng Chiu, Bo Li, Tara N. Sainath, Yonghui Wu, Ruoming Pang

* tech report 


Non-Attentive Tacotron: Robust and Controllable Neural TTS Synthesis Including Unsupervised Duration Modeling


Oct 08, 2020
Jonathan Shen, Ye Jia, Mike Chrzanowski, Yu Zhang, Isaac Elias, Heiga Zen, Yonghui Wu

* Under review as a conference paper at ICLR 2021 


Improved Noisy Student Training for Automatic Speech Recognition


May 19, 2020
Daniel S. Park, Yu Zhang, Ye Jia, Wei Han, Chung-Cheng Chiu, Bo Li, Yonghui Wu, Quoc V. Le

* 5 pages, 5 figures, 4 tables. Submitted to Interspeech 2020 


RNN-T Models Fail to Generalize to Out-of-Domain Audio: Causes and Solutions


May 17, 2020
Chung-Cheng Chiu, Arun Narayanan, Wei Han, Rohit Prabhavalkar, Yu Zhang, Navdeep Jaitly, Ruoming Pang, Tara N. Sainath, Patrick Nguyen, Liangliang Cao, Yonghui Wu

* Submitted to Interspeech 2020 


Conformer: Convolution-augmented Transformer for Speech Recognition


May 16, 2020
Anmol Gulati, James Qin, Chung-Cheng Chiu, Niki Parmar, Yu Zhang, Jiahui Yu, Wei Han, Shibo Wang, Zhengdong Zhang, Yonghui Wu, Ruoming Pang

* Submitted to Interspeech 2020 


ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context


May 16, 2020
Wei Han, Zhengdong Zhang, Yu Zhang, Jiahui Yu, Chung-Cheng Chiu, James Qin, Anmol Gulati, Ruoming Pang, Yonghui Wu

* Submitted to Interspeech 2020 


Interpretable Learning-to-Rank with Generalized Additive Models


May 14, 2020
Honglei Zhuang, Xuanhui Wang, Michael Bendersky, Alexander Grushetsky, Yonghui Wu, Petr Mitrichev, Ethan Sterling, Nathan Bell, Walker Ravina, Hai Qian

* 10 pages 


Leveraging Monolingual Data with Self-Supervision for Multilingual Neural Machine Translation


May 11, 2020
Aditya Siddhant, Ankur Bapna, Yuan Cao, Orhan Firat, Mia Chen, Sneha Kudugunta, Naveen Arivazhagan, Yonghui Wu

* ACL 2020 


A Streaming On-Device End-to-End Model Surpassing Server-Side Conventional Model Quality and Latency


Mar 28, 2020
Tara N. Sainath, Yanzhang He, Bo Li, Arun Narayanan, Ruoming Pang, Antoine Bruguier, Shuo-yiin Chang, Wei Li, Raziel Alvarez, Zhifeng Chen, Chung-Cheng Chiu, David Garcia, Alex Gruenstein, Ke Hu, Minho Jin, Anjuli Kannan, Qiao Liang, Ian McGraw, Cal Peyser, Rohit Prabhavalkar, Golan Pundak, David Rybach, Yuan Shangguan, Yash Sheth, Trevor Strohman, Mirko Visontai, Yonghui Wu, Yu Zhang, Ding Zhao

* In Proceedings of IEEE ICASSP 2020 


Fully-hierarchical fine-grained prosody modeling for interpretable speech synthesis


Feb 06, 2020
Guangzhi Sun, Yu Zhang, Ron J. Weiss, Yuan Cao, Heiga Zen, Yonghui Wu

* to appear in ICASSP 2020 


Generating diverse and natural text-to-speech samples using a quantized fine-grained VAE and auto-regressive prosody prior


Feb 06, 2020
Guangzhi Sun, Yu Zhang, Ron J. Weiss, Yuan Cao, Heiga Zen, Andrew Rosenberg, Bhuvana Ramabhadran, Yonghui Wu

* To appear in ICASSP 2020 


SpecAugment on Large Scale Datasets


Dec 11, 2019
Daniel S. Park, Yu Zhang, Chung-Cheng Chiu, Youzheng Chen, Bo Li, William Chan, Quoc V. Le, Yonghui Wu

* 5 pages, 3 tables; submitted to ICASSP 2020 


A comparison of end-to-end models for long-form speech recognition


Nov 06, 2019
Chung-Cheng Chiu, Wei Han, Yu Zhang, Ruoming Pang, Sergey Kishchenko, Patrick Nguyen, Arun Narayanan, Hank Liao, Shuyuan Zhang, Anjuli Kannan, Rohit Prabhavalkar, Zhifeng Chen, Tara Sainath, Yonghui Wu

* ASRU camera-ready version 


Identifying Cancer Patients at Risk for Heart Failure Using Machine Learning Methods


Oct 01, 2019
Xi Yang, Yan Gong, Nida Waheed, Keith March, Jiang Bian, William R. Hogan, Yonghui Wu

* 6 pages, 1 figure, 3 tables, accepted by AMIA 2019 


Speech Recognition with Augmented Synthesized Speech


Sep 25, 2019
Andrew Rosenberg, Yu Zhang, Bhuvana Ramabhadran, Ye Jia, Pedro Moreno, Yonghui Wu, Zelin Wu

* Accepted for publication at ASRU 2020 


Large-Scale Multilingual Speech Recognition with a Streaming End-to-End Model


Sep 11, 2019
Anjuli Kannan, Arindrima Datta, Tara N. Sainath, Eugene Weinstein, Bhuvana Ramabhadran, Yonghui Wu, Ankur Bapna, Zhifeng Chen, Seungji Lee

* Accepted in Interspeech 2019 
