RNN-T Models Fail to Generalize to Out-of-Domain Audio: Causes and Solutions

May 17, 2020
Chung-Cheng Chiu, Arun Narayanan, Wei Han, Rohit Prabhavalkar, Yu Zhang, Navdeep Jaitly, Ruoming Pang, Tara N. Sainath, Patrick Nguyen, Liangliang Cao, Yonghui Wu

* Submitted to Interspeech 2020 

A Streaming On-Device End-to-End Model Surpassing Server-Side Conventional Model Quality and Latency

Mar 28, 2020
Tara N. Sainath, Yanzhang He, Bo Li, Arun Narayanan, Ruoming Pang, Antoine Bruguier, Shuo-yiin Chang, Wei Li, Raziel Alvarez, Zhifeng Chen, Chung-Cheng Chiu, David Garcia, Alex Gruenstein, Ke Hu, Minho Jin, Anjuli Kannan, Qiao Liang, Ian McGraw, Cal Peyser, Rohit Prabhavalkar, Golan Pundak, David Rybach, Yuan Shangguan, Yash Sheth, Trevor Strohman, Mirko Visontai, Yonghui Wu, Yu Zhang, Ding Zhao

* In Proceedings of IEEE ICASSP 2020 

Deliberation Model Based Two-Pass End-to-End Speech Recognition

Mar 17, 2020
Ke Hu, Tara N. Sainath, Ruoming Pang, Rohit Prabhavalkar


A comparison of end-to-end models for long-form speech recognition

Nov 06, 2019
Chung-Cheng Chiu, Wei Han, Yu Zhang, Ruoming Pang, Sergey Kishchenko, Patrick Nguyen, Arun Narayanan, Hank Liao, Shuyuan Zhang, Anjuli Kannan, Rohit Prabhavalkar, Zhifeng Chen, Tara Sainath, Yonghui Wu

* ASRU camera-ready version 

Recognizing long-form speech using streaming end-to-end models

Oct 24, 2019
Arun Narayanan, Rohit Prabhavalkar, Chung-Cheng Chiu, David Rybach, Tara N. Sainath, Trevor Strohman


Two-Pass End-to-End Speech Recognition

Aug 29, 2019
Tara N. Sainath, Ruoming Pang, David Rybach, Yanzhang He, Rohit Prabhavalkar, Wei Li, Mirkó Visontai, Qiao Liang, Trevor Strohman, Yonghui Wu, Ian McGraw, Chung-Cheng Chiu


Phoneme-Based Contextualization for Cross-Lingual Speech Recognition in End-to-End Models

Jul 22, 2019
Ke Hu, Antoine Bruguier, Tara N. Sainath, Rohit Prabhavalkar, Golan Pundak


Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling

Feb 21, 2019
Jonathan Shen, Patrick Nguyen, Yonghui Wu, Zhifeng Chen, Mia X. Chen, Ye Jia, Anjuli Kannan, Tara Sainath, Yuan Cao, Chung-Cheng Chiu, Yanzhang He, Jan Chorowski, Smit Hinsu, Stella Laurenzo, James Qin, Orhan Firat, Wolfgang Macherey, Suyog Gupta, Ankur Bapna, Shuyuan Zhang, Ruoming Pang, Ron J. Weiss, Rohit Prabhavalkar, Qiao Liang, Benoit Jacob, Bowen Liang, HyoukJoong Lee, Ciprian Chelba, Sébastien Jean, Bo Li, Melvin Johnson, Rohan Anil, Rajat Tibrewal, Xiaobing Liu, Akiko Eriguchi, Navdeep Jaitly, Naveen Ari, Colin Cherry, Parisa Haghani, Otavio Good, Youlong Cheng, Raziel Alvarez, Isaac Caswell, Wei-Ning Hsu, Zongheng Yang, Kuan-Chieh Wang, Ekaterina Gonina, Katrin Tomanek, Ben Vanik, Zelin Wu, Llion Jones, Mike Schuster, Yanping Huang, Dehao Chen, Kazuki Irie, George Foster, John Richardson, Klaus Macherey, Antoine Bruguier, Heiga Zen, Colin Raffel, Shankar Kumar, Kanishka Rao, David Rybach, Matthew Murray, Vijayaditya Peddinti, Maxim Krikun, Michiel A. U. Bacchiani, Thomas B. Jablin, Rob Suderman, Ian Williams, Benjamin Lee, Deepti Bhatia, Justin Carlson, Semih Yavuz, Yu Zhang, Ian McGraw, Max Galkin, Qi Ge, Golan Pundak, Chad Whipkey, Todd Wang, Uri Alon, Dmitry Lepikhin, Ye Tian, Sara Sabour, William Chan, Shubham Toshniwal, Baohua Liao, Michael Nirschl, Pat Rondon


Model Unit Exploration for Sequence-to-Sequence Speech Recognition

Feb 05, 2019
Kazuki Irie, Rohit Prabhavalkar, Anjuli Kannan, Antoine Bruguier, David Rybach, Patrick Nguyen

* 5 pages, 1 figure 

Streaming End-to-end Speech Recognition For Mobile Devices

Nov 15, 2018
Yanzhang He, Tara N. Sainath, Rohit Prabhavalkar, Ian McGraw, Raziel Alvarez, Ding Zhao, David Rybach, Anjuli Kannan, Yonghui Wu, Ruoming Pang, Qiao Liang, Deepti Bhatia, Yuan Shangguan, Bo Li, Golan Pundak, Khe Chai Sim, Tom Bagby, Shuo-yiin Chang, Kanishka Rao, Alexander Gruenstein


Exploring Speech Enhancement with Generative Adversarial Networks for Robust Speech Recognition

Oct 31, 2018
Chris Donahue, Bo Li, Rohit Prabhavalkar

* Published as a conference paper at ICASSP 2018 

From Audio to Semantics: Approaches to end-to-end spoken language understanding

Sep 24, 2018
Parisa Haghani, Arun Narayanan, Michiel Bacchiani, Galen Chuang, Neeraj Gaur, Pedro Moreno, Rohit Prabhavalkar, Zhongdi Qu, Austin Waters


Deep context: end-to-end contextual speech recognition

Aug 07, 2018
Golan Pundak, Tara N. Sainath, Rohit Prabhavalkar, Anjuli Kannan, Ding Zhao


State-of-the-art Speech Recognition With Sequence-to-Sequence Models

Feb 23, 2018
Chung-Cheng Chiu, Tara N. Sainath, Yonghui Wu, Rohit Prabhavalkar, Patrick Nguyen, Zhifeng Chen, Anjuli Kannan, Ron J. Weiss, Kanishka Rao, Ekaterina Gonina, Navdeep Jaitly, Bo Li, Jan Chorowski, Michiel Bacchiani

* ICASSP camera-ready version 

Exploring Architectures, Data and Units For Streaming End-to-End Speech Recognition with RNN-Transducer

Jan 02, 2018
Kanishka Rao, Haşim Sak, Rohit Prabhavalkar

* In Proceedings of IEEE ASRU 2017 

An analysis of incorporating an external language model into a sequence-to-sequence model

Dec 06, 2017
Anjuli Kannan, Yonghui Wu, Patrick Nguyen, Tara N. Sainath, Zhifeng Chen, Rohit Prabhavalkar


No Need for a Lexicon? Evaluating the Value of the Pronunciation Lexica in End-to-End Models

Dec 05, 2017
Tara N. Sainath, Rohit Prabhavalkar, Shankar Kumar, Seungji Lee, Anjuli Kannan, David Rybach, Vlad Schogol, Patrick Nguyen, Bo Li, Yonghui Wu, Zhifeng Chen, Chung-Cheng Chiu


Minimum Word Error Rate Training for Attention-based Sequence-to-Sequence Models

Dec 05, 2017
Rohit Prabhavalkar, Tara N. Sainath, Yonghui Wu, Patrick Nguyen, Zhifeng Chen, Chung-Cheng Chiu, Anjuli Kannan


Improving the Performance of Online Neural Transducer Models

Dec 05, 2017
Tara N. Sainath, Chung-Cheng Chiu, Rohit Prabhavalkar, Anjuli Kannan, Yonghui Wu, Patrick Nguyen, Zhifeng Chen


Streaming Small-Footprint Keyword Spotting using Sequence-to-Sequence Models

Oct 26, 2017
Yanzhang He, Rohit Prabhavalkar, Kanishka Rao, Wei Li, Anton Bakhtin, Ian McGraw

* To appear in Proceedings of IEEE ASRU 2017 

On the efficient representation and execution of deep acoustic models

Dec 17, 2016
Raziel Alvarez, Rohit Prabhavalkar, Anton Bakhtin

* Accepted conference paper: "The Annual Conference of the International Speech Communication Association (Interspeech), 2016" 

On the Compression of Recurrent Neural Networks with an Application to LVCSR acoustic modeling for Embedded Speech Recognition

May 02, 2016
Rohit Prabhavalkar, Ouais Alsharif, Antoine Bruguier, Ian McGraw

* Accepted in ICASSP 2016 

Personalized Speech recognition on mobile devices

Mar 11, 2016
Ian McGraw, Rohit Prabhavalkar, Raziel Alvarez, Montse Gonzalez Arenas, Kanishka Rao, David Rybach, Ouais Alsharif, Hasim Sak, Alexander Gruenstein, Francoise Beaufays, Carolina Parada

