Ruoming Pang

Vector-quantized Image Modeling with Improved VQGAN


Oct 09, 2021
Jiahui Yu, Xin Li, Jing Yu Koh, Han Zhang, Ruoming Pang, James Qin, Alexander Ku, Yuanzhong Xu, Jason Baldridge, Yonghui Wu

* Preprint 


BigSSL: Exploring the Frontier of Large-Scale Semi-Supervised Learning for Automatic Speech Recognition


Oct 01, 2021
Yu Zhang, Daniel S. Park, Wei Han, James Qin, Anmol Gulati, Joel Shor, Aren Jansen, Yuanzhong Xu, Yanping Huang, Shibo Wang, Zongwei Zhou, Bo Li, Min Ma, William Chan, Jiahui Yu, Yongqiang Wang, Liangliang Cao, Khe Chai Sim, Bhuvana Ramabhadran, Tara N. Sainath, Françoise Beaufays, Zhifeng Chen, Quoc V. Le, Chung-Cheng Chiu, Ruoming Pang, Yonghui Wu

* 14 pages, 7 figures, 13 tables; v2: minor corrections, reference baselines and bibliography updated 


W2v-BERT: Combining Contrastive Learning and Masked Language Modeling for Self-Supervised Speech Pre-Training


Aug 07, 2021
Yu-An Chung, Yu Zhang, Wei Han, Chung-Cheng Chiu, James Qin, Ruoming Pang, Yonghui Wu



GSPMD: General and Scalable Parallelization for ML Computation Graphs


May 10, 2021
Yuanzhong Xu, HyoukJoong Lee, Dehao Chen, Blake Hechtman, Yanping Huang, Rahul Joshi, Maxim Krikun, Dmitry Lepikhin, Andy Ly, Marcello Maggioni, Ruoming Pang, Noam Shazeer, Shibo Wang, Tao Wang, Yonghui Wu, Zhifeng Chen



Scaling End-to-End Models for Large-Scale Multilingual ASR


Apr 30, 2021
Bo Li, Ruoming Pang, Tara N. Sainath, Anmol Gulati, Yu Zhang, James Qin, Parisa Haghani, W. Ronny Huang, Min Ma

* Submitted to INTERSPEECH 2021 


Bridging the gap between streaming and non-streaming ASR systems by distilling ensembles of CTC and RNN-T models


Apr 25, 2021
Thibault Doutre, Wei Han, Chung-Cheng Chiu, Ruoming Pang, Olivier Siohan, Liangliang Cao



Searching for Fast Model Families on Datacenter Accelerators


Feb 10, 2021
Sheng Li, Mingxing Tan, Ruoming Pang, Andrew Li, Liqun Cheng, Quoc Le, Norman P. Jouppi



Transformer Based Deliberation for Two-Pass Speech Recognition


Jan 27, 2021
Ke Hu, Ruoming Pang, Tara N. Sainath, Trevor Strohman



Cascaded encoders for unifying streaming and non-streaming ASR


Oct 27, 2020
Arun Narayanan, Tara N. Sainath, Ruoming Pang, Jiahui Yu, Chung-Cheng Chiu, Rohit Prabhavalkar, Ehsan Variani, Trevor Strohman



Unsupervised Learning of Disentangled Speech Content and Style Representation


Oct 24, 2020
Andros Tjandra, Ruoming Pang, Yu Zhang, Shigeki Karita

* Submitted to ICASSP 2021 


Improving Streaming Automatic Speech Recognition With Non-Streaming Model Distillation On Unsupervised Data


Oct 22, 2020
Thibault Doutre, Wei Han, Min Ma, Zhiyun Lu, Chung-Cheng Chiu, Ruoming Pang, Arun Narayanan, Ananya Misra, Yu Zhang, Liangliang Cao



FastEmit: Low-latency Streaming ASR with Sequence-level Emission Regularization


Oct 21, 2020
Jiahui Yu, Chung-Cheng Chiu, Bo Li, Shuo-yiin Chang, Tara N. Sainath, Yanzhang He, Arun Narayanan, Wei Han, Anmol Gulati, Yonghui Wu, Ruoming Pang

* tech report 


Pushing the Limits of Semi-Supervised Learning for Automatic Speech Recognition


Oct 20, 2020
Yu Zhang, James Qin, Daniel S. Park, Wei Han, Chung-Cheng Chiu, Ruoming Pang, Quoc V. Le, Yonghui Wu

* 11 pages, 3 figures, 5 tables. Submitted to NeurIPS SAS 2020 Workshop 


Universal ASR: Unify and Improve Streaming ASR with Full-context Modeling


Oct 12, 2020
Jiahui Yu, Wei Han, Anmol Gulati, Chung-Cheng Chiu, Bo Li, Tara N. Sainath, Yonghui Wu, Ruoming Pang

* tech report 


Parallel Rescoring with Transformer for Streaming On-Device Speech Recognition


Sep 02, 2020
Wei Li, James Qin, Chung-Cheng Chiu, Ruoming Pang, Yanzhang He

* Proceedings of Interspeech, 2020 


Improving Tail Performance of a Deliberation E2E ASR Model Using a Large Text Corpus


Aug 25, 2020
Cal Peyser, Sepand Mavandadi, Tara N. Sainath, James Apfel, Ruoming Pang, Shankar Kumar



RNN-T Models Fail to Generalize to Out-of-Domain Audio: Causes and Solutions


May 17, 2020
Chung-Cheng Chiu, Arun Narayanan, Wei Han, Rohit Prabhavalkar, Yu Zhang, Navdeep Jaitly, Ruoming Pang, Tara N. Sainath, Patrick Nguyen, Liangliang Cao, Yonghui Wu

* Submitted to Interspeech 2020 


Dynamic Sparsity Neural Networks for Automatic Speech Recognition


May 16, 2020
Zhaofeng Wu, Ding Zhao, Qiao Liang, Jiahui Yu, Anmol Gulati, Ruoming Pang

* Submitted to INTERSPEECH 2020 


Conformer: Convolution-augmented Transformer for Speech Recognition


May 16, 2020
Anmol Gulati, James Qin, Chung-Cheng Chiu, Niki Parmar, Yu Zhang, Jiahui Yu, Wei Han, Shibo Wang, Zhengdong Zhang, Yonghui Wu, Ruoming Pang

* Submitted to Interspeech 2020 


ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context


May 16, 2020
Wei Han, Zhengdong Zhang, Yu Zhang, Jiahui Yu, Chung-Cheng Chiu, James Qin, Anmol Gulati, Ruoming Pang, Yonghui Wu

* Submitted to Interspeech 2020 


A Streaming On-Device End-to-End Model Surpassing Server-Side Conventional Model Quality and Latency


Mar 28, 2020
Tara N. Sainath, Yanzhang He, Bo Li, Arun Narayanan, Ruoming Pang, Antoine Bruguier, Shuo-yiin Chang, Wei Li, Raziel Alvarez, Zhifeng Chen, Chung-Cheng Chiu, David Garcia, Alex Gruenstein, Ke Hu, Minho Jin, Anjuli Kannan, Qiao Liang, Ian McGraw, Cal Peyser, Rohit Prabhavalkar, Golan Pundak, David Rybach, Yuan Shangguan, Yash Sheth, Trevor Strohman, Mirko Visontai, Yonghui Wu, Yu Zhang, Ding Zhao

* In Proceedings of IEEE ICASSP 2020 


BigNAS: Scaling Up Neural Architecture Search with Big Single-Stage Models


Mar 24, 2020
Jiahui Yu, Pengchong Jin, Hanxiao Liu, Gabriel Bender, Pieter-Jan Kindermans, Mingxing Tan, Thomas Huang, Xiaodan Song, Ruoming Pang, Quoc Le

* Technical report 


Deliberation Model Based Two-Pass End-to-End Speech Recognition


Mar 17, 2020
Ke Hu, Tara N. Sainath, Ruoming Pang, Rohit Prabhavalkar



EfficientDet: Scalable and Efficient Object Detection


Nov 20, 2019
Mingxing Tan, Ruoming Pang, Quoc V. Le



A comparison of end-to-end models for long-form speech recognition


Nov 06, 2019
Chung-Cheng Chiu, Wei Han, Yu Zhang, Ruoming Pang, Sergey Kishchenko, Patrick Nguyen, Arun Narayanan, Hank Liao, Shuyuan Zhang, Anjuli Kannan, Rohit Prabhavalkar, Zhifeng Chen, Tara Sainath, Yonghui Wu

* ASRU camera-ready version 


Two-Pass End-to-End Speech Recognition


Aug 29, 2019
Tara N. Sainath, Ruoming Pang, David Rybach, Yanzhang He, Rohit Prabhavalkar, Wei Li, Mirkó Visontai, Qiao Liang, Trevor Strohman, Yonghui Wu, Ian McGraw, Chung-Cheng Chiu

