Yonghui Wu

ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context

May 16, 2020
Wei Han, Zhengdong Zhang, Yu Zhang, Jiahui Yu, Chung-Cheng Chiu, James Qin, Anmol Gulati, Ruoming Pang, Yonghui Wu

Interpretable Learning-to-Rank with Generalized Additive Models

May 14, 2020
Honglei Zhuang, Xuanhui Wang, Michael Bendersky, Alexander Grushetsky, Yonghui Wu, Petr Mitrichev, Ethan Sterling, Nathan Bell, Walker Ravina, Hai Qian

Leveraging Monolingual Data with Self-Supervision for Multilingual Neural Machine Translation

May 11, 2020
Aditya Siddhant, Ankur Bapna, Yuan Cao, Orhan Firat, Mia Chen, Sneha Kudugunta, Naveen Arivazhagan, Yonghui Wu

RNN-T Models Fail to Generalize to Out-of-Domain Audio: Causes and Solutions

May 07, 2020
Chung-Cheng Chiu, Arun Narayanan, Wei Han, Rohit Prabhavalkar, Yu Zhang, Navdeep Jaitly, Ruoming Pang, Tara N. Sainath, Patrick Nguyen, Liangliang Cao, Yonghui Wu

A Streaming On-Device End-to-End Model Surpassing Server-Side Conventional Model Quality and Latency

Mar 28, 2020
Tara N. Sainath, Yanzhang He, Bo Li, Arun Narayanan, Ruoming Pang, Antoine Bruguier, Shuo-yiin Chang, Wei Li, Raziel Alvarez, Zhifeng Chen, Chung-Cheng Chiu, David Garcia, Alex Gruenstein, Ke Hu, Minho Jin, Anjuli Kannan, Qiao Liang, Ian McGraw, Cal Peyser, Rohit Prabhavalkar, Golan Pundak, David Rybach, Yuan Shangguan, Yash Sheth, Trevor Strohman, Mirko Visontai, Yonghui Wu, Yu Zhang, Ding Zhao

Fully-hierarchical fine-grained prosody modeling for interpretable speech synthesis

Feb 06, 2020
Guangzhi Sun, Yu Zhang, Ron J. Weiss, Yuan Cao, Heiga Zen, Yonghui Wu

Generating diverse and natural text-to-speech samples using a quantized fine-grained VAE and auto-regressive prosody prior

Feb 06, 2020
Guangzhi Sun, Yu Zhang, Ron J. Weiss, Yuan Cao, Heiga Zen, Andrew Rosenberg, Bhuvana Ramabhadran, Yonghui Wu
