Yonghui Wu

Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling

Apr 13, 2021
Via arXiv

PnG BERT: Augmented BERT on Phonemes and Graphemes for Neural TTS

Apr 02, 2021
Via arXiv

Improving Longer-range Dialogue State Tracking

Feb 27, 2021
Via arXiv

Distilling Interpretable Models into Human-Readable Code

Feb 09, 2021
Via arXiv

FastEmit: Low-latency Streaming ASR with Sequence-level Emission Regularization

Oct 21, 2020
Via arXiv

Pushing the Limits of Semi-Supervised Learning for Automatic Speech Recognition

Oct 20, 2020
Via arXiv

Universal ASR: Unify and Improve Streaming ASR with Full-context Modeling

Oct 12, 2020
Via arXiv

Non-Attentive Tacotron: Robust and Controllable Neural TTS Synthesis Including Unsupervised Duration Modeling

Oct 08, 2020
Via arXiv

Improved Noisy Student Training for Automatic Speech Recognition

May 19, 2020
Via arXiv

RNN-T Models Fail to Generalize to Out-of-Domain Audio: Causes and Solutions

May 17, 2020
Via arXiv