Sharan Narang


ByT5: Towards a token-free future with pre-trained byte-to-byte models

May 28, 2021

Do Transformer Modifications Transfer Across Implementations and Applications?

Feb 23, 2021

On Task-Level Dialogue Composition of Generative Transformer Model

Oct 09, 2020

WT5?! Training Text-to-Text Models to Explain their Predictions

Apr 30, 2020

Neural Assistant: Joint Action Prediction, Response Generation, and Latent Knowledge Reasoning

Oct 31, 2019

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

Oct 24, 2019

Deep Voice 3: Scaling Text-to-Speech with Convolutional Sequence Learning

Feb 22, 2018

Mixed Precision Training

Feb 15, 2018

Deep Learning Scaling is Predictable, Empirically

Dec 01, 2017

Block-Sparse Recurrent Neural Networks

Nov 08, 2017