Orhan Firat

GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding

Jun 30, 2020

Leveraging Monolingual Data with Self-Supervision for Multilingual Neural Machine Translation

May 11, 2020

XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating Cross-lingual Generalization

Apr 10, 2020

On the Discrepancy between Density Estimation and Sequence Generation

Feb 17, 2020

Controlling Computation versus Quality for Neural Sequence Models

Feb 17, 2020

Fill in the Blanks: Imputing Missing Sentences for Larger-Context Neural Machine Translation

Oct 30, 2019

On the Importance of Word Boundaries in Character-level Neural Machine Translation

Oct 21, 2019

Simple, Scalable Adaptation for Neural Machine Translation

Sep 18, 2019

Adaptive Scheduling for Multi-Task Learning

Sep 13, 2019

Investigating Multilingual NMT Representations at Scale

Sep 11, 2019