Picture for Barret Zoph

Barret Zoph

Shammie

Flan-MoE: Scaling Instruction-Finetuned Language Models with Sparse Mixture of Experts

Add code
May 24, 2023
Figure 1 for Flan-MoE: Scaling Instruction-Finetuned Language Models with Sparse Mixture of Experts
Figure 2 for Flan-MoE: Scaling Instruction-Finetuned Language Models with Sparse Mixture of Experts
Figure 3 for Flan-MoE: Scaling Instruction-Finetuned Language Models with Sparse Mixture of Experts
Figure 4 for Flan-MoE: Scaling Instruction-Finetuned Language Models with Sparse Mixture of Experts
Viaarxiv icon

A Pretrainer's Guide to Training Data: Measuring the Effects of Data Age, Domain Coverage, Quality, & Toxicity

Add code
May 22, 2023
Figure 1 for A Pretrainer's Guide to Training Data: Measuring the Effects of Data Age, Domain Coverage, Quality, & Toxicity
Figure 2 for A Pretrainer's Guide to Training Data: Measuring the Effects of Data Age, Domain Coverage, Quality, & Toxicity
Figure 3 for A Pretrainer's Guide to Training Data: Measuring the Effects of Data Age, Domain Coverage, Quality, & Toxicity
Figure 4 for A Pretrainer's Guide to Training Data: Measuring the Effects of Data Age, Domain Coverage, Quality, & Toxicity
Viaarxiv icon

The Flan Collection: Designing Data and Methods for Effective Instruction Tuning

Add code
Feb 14, 2023
Figure 1 for The Flan Collection: Designing Data and Methods for Effective Instruction Tuning
Figure 2 for The Flan Collection: Designing Data and Methods for Effective Instruction Tuning
Figure 3 for The Flan Collection: Designing Data and Methods for Effective Instruction Tuning
Figure 4 for The Flan Collection: Designing Data and Methods for Effective Instruction Tuning
Viaarxiv icon

Scaling Instruction-Finetuned Language Models

Add code
Oct 20, 2022
Figure 1 for Scaling Instruction-Finetuned Language Models
Figure 2 for Scaling Instruction-Finetuned Language Models
Figure 3 for Scaling Instruction-Finetuned Language Models
Figure 4 for Scaling Instruction-Finetuned Language Models
Viaarxiv icon

A Review of Sparse Expert Models in Deep Learning

Add code
Sep 04, 2022
Figure 1 for A Review of Sparse Expert Models in Deep Learning
Figure 2 for A Review of Sparse Expert Models in Deep Learning
Figure 3 for A Review of Sparse Expert Models in Deep Learning
Figure 4 for A Review of Sparse Expert Models in Deep Learning
Viaarxiv icon

Emergent Abilities of Large Language Models

Add code
Jun 15, 2022
Figure 1 for Emergent Abilities of Large Language Models
Figure 2 for Emergent Abilities of Large Language Models
Figure 3 for Emergent Abilities of Large Language Models
Figure 4 for Emergent Abilities of Large Language Models
Viaarxiv icon

Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

Add code
Jun 10, 2022
Viaarxiv icon

PaLM: Scaling Language Modeling with Pathways

Add code
Apr 19, 2022
Figure 1 for PaLM: Scaling Language Modeling with Pathways
Figure 2 for PaLM: Scaling Language Modeling with Pathways
Figure 3 for PaLM: Scaling Language Modeling with Pathways
Figure 4 for PaLM: Scaling Language Modeling with Pathways
Viaarxiv icon

Designing Effective Sparse Expert Models

Add code
Feb 17, 2022
Figure 1 for Designing Effective Sparse Expert Models
Figure 2 for Designing Effective Sparse Expert Models
Figure 3 for Designing Effective Sparse Expert Models
Figure 4 for Designing Effective Sparse Expert Models
Viaarxiv icon

GLaM: Efficient Scaling of Language Models with Mixture-of-Experts

Add code
Dec 13, 2021
Figure 1 for GLaM: Efficient Scaling of Language Models with Mixture-of-Experts
Figure 2 for GLaM: Efficient Scaling of Language Models with Mixture-of-Experts
Figure 3 for GLaM: Efficient Scaling of Language Models with Mixture-of-Experts
Figure 4 for GLaM: Efficient Scaling of Language Models with Mixture-of-Experts
Viaarxiv icon