Alert button

Flan-MoE: Scaling Instruction-Finetuned Language Models with Sparse Mixture of Experts

May 24, 2023
Sheng Shen, Le Hou, Yanqi Zhou, Nan Du, Shayne Longpre, Jason Wei, Hyung Won Chung, Barret Zoph, William Fedus, Xinyun Chen, Tu Vu, Yuexin Wu, Wuyang Chen, Albert Webson, Yunxuan Li, Vincent Zhao, Hongkun Yu, Kurt Keutzer, Trevor Darrell, Denny Zhou

Figure 1 for Flan-MoE: Scaling Instruction-Finetuned Language Models with Sparse Mixture of Experts
Figure 2 for Flan-MoE: Scaling Instruction-Finetuned Language Models with Sparse Mixture of Experts
Figure 3 for Flan-MoE: Scaling Instruction-Finetuned Language Models with Sparse Mixture of Experts
Figure 4 for Flan-MoE: Scaling Instruction-Finetuned Language Models with Sparse Mixture of Experts

Share this with someone who'll enjoy it:

View paper onarxiv icon

Share this with someone who'll enjoy it: