Quoc Le

Mixture-of-Experts with Expert Choice Routing

Feb 18, 2022
Via arXiv

LaMDA: Language Models for Dialog Applications

Feb 10, 2022
Via arXiv

Chain of Thought Prompting Elicits Reasoning in Large Language Models

Jan 28, 2022
Via arXiv

Program Synthesis with Large Language Models

Aug 16, 2021
Via arXiv

A Full-stack Accelerator Search Technique for Vision Applications

May 26, 2021
Via arXiv

SpeechStew: Simply Mix All Available Speech Recognition Data to Train One Large Neural Network

Apr 27, 2021
Via arXiv

Carbon Emissions and Large Neural Network Training

Apr 23, 2021
Via arXiv

Searching for Fast Model Families on Datacenter Accelerators

Feb 10, 2021
Via arXiv

Training EfficientNets at Supercomputer Scale: 83% ImageNet Top-1 Accuracy in One Hour

Nov 05, 2020
Via arXiv

Efficient Scale-Permuted Backbone with Learned Resource Distribution

Oct 22, 2020
Via arXiv