Quoc V. Le

The Effect of Network Width on Stochastic Gradient Descent and Generalization: an Empirical Study

May 09, 2019
Daniel S. Park, Jascha Sohl-Dickstein, Quoc V. Le, Samuel L. Smith

Via arXiv

Unsupervised Data Augmentation

Apr 29, 2019
Qizhe Xie, Zihang Dai, Eduard Hovy, Minh-Thang Luong, Quoc V. Le

Via arXiv

Attention Augmented Convolutional Networks

Apr 22, 2019
Irwan Bello, Barret Zoph, Ashish Vaswani, Jonathon Shlens, Quoc V. Le

Via arXiv

SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition

Apr 18, 2019
Daniel S. Park, William Chan, Yu Zhang, Chung-Cheng Chiu, Barret Zoph, Ekin D. Cubuk, Quoc V. Le

Via arXiv

NAS-FPN: Learning Scalable Feature Pyramid Architecture for Object Detection

Apr 16, 2019
Golnaz Ghiasi, Tsung-Yi Lin, Ruoming Pang, Quoc V. Le

Via arXiv

Soft Conditional Computation

Apr 10, 2019
Brandon Yang, Gabriel Bender, Quoc V. Le, Jiquan Ngiam

Via arXiv

The Evolved Transformer

Feb 15, 2019
David R. So, Chen Liang, Quoc V. Le

Via arXiv

Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context

Jan 18, 2019
Zihang Dai, Zhilin Yang, Yiming Yang, Jaime Carbonell, Quoc V. Le, Ruslan Salakhutdinov

Via arXiv

GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism

Dec 12, 2018
Yanping Huang, Yonglong Cheng, Dehao Chen, HyoukJoong Lee, Jiquan Ngiam, Quoc V. Le, Zhifeng Chen

Via arXiv