Sameer Kumar

Designing Effective Sparse Expert Models

Feb 17, 2022
Barret Zoph, Irwan Bello, Sameer Kumar, Nan Du, Yanping Huang, Jeff Dean, Noam Shazeer, William Fedus

Via arXiv

Exploring the limits of Concurrency in ML Training on Google TPUs

Nov 07, 2020
Sameer Kumar, James Bradbury, Cliff Young, Yu Emma Wang, Anselm Levskaya, Blake Hechtman, Dehao Chen, HyoukJoong Lee, Mehmet Deveci, Naveen Kumar, Pankaj Kanwar, Shibo Wang, Skye Wanderman-Milne, Steve Lacy, Tao Wang, Tayo Oguntebi, Yazhou Zu, Yuanzhong Xu, Andy Swing

Via arXiv

Highly Available Data Parallel ML training on Mesh Networks

Nov 06, 2020
Sameer Kumar, Norm Jouppi

Via arXiv

Training EfficientNets at Supercomputer Scale: 83% ImageNet Top-1 Accuracy in One Hour

Nov 05, 2020
Arissa Wongpanich, Hieu Pham, James Demmel, Mingxing Tan, Quoc Le, Yang You, Sameer Kumar

Via arXiv

83% ImageNet Accuracy in One Hour

Oct 30, 2020
Arissa Wongpanich, Hieu Pham, James Demmel, Mingxing Tan, Quoc Le, Yang You, Sameer Kumar

Via arXiv

Scale MLPerf-0.6 models on Google TPU-v3 Pods

Oct 02, 2019
Sameer Kumar, Victor Bitorff, Dehao Chen, Chiachen Chou, Blake Hechtman, HyoukJoong Lee, Naveen Kumar, Peter Mattson, Shibo Wang, Tao Wang, Yuanzhong Xu, Zongwei Zhou

Via arXiv

Image Classification at Supercomputer Scale

Dec 02, 2018
Chris Ying, Sameer Kumar, Dehao Chen, Tao Wang, Youlong Cheng

Via arXiv

PowerAI DDL

Aug 07, 2017
Minsik Cho, Ulrich Finkler, Sameer Kumar, David Kung, Vaibhav Saxena, Dheeraj Sreedhar

Via arXiv