James Demmel

Computron: Serving Distributed Deep Learning Models with Model Parallel Swapping

Jun 24, 2023
Daniel Zou, Xinchen Jin, Xueyang Yu, Hao Zhang, James Demmel


Distributed-Memory Sparse Kernels for Machine Learning

Mar 18, 2022
Vivek Bharadwaj, Aydin Buluç, James Demmel


CoSA: Scheduling by Constrained Optimization for Spatial Accelerators

May 05, 2021
Qijing Huang, Minwoo Kang, Grace Dinh, Thomas Norell, Aravind Kalaiah, James Demmel, John Wawrzynek, Yakun Sophia Shao


Avoiding Communication in Logistic Regression

Nov 16, 2020
Aditya Devarakonda, James Demmel


Training EfficientNets at Supercomputer Scale: 83% ImageNet Top-1 Accuracy in One Hour

Nov 05, 2020
Arissa Wongpanich, Hieu Pham, James Demmel, Mingxing Tan, Quoc Le, Yang You, Sameer Kumar


83% ImageNet Accuracy in One Hour

Oct 30, 2020
Arissa Wongpanich, Hieu Pham, James Demmel, Mingxing Tan, Quoc Le, Yang You, Sameer Kumar


The Limit of the Batch Size

Jun 15, 2020
Yang You, Yuhui Wang, Huan Zhang, Zhao Zhang, James Demmel, Cho-Jui Hsieh


Auto-Precision Scaling for Distributed Deep Learning

Nov 20, 2019
Ruobing Han, Yang You, James Demmel


Reducing BERT Pre-Training Time from 3 Days to 76 Minutes

Apr 01, 2019
Yang You, Jing Li, Jonathan Hseu, Xiaodan Song, James Demmel, Cho-Jui Hsieh


Large-Batch Training for LSTM and Beyond

Jan 24, 2019
Yang You, Jonathan Hseu, Chris Ying, James Demmel, Kurt Keutzer, Cho-Jui Hsieh
