Haibin Lin

CSER: Communication-efficient SGD with Error Reset

Jul 26, 2020
Cong Xie, Shuai Zheng, Oluwasanmi Koyejo, Indranil Gupta, Mu Li, Haibin Lin

Is Network the Bottleneck of Distributed Training?

Jun 24, 2020
Zhen Zhang, Chaokun Chang, Haibin Lin, Yida Wang, Raman Arora, Xin Jin

Accelerated Large Batch Optimization of BERT Pretraining in 54 minutes

Jun 24, 2020
Shuai Zheng, Haibin Lin, Sheng Zha, Mu Li

ResNeSt: Split-Attention Networks

Apr 19, 2020
Hang Zhang, Chongruo Wu, Zhongyue Zhang, Yi Zhu, Zhi Zhang, Haibin Lin, Yue Sun, Tong He, Jonas Mueller, R. Manmatha, Mu Li, Alexander Smola

Local AdaAlter: Communication-Efficient Stochastic Gradient Descent with Adaptive Learning Rates

Nov 20, 2019
Cong Xie, Oluwasanmi Koyejo, Indranil Gupta, Haibin Lin

Deep Graph Library: Towards Efficient and Scalable Deep Learning on Graphs

Sep 03, 2019
Minjie Wang, Lingfan Yu, Da Zheng, Quan Gan, Yu Gai, Zihao Ye, Mufei Li, Jinjing Zhou, Qi Huang, Chao Ma, Ziyue Huang, Qipeng Guo, Hao Zhang, Haibin Lin, Junbo Zhao, Jinyang Li, Alexander Smola, Zheng Zhang

GluonCV and GluonNLP: Deep Learning in Computer Vision and Natural Language Processing

Jul 09, 2019
Jian Guo, He He, Tong He, Leonard Lausen, Mu Li, Haibin Lin, Xingjian Shi, Chenguang Wang, Junyuan Xie, Sheng Zha, Aston Zhang, Hang Zhang, Zhi Zhang, Zhongyue Zhang, Shuai Zheng

Dynamic Mini-batch SGD for Elastic Distributed Training: Learning in the Limbo of Resources

May 02, 2019
Haibin Lin, Hang Zhang, Yifei Ma, Tong He, Zhi Zhang, Sheng Zha, Mu Li
