MCR-DL: Mix-and-Match Communication Runtime for Deep Learning


Mar 15, 2023
Quentin Anthony, Ammar Ahmad Awan, Jeff Rasley, Yuxiong He, Aamir Shafi, Mustafa Abduljabbar, Hari Subramoni, Dhabaleswar Panda

* Accepted, to be presented at IPDPS 2023 


A Novel Tensor-Expert Hybrid Parallelism Approach to Scale Mixture-of-Experts Training


Mar 11, 2023
Siddharth Singh, Olatunji Ruwase, Ammar Ahmad Awan, Samyam Rajbhandari, Yuxiong He, Abhinav Bhatele


DeepSpeed Inference: Enabling Efficient Inference of Transformer Models at Unprecedented Scale


Jun 30, 2022
Reza Yazdani Aminabadi, Samyam Rajbhandari, Minjia Zhang, Ammar Ahmad Awan, Cheng Li, Du Li, Elton Zheng, Jeff Rasley, Shaden Smith, Olatunji Ruwase, Yuxiong He


DeepSpeed-MoE: Advancing Mixture-of-Experts Inference and Training to Power Next-Generation AI Scale


Jan 14, 2022
Samyam Rajbhandari, Conglong Li, Zhewei Yao, Minjia Zhang, Reza Yazdani Aminabadi, Ammar Ahmad Awan, Jeff Rasley, Yuxiong He


Scalable and Efficient MoE Training for Multitask Multilingual Models


Sep 22, 2021
Young Jin Kim, Ammar Ahmad Awan, Alexandre Muzio, Andres Felipe Cruz Salinas, Liyang Lu, Amr Hendy, Samyam Rajbhandari, Yuxiong He, Hany Hassan Awadalla


1-bit LAMB: Communication Efficient Large-Scale Large-Batch Training with LAMB's Convergence Speed


Apr 13, 2021
Conglong Li, Ammar Ahmad Awan, Hanlin Tang, Samyam Rajbhandari, Yuxiong He


1-bit Adam: Communication Efficient Large-Scale Training with Adam's Convergence Speed


Feb 04, 2021
Hanlin Tang, Shaoduo Gan, Ammar Ahmad Awan, Samyam Rajbhandari, Conglong Li, Xiangru Lian, Ji Liu, Ce Zhang, Yuxiong He

* arXiv admin note: text overlap with arXiv:2008.11343 


HyPar-Flow: Exploiting MPI and Keras for Scalable Hybrid-Parallel DNN Training using TensorFlow


Nov 12, 2019
Ammar Ahmad Awan, Arpan Jain, Quentin Anthony, Hari Subramoni, Dhabaleswar K. Panda

* 15 pages, 16 figures, under double-blind review at a conference 
