Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model



Shaden Smith, Mostofa Patwary, Brandon Norick, Patrick LeGresley, Samyam Rajbhandari, Jared Casper, Zhun Liu, Shrimai Prabhumoye, George Zerveas, Vijay Korthikanti, Elton Zhang, Rewon Child, Reza Yazdani Aminabadi, Julie Bernauer, Xia Song, Mohammad Shoeybi, Yuxiong He, Michael Houston, Saurabh Tiwary, Bryan Catanzaro

* Shaden Smith and Mostofa Patwary contributed equally 

Highly-scalable, physics-informed GANs for learning solutions of stochastic PDEs



Liu Yang, Sean Treichler, Thorsten Kurth, Keno Fischer, David Barajas-Solano, Josh Romero, Valentin Churavy, Alexandre Tartakovsky, Michael Houston, Prabhat, George Karniadakis

* 3rd Deep Learning on Supercomputers Workshop (DLS) at SC19 

Mixed Precision Training



Paulius Micikevicius, Sharan Narang, Jonah Alben, Gregory Diamos, Erich Elsen, David Garcia, Boris Ginsburg, Michael Houston, Oleksii Kuchaiev, Ganesh Venkatesh, Hao Wu

* Published as a conference paper at ICLR 2018 
