FP8 Formats for Deep Learning


Sep 12, 2022
Paulius Micikevicius, Dusan Stosic, Neil Burgess, Marius Cornea, Pradeep Dubey, Richard Grisenthwaite, Sangwon Ha, Alexander Heinecke, Patrick Judd, John Kamalu, Naveen Mellempudi, Stuart Oberman, Mohammad Shoeybi, Michael Siu, Hao Wu


FPGA-based AI Smart NICs for Scalable Distributed AI Training Systems


Apr 22, 2022
Rui Ma, Evangelos Georganas, Alexander Heinecke, Andrew Boutros, Eriko Nurvitadhi


* 5 pages, 4 figures 


DistGNN: Scalable Distributed Training for Large-Scale Graph Neural Networks


Apr 16, 2021
Vasimuddin Md, Sanchit Misra, Guixiang Ma, Ramanarayan Mohanty, Evangelos Georganas, Alexander Heinecke, Dhiraj Kalamkar, Nesreen K. Ahmed, Sasikanth Avancha


Efficient and Generic 1D Dilated Convolution Layer for Deep Learning


Apr 16, 2021
Narendra Chaudhary, Sanchit Misra, Dhiraj Kalamkar, Alexander Heinecke, Evangelos Georganas, Barukh Ziv, Menachem Adelman, Bharat Kaul


Tensor Processing Primitives: A Programming Abstraction for Efficiency and Portability in Deep Learning Workloads


Apr 14, 2021
Evangelos Georganas, Dhiraj Kalamkar, Sasikanth Avancha, Menachem Adelman, Cristina Anderson, Alexander Breuer, Narendra Chaudhary, Abhisek Kundu, Vasimuddin Md, Sanchit Misra, Ramanarayan Mohanty, Hans Pabst, Barukh Ziv, Alexander Heinecke


PolyDL: Polyhedral Optimizations for Creation of High Performance DL primitives


Jun 02, 2020
Sanket Tavarageri, Alexander Heinecke, Sasikanth Avancha, Gagandeep Goyal, Ramakrishna Upadrasta, Bharat Kaul


* arXiv admin note: substantial text overlap with arXiv:2002.02145 


Optimizing Deep Learning Recommender Systems' Training On CPU Cluster Architectures


May 10, 2020
Dhiraj Kalamkar, Evangelos Georganas, Sudarshan Srinivasan, Jianping Chen, Mikhail Shiryaev, Alexander Heinecke


PolyScientist: Automatic Loop Transformations Combined with Microkernels for Optimization of Deep Learning Primitives


Feb 06, 2020
Sanket Tavarageri, Alexander Heinecke, Sasikanth Avancha, Gagandeep Goyal, Ramakrishna Upadrasta, Bharat Kaul

