Alexander Heinecke

Optimizing Deep Learning Recommender Systems' Training On CPU Cluster Architectures

May 10, 2020
Dhiraj Kalamkar, Evangelos Georganas, Sudarshan Srinivasan, Jianping Chen, Mikhail Shiryaev, Alexander Heinecke

Via arXiv

PolyScientist: Automatic Loop Transformations Combined with Microkernels for Optimization of Deep Learning Primitives

Feb 06, 2020
Sanket Tavarageri, Alexander Heinecke, Sasikanth Avancha, Gagandeep Goyal, Ramakrishna Upadrasta, Bharat Kaul

Via arXiv

Training Neural Machine Translation (NMT) Models using Tensor Train Decomposition on TensorFlow (T3F)

Nov 05, 2019
Amelia Drew, Alexander Heinecke

Via arXiv

High-Performance Deep Learning via a Single Building Block

Jun 18, 2019
Evangelos Georganas, Kunal Banerjee, Dhiraj Kalamkar, Sasikanth Avancha, Anand Venkat, Michael Anderson, Greg Henry, Hans Pabst, Alexander Heinecke

Via arXiv

A Study of BFLOAT16 for Deep Learning Training

Jun 13, 2019
Dhiraj Kalamkar, Dheevatsa Mudigere, Naveen Mellempudi, Dipankar Das, Kunal Banerjee, Sasikanth Avancha, Dharma Teja Vooturi, Nataraj Jammalamadaka, Jianyu Huang, Hector Yuen, Jiyan Yang, Jongsoo Park, Alexander Heinecke, Evangelos Georganas, Sudarshan Srinivasan, Abhisek Kundu, Misha Smelyanskiy, Bharat Kaul, Pradeep Dubey

Via arXiv

ISA Mapper: A Compute and Hardware Agnostic Deep Learning Compiler

Oct 12, 2018
Matthew Sotoudeh, Anand Venkat, Michael Anderson, Evangelos Georganas, Alexander Heinecke, Jason Knight

Via arXiv

Mixed Precision Training of Convolutional Neural Networks using Integer Operations

Feb 23, 2018
Dipankar Das, Naveen Mellempudi, Dheevatsa Mudigere, Dhiraj Kalamkar, Sasikanth Avancha, Kunal Banerjee, Srinivas Sridharan, Karthik Vaidyanathan, Bharat Kaul, Evangelos Georganas, Alexander Heinecke, Pradeep Dubey, Jesus Corbal, Nikita Shustrov, Roma Dubtsov, Evarist Fomenko, Vadim Pirogov

Via arXiv