Title: Time-Based Roofline for Deep Learning Performance Analysis

Sep 22, 2020

Yunsong Wang, Charlene Yang, Steven Farrell, Yan Zhang, Thorsten Kurth, Samuel Williams

Abstract: Deep learning applications are usually very compute-intensive and require a long run time for training and inference. This has been tackled by researchers from both hardware and software sides, and in this paper, we propose a Roofline-based approach to performance analysis to facilitate the optimization of these applications. This approach is an extension of the Roofline model widely used in traditional high-performance computing applications, and it incorporates both compute/bandwidth complexity and run time in its formulae to provide insights into deep learning-specific characteristics. We take two sets of representative kernels, 2D convolution and long short-term memory, to validate and demonstrate the use of this new approach, and investigate how arithmetic intensity, cache locality, auto-tuning, kernel launch overhead, and Tensor Core usage can affect performance. Compared to the common ad-hoc approach, this study helps form a more systematic way to analyze code performance and identify optimization opportunities for deep learning applications.
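To make the idea in the abstract concrete, here is a minimal sketch of the classic Roofline bound and of a run-time lower bound derived from the same quantities. It is an illustration only, not the paper's exact formulation: the function names and the hardware numbers in the usage lines are hypothetical, and the sketch assumes a single memory level and ignores effects the abstract mentions, such as kernel launch overhead and cache locality.

```python
def attainable_flops(arithmetic_intensity, peak_flops, peak_bandwidth):
    """Classic Roofline: attainable FLOP/s for a given arithmetic intensity.

    arithmetic_intensity is in FLOPs per byte, peak_flops in FLOP/s,
    peak_bandwidth in bytes/s.
    """
    return min(peak_flops, arithmetic_intensity * peak_bandwidth)


def time_lower_bound(flops, bytes_moved, peak_flops, peak_bandwidth):
    """Run-time view of the same bound (an assumed simplification).

    A kernel cannot finish faster than the slower of its compute time
    and its data-movement time.
    """
    compute_time = flops / peak_flops
    memory_time = bytes_moved / peak_bandwidth
    return max(compute_time, memory_time)


def limiting_resource(flops, bytes_moved, peak_flops, peak_bandwidth):
    """Report which term dominates the time lower bound."""
    compute_time = flops / peak_flops
    memory_time = bytes_moved / peak_bandwidth
    return "compute" if compute_time >= memory_time else "memory"


# Hypothetical device: 1 TFLOP/s peak compute, 100 GB/s peak bandwidth.
PEAK_FLOPS = 1e12
PEAK_BANDWIDTH = 1e11

# Hypothetical kernel: 2 GFLOP of work moving 1 GB of data (AI = 2 FLOPs/byte).
kernel_flops = 2e9
kernel_bytes = 1e9

bound = time_lower_bound(kernel_flops, kernel_bytes, PEAK_FLOPS, PEAK_BANDWIDTH)
resource = limiting_resource(kernel_flops, kernel_bytes, PEAK_FLOPS, PEAK_BANDWIDTH)
ceiling = attainable_flops(kernel_flops / kernel_bytes, PEAK_FLOPS, PEAK_BANDWIDTH)
```

Under these assumed numbers the kernel is limited by memory traffic, so raising its arithmetic intensity (for example through better data reuse) would lower the time bound, whereas a faster compute unit alone would not.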

* 9 pages
