Title: A High-Throughput Solver for Marginalized Graph Kernels on GPU

Oct 16, 2019

Yu-Hang Tang, Oguz Selvitopi, Doru Popovici, Aydın Buluç

Abstract: We present the design and optimization of a solver for efficient, high-throughput computation of the marginalized graph kernel on general-purpose GPUs. The graph kernel is computed by using the conjugate gradient method to solve a linear system involving a generalized Laplacian of the tensor product between a pair of graphs. To cope with the large gap between the instruction throughput and the memory bandwidth of GPUs, our solver forms the graph tensor product on-the-fly without storing it in memory. This is achieved by having the threads in a warp cooperatively stream the adjacency and edge label matrices of the individual graphs in small square matrix blocks called tiles, which are then staged in registers and shared memory for later reuse. Warps within a thread block can further share tiles via shared memory to increase data reuse. We exploit the sparsity of the graphs hierarchically by storing only non-empty tiles in a coordinate format and the nonzero elements within each tile using bitmaps. We propose a new partition-based reordering algorithm that aggregates the nonzero elements of the graphs into fewer but denser tiles to further exploit sparsity. We carry out extensive theoretical analyses of the graph tensor product primitives for tiles of various densities and evaluate their performance on synthetic and real-world datasets. Our solver delivers three to four orders of magnitude speedup over existing CPU-based solvers such as GraKeL and GraphKernels. The capability of the solver enables kernel-based learning tasks at unprecedented scales.
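
The idea of forming the tensor product on-the-fly can be illustrated with a small CPU sketch. The code below is not the paper's GPU implementation: it is plain Python, the function names kron_matvec and kron_dense are hypothetical, and it assumes unlabeled graphs given as dense adjacency matrices, so edge labels, tiles, bitmaps, and the diagonal terms of the generalized Laplacian are all left out. It only shows that the matrix-vector product needed inside a conjugate gradient iteration can be computed from the two small adjacency matrices directly, without ever storing their (n*m) x (n*m) product.

```python
def kron_matvec(A, B, x):
    """Return (A kron B) @ x without materializing A kron B.

    A is n x n, B is m x m (lists of lists), and x has length n*m,
    laid out so that x[j*m + l] pairs node j of A with node l of B.
    """
    n, m = len(A), len(B)
    y = [0.0] * (n * m)
    for i in range(n):
        for j in range(n):
            a = A[i][j]
            if a == 0:
                continue  # no edge in A: the whole m x m block is zero
            for k in range(m):
                for l in range(m):
                    # Entry (i*m + k, j*m + l) of A kron B is A[i][j] * B[k][l].
                    y[i * m + k] += a * B[k][l] * x[j * m + l]
    return y


def kron_dense(A, B):
    """Reference only: build the full (n*m) x (n*m) product explicitly."""
    n, m = len(A), len(B)
    K = [[0.0] * (n * m) for _ in range(n * m)]
    for i in range(n):
        for j in range(n):
            for k in range(m):
                for l in range(m):
                    K[i * m + k][j * m + l] = A[i][j] * B[k][l]
    return K


# Usage: a 2-node path graph and a 3-node path graph.
A = [[0, 1], [1, 0]]
B = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
print(kron_matvec(A, B, x))  # -> [5.0, 10.0, 5.0, 2.0, 4.0, 2.0]
```

The on-the-fly version touches only n*n + m*m stored matrix entries, whereas kron_dense stores (n*m)^2 entries. That memory saving is the motivation the abstract gives for recomputing the product instead of loading it.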
