Title: BGL: GPU-Efficient GNN Training by Optimizing Graph Data I/O and Preprocessing

Dec 16, 2021

Tianfeng Liu, Yangrui Chen, Dan Li, Chuan Wu, Yibo Zhu, Jun He, Yanghua Peng, Hongzheng Chen, Hongzhi Chen, Chuanxiong Guo

Abstract: Graph neural networks (GNNs) have extended the success of deep neural networks (DNNs) to non-Euclidean graph data, achieving ground-breaking performance on tasks such as node classification and graph property prediction. Nonetheless, existing systems are inefficient at training on large graphs with billions of nodes and edges using GPUs. The main bottleneck is preparing data for the GPUs, namely subgraph sampling and feature retrieval. This paper proposes BGL, a distributed GNN training system designed to address these bottlenecks with a few key ideas. First, we propose a dynamic cache engine to minimize feature retrieval traffic. By co-designing the caching policy and the order of sampling, we find a sweet spot of low overhead and high cache hit ratio. Second, we improve the graph partition algorithm to reduce cross-partition communication during subgraph sampling. Finally, careful resource isolation reduces contention between different data preprocessing stages. Extensive experiments on various GNN models and large graph datasets show that BGL outperforms existing GNN training systems by 20.68x on average.
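To illustrate why the order of sampling can matter for a feature cache, here is a minimal Python sketch. It is not BGL's implementation: the FIFO eviction policy, the toy ring graph, and every name in it (`FIFOFeatureCache`, `neighborhood`, `run`, `fetch_remote`) are assumptions made for illustration, since the abstract does not specify the caching policy or the ordering scheme. It compares the cache hit ratio when seed nodes are visited in graph-proximity order against a strided order that jumps around the graph.

```python
from collections import OrderedDict


class FIFOFeatureCache:
    """Hypothetical fixed-capacity cache: node id -> feature, FIFO eviction."""

    def __init__(self, capacity, fetch_remote):
        self.capacity = capacity
        self.fetch_remote = fetch_remote  # stand-in for a remote feature store
        self.store = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, node_id):
        if node_id in self.store:
            self.hits += 1
            return self.store[node_id]
        self.misses += 1
        feature = self.fetch_remote(node_id)
        if len(self.store) >= self.capacity:
            self.store.popitem(last=False)  # evict the oldest inserted entry
        self.store[node_id] = feature
        return feature

    def hit_ratio(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


def neighborhood(seed, num_nodes):
    """Toy 1-hop 'sampled subgraph' on a ring: the seed and its two neighbors."""
    return [(seed - 1) % num_nodes, seed, (seed + 1) % num_nodes]


def run(seed_order, num_nodes, capacity):
    """Fetch features for every seed's neighborhood and return the hit ratio."""
    cache = FIFOFeatureCache(capacity, fetch_remote=lambda n: [float(n)])
    for seed in seed_order:
        for node in neighborhood(seed, num_nodes):
            cache.get(node)
    return cache.hit_ratio()


NUM_NODES, CAPACITY = 12, 4

# Proximity-aware order: consecutive seeds are graph neighbors.
proximity_order = list(range(NUM_NODES))
# Strided order: consecutive seeds are far apart on the ring.
strided_order = [s for start in range(4) for s in range(start, NUM_NODES, 4)]

proximity_ratio = run(proximity_order, NUM_NODES, CAPACITY)
strided_ratio = run(strided_order, NUM_NODES, CAPACITY)
print(f"proximity order hit ratio: {proximity_ratio:.3f}")
print(f"strided order hit ratio:   {strided_ratio:.3f}")
```

In this toy setting the same cheap eviction policy gets far more hits when consecutive seeds share neighbors, which is one way to read the abstract's "sweet spot of low overhead and high cache hit ratio". Real sampled subgraphs, cache sizes, and policies will behave differently.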

* Under Review
