Luc Gaitskell
Graph Sampling Acceleration for GNN Learning on Multi-GPU Systems
2024–2025
Electrical Engineering and Computer Science
- Theory of Computation
Charles E. Leiserson
Graph Neural Networks (GNNs) have been shown to model certain systems, such as chemical molecules and relationship networks, with particularly high accuracy. However, the individual input graphs in useful training sets are so large that they cannot be processed on current hardware without sub-sampling. This sampling process is itself intensive, accounting for roughly 50% of training time. As a result, recent research has investigated methods to improve sampling efficiency, which allows training with more data and gives headroom for larger models. However, this research has yet to focus extensively on effectively utilizing multi-GPU training systems. Efficiently distributing sampling across GPUs could deliver further performance improvements, enabling larger, more capable models.
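The sub-sampling described above is commonly done with GraphSAGE-style fan-out neighbor sampling: for each training node, only a fixed number of neighbors is kept per hop, bounding the size of the subgraph the GNN must process. A minimal sketch of this idea in plain Python follows; the toy graph, function name, and parameters are illustrative, not taken from any particular library:

```python
import random

# Illustrative toy graph as an adjacency list: node -> list of neighbors.
graph = {
    0: [1, 2, 3, 4],
    1: [0, 2],
    2: [0, 1, 3],
    3: [0, 2, 4],
    4: [0, 3],
}

def sample_neighbors(graph, seeds, fanouts, seed=0):
    """GraphSAGE-style uniform neighbor sampling (sketch).

    For each hop, keep at most `fanout` randomly chosen neighbors per
    frontier node instead of the full neighborhood, which bounds the
    sampled subgraph size. Returns the set of visited nodes and the
    sampled edges for each hop.
    """
    rng = random.Random(seed)
    layers = []
    frontier = set(seeds)
    visited = set(seeds)
    for fanout in fanouts:
        edges = []
        next_frontier = set()
        for u in frontier:
            nbrs = graph.get(u, [])
            chosen = nbrs if len(nbrs) <= fanout else rng.sample(nbrs, fanout)
            for v in chosen:
                edges.append((u, v))
                next_frontier.add(v)
        visited |= next_frontier
        layers.append(edges)
        frontier = next_frontier
    return visited, layers

# Two-hop sample from node 0, keeping at most 2 neighbors per hop.
nodes, layers = sample_neighbors(graph, seeds=[0], fanouts=[2, 2])
print(sorted(nodes))
```

Because this loop touches the graph's edge lists node by node, it is memory-bound and hard to overlap with GPU compute, which is why sampling dominates such a large share of training time and why distributing it across GPUs is attractive.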
Through this SuperUROP, I look forward to applying the experience in machine learning and hardware architecture I have gained from previous courses and from working in industry. Software performance engineering is the area where I have the least experience, and I look forward to learning it through my research and alongside 6.1060 next semester. I hope to deepen my understanding of all three areas by exploring their intersection.