
DistributedGNN

Introduction

DistributedGNN is a research project supported by a scholarship from the University of Pisa. It investigates and implements pipeline parallelism for Graph Neural Networks (GNNs). The case study is based on the methodology outlined in the paper Vision GNN: An Image is Worth Graph of Nodes by Han et al. (2022) [1], published in Advances in Neural Information Processing Systems (NeurIPS). The project explores how to distribute GNN computations efficiently across multiple computing nodes, with a particular emphasis on pipeline parallelism, and also examines combining pipeline parallelism with data parallelism to improve performance and scalability in large-scale GNN training.

More details on this study, including technical insights and experimental results, are available in the project's report, report.pdf.

Repository Structure

The project code is located in the src directory. Here's a breakdown of the key files:

  • model.py: Implementation of the model, in both the sequential and the pipelined version (see the sketch after this list).
  • seq.py: Script for running the sequential GNN model.
  • pipe.py: Script for running the GNN model with pipeline parallelism.
  • data_pipe.py: Script for running the GNN model with combined data and pipeline parallelism.
  • report.pdf: Project report with detailed technical insights and experimental results.
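
For intuition, here is a minimal sketch (not the code in model.py) of how a ViG-style backbone can be cut into two pipeline stages. The layer names and sizes below are illustrative assumptions:

    import torch.nn as nn

    # Illustrative ViG-like backbone: a patch stem, a stack of stand-ins for
    # graph-convolution blocks, and a classification head. All names and
    # sizes here are assumptions, not those used in model.py.
    blocks = [nn.Sequential(nn.Linear(192, 192), nn.GELU()) for _ in range(8)]
    model = nn.Sequential(
        nn.Linear(3 * 16 * 16, 192),  # stem: flattened patches -> node features
        *blocks,                      # stand-ins for the GNN blocks
        nn.Linear(192, 1000),         # classification head
    )

    # A 2-stage split cuts the sequence in half; each stage runs in its own
    # process/node and exchanges activations at the cut point.
    cut = len(model) // 2
    stage0 = nn.Sequential(*list(model.children())[:cut])
    stage1 = nn.Sequential(*list(model.children())[cut:])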

Installation

  1. Clone the repository to your local machine:
    git clone https://github.com/JacopoRaffi/DistributedGNN.git
    cd DistributedGNN
  2. Install all the dependencies:
    pip install -r requirements.txt

Usage

Sequential

Running the sequential model:

   cd src
   python3 seq.py --filename log_file.csv

Pipeline Parallelism

Example of running a 2-stage pipeline:

   cd src
   torchrun --nproc_per_node=1 --nnodes=2 pipe.py --filename log_file.csv
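
The command above launches one process per node (torchrun sets RANK, WORLD_SIZE, and the rendezvous variables in each process's environment). As a rough idea of what each rank does per step, here is a hedged sketch of a single pipelined forward pass using point-to-point communication; pipe.py may schedule micro-batches differently (e.g. GPipe-style [2]), and the stages below are placeholders:

    import torch
    import torch.nn as nn
    import torch.distributed as dist

    # torchrun provides RANK/WORLD_SIZE/MASTER_ADDR/MASTER_PORT via env vars.
    dist.init_process_group(backend="gloo")
    rank = dist.get_rank()

    # Placeholder stages; the real split lives in model.py.
    stage0 = nn.Sequential(nn.Linear(768, 192), nn.GELU())
    stage1 = nn.Sequential(nn.Linear(192, 1000))

    if rank == 0:
        x = torch.randn(32, 768)          # dummy micro-batch
        dist.send(stage0(x), dst=1)       # ship activations to the next stage
    else:
        act = torch.empty(32, 192)        # must match stage 0's output shape
        dist.recv(act, src=0)
        out = stage1(act)                 # finish the forward pass

    dist.destroy_process_group()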

Data + Pipeline Parallelism

Running the combination with two model copies, each split into a 2-stage pipeline:

   cd src
   torchrun --nproc_per_node=1 --nnodes=4 data_pipe.py --filename log_file.csv
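
Conceptually, the four ranks form two pipeline replicas (ranks {0,1} and {2,3}), while ranks holding the same stage average their gradients across replicas. Below is a hedged sketch of that process-group layout; the actual rank arrangement in data_pipe.py may differ:

    import torch
    import torch.distributed as dist

    # Run with 4 ranks (2 replicas x 2 stages). Group shapes are assumptions.
    dist.init_process_group(backend="gloo")
    rank = dist.get_rank()

    # Every rank must call new_group for every group, in the same order.
    pipe_groups = [dist.new_group([0, 1]), dist.new_group([2, 3])]  # one per replica
    dp_groups = [dist.new_group([0, 2]), dist.new_group([1, 3])]    # one per stage

    stage = rank % 2      # which pipeline stage this rank runs
    replica = rank // 2   # which model copy this rank belongs to

    # After backward, average a parameter's gradient across the two replicas:
    grad = torch.ones(4)  # stand-in for param.grad
    dist.all_reduce(grad, op=dist.ReduceOp.SUM, group=dp_groups[stage])
    grad /= 2             # divide by the number of replicas

    dist.destroy_process_group()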

Acknowledgements

I would like to thank Prof. Patrizio Dazzi and the University of Pisa for this opportunity.

References

[1] Han, K., Wang, Y., Guo, J., Tang, Y., & Wu, E. (2022). Vision GNN: An image is worth graph of nodes. In Advances in Neural Information Processing Systems (Vol. 35, pp. 8291-8303). Curran Associates, Inc. https://proceedings.neurips.cc/paper_files/paper/2022/hash/3743e69c8e47eb2e6d3afaea80e439fb-Abstract-Conference.html

[2] Huang, Y., Cheng, Y., Bapna, A., Firat, O., Chen, D., Chen, M., Lee, H., Ngiam, J., Le, Q. V., Wu, Y., & Chen, Z. (2019). GPipe: Efficient training of giant neural networks using pipeline parallelism. In Advances in Neural Information Processing Systems (Vol. 32). Curran Associates, Inc. https://arxiv.org/abs/1811.06965