This is the repository for the KDD'25 submission *CompressGNN: Accelerating Graph Neural Network Training via Hierarchical Compression*.
The codebase is organized in the following directory structure:
```
.
├── benchmark
├── dataset
├── experiment
├── genData
├── install.sh
├── layer
├── LICENSE
├── loader
├── model
├── README.md
├── requirements.txt
├── src
├── test
└── third_party
```

## Installation

To use this repository, please follow the steps below to install the required dependencies.
- Python 3.8.13
- pip 23.0.1
- CUDA 11.6
- Clone the repository.
- Navigate to the project directory.
- Install the required dependencies using pip:

```
pip install -r requirements.txt
```

This command will install all the necessary libraries and packages, including:
- NumPy
- PyTorch
- PyTorch Geometric (PyG)
- Deep Graph Library (DGL)
- torch_scatter
- torch_sparse
- pybind
- ...
- Install CompressGNN:

```
bash install.sh
```

## Dataset

We have uploaded the small datasets Cora and cnr-2000. Users can generate datasets in the format required by our loaders from WebGraph and PyTorch Geometric sources. Each dataset folder contains the following files:
```
.
├── csr_elist.npy
├── csr_vlist.npy
├── edge.npy
├── features.npy
├── labels.npy
├── test_mask.npy
├── train_mask.npy
└── val_mask.npy
```

Generate a dataset from a PyTorch Geometric source:

```
cd genData
python createTorchDataset.py <input data folder> <output data folder> coo/csr
```

Generate a compressed dataset:

```
cd genData
python createCompressDataset <input data folder> <output data folder> coo/csr
```

Scale the feature dimension of a dataset:

```
cd genData
python datagen_feature.py --data=xxx.pt --scale_factor=length --output=xxx.pt
```

Alternatively, run the provided preprocessing scripts:

```
cd genData
bash preprocess.sh
bash preprocess_compress.sh
bash datagen_feature.sh
```
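For illustration, the file layout above can be produced by hand with NumPy. This is a hedged sketch of our assumptions about the format (a COO edge array plus a CSR view as `csr_vlist`/`csr_elist`, node features, labels, and boolean split masks); the actual generation scripts in `genData` are authoritative.

```python
import numpy as np

# Toy directed graph with 4 nodes and 5 edges in COO form (row 0: sources, row 1: targets).
edges = np.array([[0, 0, 1, 2, 3],
                  [1, 2, 3, 3, 0]], dtype=np.int64)
num_nodes = 4

# Build the CSR view: neighbors of node v are csr_elist[csr_vlist[v]:csr_vlist[v + 1]].
order = np.lexsort((edges[1], edges[0]))          # sort edges by source, then target
row, col = edges[0][order], edges[1][order]
csr_vlist = np.zeros(num_nodes + 1, dtype=np.int64)
np.add.at(csr_vlist, row + 1, 1)                  # out-degree histogram
csr_vlist = np.cumsum(csr_vlist)                  # prefix sums -> row pointers
csr_elist = col

# Hypothetical features, labels, and train/val/test split masks for the toy graph.
features = np.random.rand(num_nodes, 16).astype(np.float32)
labels = np.random.randint(0, 3, size=num_nodes).astype(np.int64)
train_mask = np.array([1, 1, 0, 0], dtype=bool)
val_mask = np.array([0, 0, 1, 0], dtype=bool)
test_mask = np.array([0, 0, 0, 1], dtype=bool)

# Write the files in the layout listed above.
for name, arr in [("edge", edges), ("csr_vlist", csr_vlist),
                  ("csr_elist", csr_elist), ("features", features),
                  ("labels", labels), ("train_mask", train_mask),
                  ("val_mask", val_mask), ("test_mask", test_mask)]:
    np.save(f"{name}.npy", arr)
```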
## Benchmarks

- End-to-end training:

```
cd benchmark/end2end
bash run.sh
```

- Speedup:

```
cd benchmark/propagate/propagate
bash speedup.sh
```

- Performance with different feature dimensions:

```
cd benchmark/propagate/propagate
bash feature_scale.sh
```

- Peak memory:

```
cd benchmark/propagate/peak_memory
bash run.sh
```

- Time and accuracy:

```
cd benchmark/transform/time_accu
bash run.sh
```

- Time breakdown:

```
cd benchmark/transform/time_breakdown
bash run.sh
```
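The propagate benchmarks above time neighbor aggregation over the CSR graph. As a reference point, here is a minimal NumPy sketch of plain sum-aggregation (`out[v] = sum of features of v's neighbors`); the function name and sum semantics are our own illustration, not the repository's optimized compressed kernels.

```python
import numpy as np

def propagate_csr(csr_vlist, csr_elist, features):
    """Sum-aggregate neighbor features over a CSR graph."""
    out = np.zeros_like(features)
    for v in range(len(csr_vlist) - 1):
        nbrs = csr_elist[csr_vlist[v]:csr_vlist[v + 1]]
        if len(nbrs):
            out[v] = features[nbrs].sum(axis=0)
    return out

# Toy graph: node 0 -> {1, 2}, node 1 -> {0}, node 2 -> {}.
csr_vlist = np.array([0, 2, 3, 3])
csr_elist = np.array([1, 2, 0])
features = np.array([[1., 0.], [0., 1.], [1., 1.]])
out = propagate_csr(csr_vlist, csr_elist, features)
# out == [[1., 2.], [1., 0.], [0., 0.]]
```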