Implementation of "GNNAutoScale: Scalable and Expressive Graph Neural Networks via Historical Embeddings" in PyTorch

Last update: Dec 25, 2022

Overview

PyGAS: Auto-Scaling GNNs in PyG

PyGAS is the practical realization of our GNNAutoScale (GAS) framework, which scales arbitrary message-passing GNNs to large graphs, as described in our paper:

Matthias Fey, Jan E. Lenssen, Frank Weichert, Jure Leskovec: GNNAutoScale: Scalable and Expressive Graph Neural Networks via Historical Embeddings (ICML 2021)

GAS prunes entire sub-trees of the computation graph by utilizing historical embeddings from prior training iterations, leading to constant GPU memory consumption in respect to input mini-batch size, and maximally expressivity.

PyGAS is implemented in PyTorch and utilizes the PyTorch Geometric (PyG) library. It provides an easy-to-use interface to convert a common or custom GNN from PyG into its scalable variant:

from torch_geometric.nn import SAGEConv
from torch_geometric_autoscale import ScalableGNN
from torch_geometric_autoscale import metis, permute, SubgraphLoader

class GNN(ScalableGNN):
    def __init__(self, num_nodes, in_channels, hidden_channels,
                 out_channels, num_layers):
        # * pool_size determines the number of pinned CPU buffers
        # * buffer_size determines the size of pinned CPU buffers,
        #   i.e. the maximum number of out-of-mini-batch nodes

        super().__init__(num_nodes, hidden_channels, num_layers,
                         pool_size=2, buffer_size=5000)

        self.convs = ModuleList()
        self.convs.append(SAGEConv(in_channels, hidden_channels))
        for _ in range(num_layers - 2):
            self.convs.append(SAGEConv(hidden_channels, hidden_channels))
        self.convs.append(SAGEConv(hidden_channels, out_channels))

    def forward(self, x, adj_t, *args):
        for conv, history in zip(self.convs[:-1], self.histories):
            x = conv(x, adj_t).relu_()
            x = self.push_and_pull(history, x, *args)
        return self.convs[-1](x, adj_t)

perm, ptr = metis(data.adj_t, num_parts=40, log=True)
data = permute(data, perm, log=True)
loader = SubgraphLoader(data, ptr, batch_size=10, shuffle=True)

model = GNN(...)
for batch, *args in loader:
    out = model(batch.x, batch.adj_t, *args)

A detailed description of ScalableGNN can be found in its implementation.

Requirements

Install PyTorch >= 1.7.0
Install PyTorch Geometric >= 1.7.0:

pip install torch-scatter -f https://pytorch-geometric.com/whl/torch-${TORCH}+${CUDA}.html
pip install torch-sparse -f https://pytorch-geometric.com/whl/torch-${TORCH}+${CUDA}.html
pip install torch-geometric

where ${TORCH} should be replaced by either 1.7.0 or 1.8.0, and ${CUDA} should be replaced by either cpu, cu92, cu101, cu102, cu110 or cu111, depending on your PyTorch installation.

Installation

pip install git+https://github.com/rusty1s/pyg_autoscale.git

python setup.py install

Project Structure

torch_geometric_autoscale/ contains the source code of PyGAS
examples/ contains examples to demonstrate how to apply GAS in practice
small_benchmark/ includes experiments to evaluate GAS performance on small-scale graphs
large_benchmark/ includes experiments to evaluate GAS performance on large-scale graphs

We use Hydra to manage hyperparameter configurations.

Cite

Please cite our paper if you use this code in your own work:

@inproceedings{Fey/etal/2021,
  title={{GNNAutoScale}: Scalable and Expressive Graph Neural Networks via Historical Embeddings},
  author={Fey, M. and Lenssen, J. E. and Weichert, F. and Leskovec, J.},
  booktitle={International Conference on Machine Learning (ICML)},
  year={2021},
}

Implementation of "GNNAutoScale: Scalable and Expressive Graph Neural Networks via Historical Embeddings" in PyTorch

Related tags

Overview

PyGAS: Auto-Scaling GNNs in PyG

Requirements

Installation

Project Structure

Cite

Owner

Matthias Fey

NAS Benchmark in "Prioritized Architecture Sampling with Monto-Carlo Tree Search", CVPR2021

Prototypical Networks for Few shot Learning in PyTorch

Official Keras Implementation for UNet++ in IEEE Transactions on Medical Imaging and DLMIA 2018

Official PyTorch implementation of "Contrastive Learning from Extremely Augmented Skeleton Sequences for Self-supervised Action Recognition" in AAAI2022.

NeoDTI: Neural integration of neighbor information from a heterogeneous network for discovering new drug-target interactions

Python KNN model: Predicting a probability of getting a work visa. Tableau: Non-immigrant visas over the years.

Code for Multiple Instance Active Learning for Object Detection, CVPR 2021

Compressed Video Action Recognition

Continuum Learning with GEM: Gradient Episodic Memory

Research on controller area network Intrusion Detection Systems

An elaborate and exhaustive paper list for Named Entity Recognition (NER)

Chess reinforcement learning by AlphaGo Zero methods.

Voxel Transformer for 3D object detection

Deep learning based hand gesture recognition using LSTM and MediaPipie.

PaddleRobotics is an open-source algorithm library for robots based on Paddle, including open-source parts such as human-robot interaction, complex motion control, environment perception, SLAM positioning, and navigation.

Multitask Learning Strengthens Adversarial Robustness

pyhsmm - library for approximate unsupervised inference in Bayesian Hidden Markov Models (HMMs) and explicit-duration Hidden semi-Markov Models (HSMMs), focusing on the Bayesian Nonparametric extensions, the HDP-HMM and HDP-HSMM, mostly with weak-limit approximations.

A pytorch-based real-time segmentation model for autonomous driving

Gym environments used in the paper: "Developmental Reinforcement Learning of Control Policy of a Quadcopter UAV with Thrust Vectoring Rotors"

This repo contains code to reproduce all experiments in Equivariant Neural Rendering