Scalable Graph Neural Networks for Heterogeneous Graphs

Related tags

Deep LearningNARS
Overview

Neighbor Averaging over Relation Subgraphs (NARS)

NARS is an algorithm for node classification on heterogeneous graphs, based on scalable neighbor averaging techniques that have been previously used in e.g. SIGN to heterogeneous scenarios by generating neighbor-averaged features on sampled relation induced subgraphs.

For more details, please check out our paper:

Scalable Graph Neural Networks for Heterogeneous Graphs

Setup

Dependencies

  • torch==1.5.1+cu101
  • dgl-cu101==0.4.3.post2
  • ogb==1.2.1
  • dglke==0.1.0

Docker

We have prepared a dockerfile for building a container with clean environment and all required dependencies. Please checkout instructions in docker.

Data Preparation

Download and pre-process OAG dataset (optional)

If you plan to evaluate on OAG dataset, you need to follow instructions in oag_dataset to download and pre-process dataset.

Generate input for featureless node types

In academic graph datasets (ACM, MAG, OAG) in which only paper nodes are associated with input features. NARS featurizes other node types with TransE relational graph embedding pre-trained on the graph structure.

Please follow instructions in graph_embed to generate embeddings for each dataset.

Sample relation subsets

NARS samples Relation Subsets (see our paper for details). Please follow the instructions in sample_relation_subsets to generate these subsets.

Or you may skip this step and use the example subsets that have added to this repository.

Run NARS Experiments

NARS are evaluated on three academic graph datasets to predict publishing venues and fields of papers.

ACM

python3 train.py --dataset acm --use-emb TransE_acm --R 2 \
    --use-relation-subsets sample_relation_subsets/examples/acm \
    --num-hidden 64 --lr 0.003 --dropout 0.7 --eval-every 1 \
    --num-epochs 100 --input-dropout

OGBN-MAG

python3 train.py --dataset mag --use-emb TransE_mag --R 5 \
    --use-relation-subset sample_relation_subsets/examples/mag \
    --eval-batch-size 50000 --num-hidden 512 --lr 0.001 --batch-s 50000 \
    --dropout 0.5 --num-epochs 1000

OAG (venue prediction)

python3 train.py --dataset oag_venue --use-emb TransE_oag_venue --R 3 \
    --use-relation-subsets sample_relation_subsets/examples/oag_venue \
    --eval-batch-size 25000 --num-hidden 256 --lr 0.001 --batch-size 1000 \
    --data-dir oag_dataset --dropout 0.5 --num-epochs 200

OAG (L1-field prediction)

python3 train.py --dataset oag_L1 --use-emb TransE_oag_L1 --R 3 \
    --use-relation-subsets sample_relation_subsets/examples/oag_L1 \
    --eval-batch-size 25000 --num-hidden 256 --lr 0.001 --batch-size 1000 \
    --data-dir oag_dataset --dropout 0.5 --num-epochs 200

Results

Here is a summary of model performance using example relation subsets:

For ACM and OGBN-MAG dataset, the task is to predict paper publishing venue.

Dataset # Params Test Accuracy
ACM 0.40M 0.9305±0.0043
OGBN-MAG 4.13M 0.5240±0.0016

For OAG dataset, there are two different node predictions tasks: predicting venue (single-label) and L1-field (multi-label). And we follow Heterogeneous Graph Transformer to evaluate using NDCG and MRR metrics.

Task # Params NDCG MRR
Venue 2.24M 0.5214±0.0010 0.3434±0.0012
L1-field 1.41M 0.86420.0022 0.8542±0.0019

Run with limited GPU memory

The above commands were tested on Tesla V100 (32 GB) and Tesla T4 (15GB). If your GPU memory isn't enough for handling large graphs, try the following:

  • add --cpu-process to the command to move preprocessing logic to CPU
  • reduce evaluation batch size with --eval-batch-size. The evaluation result won't be affected since model is fixed.
  • reduce training batch with --batch-size

Run NARS with Reduced CPU Memory Footprint

As mentioned in our paper, using a lot of relation subsets may consume too much CPU memory. To reduce CPU memory footprint, we implemented an optimization in train_partial.py which trains part of our feature aggregation weights at a time.

Using OGBN-MAG dataset as an example, the following command randomly picks 3 subsets from all 8 sampled relation subsets and trains their aggregation weights every 10 epochs.

python3 train_partial.py --dataset mag --use-emb TransE_mag --R 5 \
    --use-relation-subsets sample_relation_subsets/examples/mag \
    --eval-batch-size 50000 --num-hidden 512 --lr 0.001 --batch-size 50000 \
    --dropout 0.5 --num-epochs 1000 --sample-size 3 --resample-every 10

Citation

Please cite our paper with:

@article{yu2020scalable,
    title={Scalable Graph Neural Networks for Heterogeneous Graphs},
    author={Yu, Lingfan and Shen, Jiajun and Li, Jinyang and Lerer, Adam},
    journal={arXiv preprint arXiv:2011.09679},
    year={2020}
}

License

NARS is CC-by-NC licensed, as found in the LICENSE file.

Owner
Facebook Research
Facebook Research
Pytorch implementation of our paper accepted by NeurIPS 2021 -- Revisiting Discriminator in GAN Compression: A Generator-discriminator Cooperative Compression Scheme

Revisiting Discriminator in GAN Compression: A Generator-discriminator Cooperative Compression Scheme (NeurIPS2021) (Link) Overview Prerequisites Linu

Shaojie Li 34 Mar 31, 2022
PyTorch implementation of "Learn to Dance with AIST++: Music Conditioned 3D Dance Generation."

Learn to Dance with AIST++: Music Conditioned 3D Dance Generation. Installation pip install -r requirements.txt Prepare Dataset bash data/scripts/pre

Zj Li 8 Sep 07, 2021
Ppq - A powerful offline neural network quantization tool with custimized IR

PPL Quantization Tool(PPL 量化工具) PPL Quantization Tool (PPQ) is a powerful offlin

605 Jan 03, 2023
Neural Message Passing for Computer Vision

Neural Message Passing for Quantum Chemistry Implementation of different models of Neural Networks on graphs as explained in the article proposed by G

Pau Riba 310 Nov 07, 2022
An e-commerce company wants to segment its customers and determine marketing strategies according to these segments.

customer_segmentation_with_rfm Business Problem : An e-commerce company wants to

Buse Yıldırım 3 Jan 06, 2022
Video Frame Interpolation with Transformer (CVPR2022)

VFIformer Official PyTorch implementation of our CVPR2022 paper Video Frame Interpolation with Transformer Dependencies python = 3.8 pytorch = 1.8.0

DV Lab 63 Dec 16, 2022
A Lighting Pytorch Framework for Recommendation System, Easy-to-use and Easy-to-extend.

Torch-RecHub A Lighting Pytorch Framework for Recommendation Models, Easy-to-use and Easy-to-extend. 安装 pip install torch-rechub 主要特性 scikit-learn风格易用

Mincai Lai 67 Jan 04, 2023
Very large and sparse networks appear often in the wild and present unique algorithmic opportunities and challenges for the practitioner

Sparse network learning with snlpy Very large and sparse networks appear often in the wild and present unique algorithmic opportunities and challenges

Andrew Stolman 1 Apr 30, 2021
Process text, including tokenizing and representing sentences as vectors and Applying some concepts like RNN, LSTM and GRU to create a classifier can detect the language in which a sentence is written from among 17 languages.

Language Identifier What is this ? The goal of this project is to create a model that is able to predict a given sentence language through text proces

Hossam Asaad 9 Dec 15, 2022
Distance Encoding for GNN Design

Distance-encoding for GNN design This repository is the official PyTorch implementation of the DEGNN and DEAGNN framework reported in the paper: Dista

172 Nov 08, 2022
The official GitHub repository for the Argoverse 2 dataset.

Argoverse 2 API Official GitHub repository for the Argoverse 2 family of datasets. If you have any questions or run into any problems with either the

Argo AI 156 Dec 23, 2022
Deep Residual Networks with 1K Layers

Deep Residual Networks with 1K Layers By Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun. Microsoft Research Asia (MSRA). Table of Contents Introduc

Kaiming He 856 Jan 06, 2023
A annotation of yolov5-5.0

代码版本:0714 commit #4000 $ git clone https://github.com/ultralytics/yolov5 $ cd yolov5 $ git checkout 720aaa65c8873c0d87df09e3c1c14f3581d4ea61 这个代码只是注释版

Laughing 229 Dec 17, 2022
A python code to convert Keras pre-trained weights to Pytorch version

Weights_Keras_2_Pytorch 最近想在Pytorch项目里使用一下谷歌的NIMA,但是发现没有预训练好的pytorch权重,于是整理了一下将Keras预训练权重转为Pytorch的代码,目前是支持Keras的Conv2D, Dense, DepthwiseConv2D, Batch

Liu Hengyu 2 Dec 16, 2021
A Marvelous ChatBot implement using PyTorch.

PyTorch Marvelous ChatBot [Update] it's 2019 now, previously model can not catch up state-of-art now. So we just move towards the future a transformer

JinTian 223 Oct 18, 2022
Code for paper "A Critical Assessment of State-of-the-Art in Entity Alignment" (https://arxiv.org/abs/2010.16314)

A Critical Assessment of State-of-the-Art in Entity Alignment This repository contains the source code for the paper A Critical Assessment of State-of

Max Berrendorf 16 Oct 14, 2022
This project is for a Twitter bot that monitors a bird feeder in my backyard. Any detected birds are identified and posted to Twitter.

Backyard Birdbot Introduction This is a silly hobby project to use existing ML models to: Detect any birds sighted by a webcam Identify whic

Chi Young Moon 71 Dec 25, 2022
Official implementation of NeuralFusion: Online Depth Map Fusion in Latent Space

NeuralFusion This is the official implementation of NeuralFusion: Online Depth Map Fusion in Latent Space. We provide code to train the proposed pipel

53 Jan 01, 2023
A Novel Incremental Learning Driven Instance Segmentation Framework to Recognize Highly Cluttered Instances of the Contraband Items

A Novel Incremental Learning Driven Instance Segmentation Framework to Recognize Highly Cluttered Instances of the Contraband Items This repository co

Taimur Hassan 3 Mar 16, 2022
Official implement of Paper:A deeply supervised image fusion network for change detection in high resolution bi-temporal remote sening images

A deeply supervised image fusion network for change detection in high resolution bi-temporal remote sensing images 深度监督影像融合网络DSIFN用于高分辨率双时相遥感影像变化检测 Of

Chenxiao Zhang 135 Dec 19, 2022