[NeurIPS-2021] Slow Learning and Fast Inference: Efficient Graph Similarity Computation via Knowledge Distillation

Overview

Efficient Graph Similarity Computation (EGSC)

This repo contains the source code and dataset for our paper:

Slow Learning and Fast Inference: Efficient Graph Similarity Computation via Knowledge Distillation
2021 Conference on Neural Information Processing Systems (NeurIPS 2021)
[Paper]

@inproceedings{qin2021slow,
  title={Slow Learning and Fast Inference: Efficient Graph Similarity Computation via Knowledge Distillation},
  author={Qin, Can and Zhao, Handong and Wang, Lichen and Wang, Huan and Zhang, Yulun and Fu, Yun},
  booktitle={Thirty-Fifth Conference on Neural Information Processing Systems},
  year={2021}
}

EGSC: illustration of knowledge distillation to obtain a fast model from an early-fusion model. Such a fast/individual model enables online inference.

Introduction

Graph Similarity Computation (GSC) is essential to a wide range of graph applications such as retrieval and plagiarism/anomaly detection. Exact computation of graph similarity, e.g., Graph Edit Distance (GED), is an NP-hard problem that cannot be solved within adequate time for large graphs. Thanks to the strong representation power of graph neural networks (GNNs), a variety of GNN-based inexact methods have emerged. To capture the subtle differences across graphs, the key to their success is dense interaction with feature fusion at an early stage, which, however, trades speed for accuracy. For Slow Learning of graph similarity, this paper proposes a novel early-fusion approach that applies a co-attention-based feature fusion network to multilevel GNN features. To further improve speed without much accuracy drop, we introduce an efficient GSC solution that distills knowledge from the slow early-fusion model into a student model for Fast Inference. The student model also enables offline collection of individual graph embeddings, speeding up inference by orders of magnitude. To address instability during knowledge transfer, we decompose the dynamic joint embedding into static pseudo-individual ones for precise teacher-student alignment. Experimental analysis on real-world datasets demonstrates the superiority of our approach over state-of-the-art methods in both accuracy and efficiency. In particular, we speed up the prior art by more than 10x on the benchmark AIDS dataset.
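To make the speed argument concrete, here is a minimal sketch (not the paper's exact architecture) of why a student model operating on individual graph embeddings is fast at inference time: the GNN encoding of every database graph can be cached offline, so online scoring only runs a small pairwise head. The `StudentScorer` module and `embed_dim` below are illustrative placeholders.

```python
import torch
import torch.nn as nn

class StudentScorer(nn.Module):
    """Toy student: graphs are encoded independently, so only this small
    pairwise head runs online. Illustrative placeholder, not the paper's model."""

    def __init__(self, embed_dim=64):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(2 * embed_dim, embed_dim), nn.ReLU(),
            nn.Linear(embed_dim, 1), nn.Sigmoid(),
        )

    def forward(self, h1, h2):
        # h1, h2: individual graph embeddings of shape (batch, embed_dim).
        return self.head(torch.cat([h1, h2], dim=-1)).squeeze(-1)

# Offline: embed the whole graph database once and cache the results.
db_embeddings = torch.randn(1000, 64)   # stand-in for cached GNN outputs
query = torch.randn(1, 64)              # stand-in for the query embedding

# Online: scoring against the full database needs no GNN forward pass.
scores = StudentScorer()(query.expand(1000, -1), db_embeddings)
print(scores.shape)  # torch.Size([1000])
```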

Dataset

We use the standard dataloader, i.e., 'GEDDataset', provided directly in PyG; the download links for the datasets are given below, and a minimal loading sketch follows the list.

AIDS700nef

LINUX

ALKANE

IMDBMulti
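
For reference, a minimal loading sketch using PyG's built-in `GEDDataset`; the local root path `GSC_datasets/AIDS700nef` follows the file structure below and is otherwise arbitrary:

```python
from torch_geometric.datasets import GEDDataset

# Downloads AIDS700nef on first use; train=False yields the test split.
train_set = GEDDataset('GSC_datasets/AIDS700nef', name='AIDS700nef', train=True)
test_set = GEDDataset('GSC_datasets/AIDS700nef', name='AIDS700nef', train=False)

g1, g2 = train_set[0], train_set[1]
# Ground-truth GED values are stored as dataset-level matrices, indexed by
# each graph's `i` attribute; `norm_ged` is the size-normalized variant.
print(float(train_set.ged[g1.i, g2.i]), float(train_set.norm_ged[g1.i, g2.i]))
```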

The code takes pairs of graphs for training from an input folder where each pair of graphs is stored as a JSON file. Pairs of graphs used for testing are also stored as JSON files. Every node ID and node label has to be indexed from 0. Keys of dictionaries are stored as strings in order to make JSON serialization possible.

Every JSON file has the following key-value structure:

{"graph_1": [[0, 1], [1, 2], [2, 3], [3, 4]],
 "graph_2":  [[0, 1], [1, 2], [1, 3], [3, 4], [2, 4]],
 "labels_1": [2, 2, 2, 2],
 "labels_2": [2, 3, 2, 2, 2],
 "ged": 1}

The **graph_1** and **graph_2** keys have edge-list values which describe the connectivity structure. Similarly, the **labels_1** and **labels_2** keys have labels for each node, stored as lists whose positions correspond to node identifiers. The **ged** key has an integer value which is the raw graph edit distance for the pair of graphs.
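
As a quick illustration, the sketch below parses one such pair file with networkx; the file path is hypothetical, and the GED normalization/similarity mapping follows the common SimGNN convention rather than anything specified by the format itself.

```python
import json
import math
import networkx as nx

def to_graph(edges, labels):
    """Build a networkx graph from an edge list and per-node labels."""
    g = nx.Graph()
    g.add_nodes_from((i, {'label': lab}) for i, lab in enumerate(labels))
    g.add_edges_from(edges)
    return g

# Hypothetical path to a single training pair; adjust to your own layout.
with open('dataset/train/pair_0.json') as f:
    pair = json.load(f)

g1 = to_graph(pair['graph_1'], pair['labels_1'])
g2 = to_graph(pair['graph_2'], pair['labels_2'])

# SimGNN-style convention: normalize raw GED by the mean graph size and
# map it to a similarity score in (0, 1].
norm_ged = pair['ged'] / (0.5 * (g1.number_of_nodes() + g2.number_of_nodes()))
similarity = math.exp(-norm_ged)
print(pair['ged'], round(similarity, 4))
```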

Requirements

The codebase is implemented in Python 3.6.12. The package versions used for development are listed below.

matplotlib        3.3.4
networkx          2.4
numpy             1.19.5
pandas            1.1.2
scikit-learn      0.23.2
scipy             1.4.1
texttable         1.6.3
torch             1.6.0
torch-cluster     1.5.9
torch-geometric   1.7.0
torch-scatter     2.0.6
torch-sparse      0.6.9
tqdm              4.60.0

For the installation of PyTorch Geometric (PyG), please refer to its official tutorial.

File Structure

.
├── README.md
├── LICENSE                            
├── EGSC-T
│   ├── src
│   │    ├── egsc.py 
│   │    ├── layers.py
│   │    ├── main.py
│   │    ├── parser.py        
│   │    └── utils.py                             
│   ├── README.md                      
│   └── train.sh                        
└── GSC_datasets
    ├── AIDS700nef
    ├── ALKANE
    ├── IMDBMulti
    └── LINUX

To Do

- [x] GED Datasets Processing
- [x] Teacher Model Training
- [ ] Student Model Training
- [ ] Knowledge Distillation
- [ ] Online Inference

The remaining implementations are coming soon.

Acknowledgement

We would like to thank the authors of SimGNN and Extended-SimGNN, whose code we used as the basis for this implementation.
