Edge-Augmented Graph Transformer

Overview

PWCPWCPWCPWCPWC

Edge-augmented Graph Transformer

Introduction

This is the official implementation of the Edge-augmented Graph Transformer (EGT) as described in https://arxiv.org/abs/2108.03348, which augments the Transformer architecture with residual edge channels. The resultant architecture can directly process graph-structured data and acheives good results on supervised graph-learning tasks as presented by Dwivedi et al.. It also achieves good performance on the large-scale PCQM4M-LSC (0.1263 MAE on val) dataset. EGT beats convolutional/message-passing graph neural networks on a wide range of supervised tasks and thus demonstrates that convolutional aggregation is not an essential inductive bias for graphs.

Requirements

  • python >= 3.7
  • tensorflow >= 2.1.0
  • h5py >= 2.8.0
  • numpy >= 1.18.4
  • scikit-learn >= 0.22.1

Download the Datasets

For our experiments, we converted the datasets to HDF5 format for the convenience of using them without any specific library. Only the h5py library is required. The datasets can be downloaded from -

Or you can simply run the provided bash scripts download_medium_scale_datasets.sh, download_large_scale_datasets.sh. The default location of the datasets is the datasets directory.

Run Training and Evaluations

You must create a JSON config file containing the configuration of a model, its training and evaluation configs (configurations). The same config file is used to do both training and evaluations.

  • To run training: python run_training.py <config_file.json>
  • To end training (prematurely): python end_training.py <config_file.json>
  • To perform evaluations: python do_evaluations.py <config_file.json>

Config files for the main results presented in the paper are contained in the configs/main directory, whereas configurations for the ablation study are contained in the configs/ablation directory. The paths and names of the files are self-explanatory.

More About Training and Evaluations

Once the training is started a model folder will be created in the models directory, under the specified dataset name. This folder will contain a copy of the input config file, for the convenience of resuming training/evaluation. Also, it will contain a config.json which will contain all configs, including unspecified default values, used for the training. Training will be checkpointed per epoch. In case of any interruption you can resume training by running the run_training.py with the config.json file again.

In case you wish to finalize training midway, just stop training and run end_training.py script with the config.json file to save the model weights.

After training, you can run the do_evaluations.py script with the same config file to perform evaluations. Alongside being printed to stdout, results will be saved in the predictions directory, under the model directory.

Config File

The config file can contain many different configurations, however, the only required configuration is scheme, which specifies the training scheme. If the other configurations are not specified, a default value will be assumed for them. Here are some of the commonly used configurations:

scheme: Used to specify the training scheme. It has a format <dataset_name>.<positional_encoding>. For example: cifar10.svd or zinc.eig. If no encoding is to be used it can be something like pcqm4m.mat. For a full list you can explore the lib/training/schemes directory.

dataset_path: If the datasets are contained in the default location in the datasets directory, this config need not be specified. Otherwise you have to point it towards the <dataset_name>.h5 file.

model_name: Serves as an identifier for the model, also specifies default path of the model directory, weight files etc.

save_path: The training process will create a model directory containing the logs, checkpoints, configs, model summary and predictions/evaluations. By default it creates a folder at models/<dataset_name>/<model_name> but it can be changed via this config.

cache_dir: During first time of training/evaluation the data will be cached to a tensorflow cache format. Default path is data_cache/<dataset_name>/<positional_encoding>. But it can be changed via this config.

distributed: In a multi-gpu setting you can set it to True, for distributed training.

batch_size: Batch size.

num_epochs: Maximum Number of epochs.

initial_lr: Initial learning rate. In case of warmup it is the maximum learning rate.

rlr_factor: Reduce LR on plateau factor. Setting it to a value >= 1.0 turns off Reduce LR.

rlr_patience: Reduce LR patience, i.e. the number of epochs after which LR is reduced if validation loss doesn't improve.

min_lr_factor: The factor by which the minimum LR is smaller, of the initial LR. Default is 0.01.

model_height: The number of layers L.

model_width: The dimensionality of the node channels d_h.

edge_width: The dimensionality of the edge channels d_e.

num_heads: The number of attention heads. Default is 8.

ffn_multiplier: FFN multiplier for both channels. Default is 2.0 .

virtual_nodes: number of virtual nodes. 0 (default) would result in global average pooling being used instead of virtual nodes.

upto_hop: Clipping value of the input distance matrix. A value of 1 (default) would result in adjacency matrix being used as input structural matrix.

mlp_layers: Dimensionality of the final MLP layers, specified as a list of factors with respect to d_h. Default is [0.5, 0.25].

gate_attention: Set this to False to get the ungated EGT variant (EGT-U).

dropout: Dropout rate for both channels. Default is 0.

edge_dropout: If specified, applies a different dropout rate to the edge channels.

edge_channel_type: Used to create ablated variants of EGT. A value of "residual" (default) implies pure/full EGT. "constrained" implies EGT-constrained. "bias" implies EGT-simple.

warmup_steps: If specified, performs a linear learning rate warmup for the specified number of gradient update steps.

total_steps: If specified, performs a cosine annealing after warmup, so that the model is trained for the specified number of steps.

[For SVD-based encodings]:

use_svd: Turning this off (False) would result in no positional encoding being used.

sel_svd_features: Rank of the SVD encodings r.

random_neg: Augment SVD encodings by random negation.

[For Eigenvectors encodings]:

use_eig: Turning this off (False) would result in no positional encoding being used.

sel_eig_features: Number of eigen vectors.

[For Distance prediction Objective (DO)]:

distance_target: Predict distance up to the specified hop, nu.

distance_loss: Factor by which to multiply the distance prediction loss, kappa.

Creation of the HDF5 Datasets from Scratch

We included two Jupyter notebooks to demonstrate how the HDF5 datasets are created

  • For the medium scale datasets view create_hdf_benchmarking_datasets.ipynb. You will need pytorch, ogb==1.1.1 and dgl==0.4.2 libraries to run the notebook. The notebook is also runnable on Google Colaboratory.
  • For the large scale pcqm4m dataset view create_hdf_pcqm4m.ipynb. You will need pytorch, ogb>=1.3.0 and rdkit>=2019.03.1 to run the notebook.

Python Environment

The Anaconda environment in which our experiments were conducted is specified in the environment.yml file.

Citation

Please cite the following paper if you find the code useful:

@article{hussain2021edge,
  title={Edge-augmented Graph Transformers: Global Self-attention is Enough for Graphs},
  author={Hussain, Md Shamim and Zaki, Mohammed J and Subramanian, Dharmashankar},
  journal={arXiv preprint arXiv:2108.03348},
  year={2021}
}
Owner
Md Shamim Hussain
Md Shamim Hussain is a Ph.D. student in Computer Science at Rensselaer Polytechnic Institute, NY. He got his B.Sc. and M.Sc. in EEE from BUET, Dhaka.
Md Shamim Hussain
Model for recasing and repunctuating ASR transcripts

Recasing and punctuation model based on Bert Benoit Favre 2021 This system converts a sequence of lowercase tokens without punctuation to a sequence o

Benoit Favre 88 Dec 29, 2022
PUA Programming Language written in Python.

pua-lang PUA Programming Language written in Python. Installation git clone https://github.com/zhaoyang97/pua-lang.git cd pua-lang pip install . Try

zy 4 Feb 19, 2022
Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context

Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context This repository contains the code in both PyTorch and TensorFlow for our paper

Zhilin Yang 3.3k Dec 28, 2022
Implementation of TTS with combination of Tacotron2 and HiFi-GAN

Tacotron2-HiFiGAN-master Implementation of TTS with combination of Tacotron2 and HiFi-GAN for Mandarin TTS. Inference In order to inference, we need t

SunLu Z 7 Nov 11, 2022
We have built a Voice based Personal Assistant for people to access files hands free in their device using natural language processing.

Voice Based Personal Assistant We have built a Voice based Personal Assistant for people to access files hands free in their device using natural lang

Rushabh 2 Nov 13, 2021
Negative sampling for solving the unlabeled entity problem in NER. ICLR-2021 paper: Empirical Analysis of Unlabeled Entity Problem in Named Entity Recognition.

Negative Sampling for NER Unlabeled entity problem is prevalent in many NER scenarios (e.g., weakly supervised NER). Our paper in ICLR-2021 proposes u

Yangming Li 128 Dec 29, 2022
RoNER is a Named Entity Recognition model based on a pre-trained BERT transformer model trained on RONECv2

RoNER RoNER is a Named Entity Recognition model based on a pre-trained BERT transformer model trained on RONECv2. It is meant to be an easy to use, hi

Stefan Dumitrescu 9 Nov 07, 2022
BERT has a Mouth, and It Must Speak: BERT as a Markov Random Field Language Model

BERT has a Mouth, and It Must Speak: BERT as a Markov Random Field Language Model

303 Dec 17, 2022
TweebankNLP - Pre-trained Tweet NLP Pipeline (NER, tokenization, lemmatization, POS tagging, dependency parsing) + Models + Tweebank-NER

TweebankNLP This repo contains the new Tweebank-NER dataset and off-the-shelf Twitter-Stanza pipeline for state-of-the-art Tweet NLP, as described in

Laboratory for Social Machines 84 Dec 20, 2022
:mag: Transformers at scale for question answering & neural search. Using NLP via a modular Retriever-Reader-Pipeline. Supporting DPR, Elasticsearch, HuggingFace's Modelhub...

Haystack is an end-to-end framework that enables you to build powerful and production-ready pipelines for different search use cases. Whether you want

deepset 6.4k Jan 09, 2023
Unsupervised text tokenizer for Neural Network-based text generation.

SentencePiece SentencePiece is an unsupervised text tokenizer and detokenizer mainly for Neural Network-based text generation systems where the vocabu

Google 6.4k Jan 01, 2023
The training code for the 4th place model at MDX 2021 leaderboard A.

The training code for the 4th place model at MDX 2021 leaderboard A.

Chin-Yun Yu 32 Dec 18, 2022
The source code of HeCo

HeCo This repo is for source code of KDD 2021 paper "Self-supervised Heterogeneous Graph Neural Network with Co-contrastive Learning". Paper Link: htt

Nian Liu 106 Dec 27, 2022
This repository has a implementations of data augmentation for NLP for Japanese.

daaja This repository has a implementations of data augmentation for NLP for Japanese: EDA: Easy Data Augmentation Techniques for Boosting Performance

Koga Kobayashi 60 Nov 11, 2022
Text-Based zombie apocalyptic decision-making game in Python

Inspiration We shared university first year game coursework.[to gauge previous experience and start brainstorming] Adapted a particular nuclear fallou

Amin Sabbagh 2 Feb 17, 2022
Voice Assistant inspired by Google Assistant, Cortana, Alexa, Siri, ...

author: @shival_gupta VoiceAI This program is an example of a simple virtual assitant It will listen to you and do accordingly It will begin with wish

Shival Gupta 1 Jan 06, 2022
Stanford CoreNLP provides a set of natural language analysis tools written in Java

Stanford CoreNLP Stanford CoreNLP provides a set of natural language analysis tools written in Java. It can take raw human language text input and giv

Stanford NLP 8.8k Jan 07, 2023
뉴스 도메인 질의응답 시스템 (21-1학기 졸업 프로젝트)

뉴스 도메인 질의응답 시스템 본 프로젝트는 뉴스기사에 대한 질의응답 서비스 를 제공하기 위해서 진행한 프로젝트입니다. 약 3개월간 ( 21. 03 ~ 21. 05 ) 진행하였으며 Transformer 아키텍쳐 기반의 Encoder를 사용하여 한국어 질의응답 데이터셋으로

TaegyeongEo 4 Jul 08, 2022
DAGAN - Dual Attention GANs for Semantic Image Synthesis

Contents Semantic Image Synthesis with DAGAN Installation Dataset Preparation Generating Images Using Pretrained Model Train and Test New Models Evalu

Hao Tang 104 Oct 08, 2022
Text to speech converter with GUI made in Python.

Text-to-speech-with-GUI Text to speech converter with GUI made in Python. To run this download the zip file and run the main file or clone this repo.

SidTheMiner 1 Nov 15, 2021