README

Code for the paper Asymptotics of L2 Regularized Network Embeddings.

Requirements

Requires StellarGraph 1.2.1, TensorFlow 2.6.0, scikit-learn 0.24.1, and tqdm, along with the dependencies of those packages.

Code

To run node classification or link prediction experiments, run

python -m code.train_embed [[args]]

python -m code.train_embed_link [[args]]

from the command line, respectively, where [[args]] stands for the command line arguments of each script. Note that the scripts expect to be run from the parent directory of the code folder; you will need to change the import statements in the associated Python files if you move them. The -h command line argument displays the arguments (with descriptions) of each of the two scripts.
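As a rough illustration of how the two entry points might be driven from a Python script (for example, to sweep over regularization weights), the sketch below assembles the command line for `python -m code.train_embed`. The helper `build_command` is hypothetical and not part of this repository. It assumes that boolean flags such as `--norm-reg` are passed without a value and that list-valued options such as `--train-split` take space-separated values; both assumptions should be checked against the `-h` output.

```python
import sys


def build_command(module, **options):
    """Build an argument list for `python -m <module>` (hypothetical helper).

    Keyword names are turned into long flags by replacing "_" with "-",
    e.g. reg_weight -> --reg-weight.
    """
    cmd = [sys.executable, "-m", module]
    for name, value in options.items():
        flag = "--" + name.replace("_", "-")
        if value is True:
            # Assumption: boolean options are bare switches.
            cmd.append(flag)
        elif value is False or value is None:
            # Leave the option out so the script falls back to its default.
            continue
        elif isinstance(value, (list, tuple)):
            # Assumption: list options take space-separated values.
            cmd.append(flag)
            cmd.extend(str(v) for v in value)
        else:
            cmd.extend([flag, str(value)])
    return cmd


cmd = build_command(
    "code.train_embed",
    dataset="Cora",
    method="node2vec",
    reg_weight=0.1,
    norm_reg=True,
)
print(" ".join(cmd[1:]))

# To launch the run, call this from the parent directory of the code folder:
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when a value contains spaces.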

train_embed.py arguments

| short | long | default | help |
| --- | --- | --- | --- |
| `-h` | `--help` | | Show this help message and exit. |
| | `--dataset` | `Cora` | Dataset to perform training on. Available options: Cora, CiteSeer, PubMedDiabetes. |
| | `--emb-size` | `128` | Embedding dimension. Defaults to 128. |
| | `--reg-weight` | `0.0` | Weight to use for L2 regularization. If norm_reg is True, then reg_weight/num_of_nodes is used instead. |
| | `--norm-reg` | | Boolean for whether to normalize the L2 regularization weight by the number of nodes in the graph. Defaults to false. |
| | `--method` | `node2vec` | Algorithm to use for training. Available options: node2vec, GraphSAGE, GCN, DGI. |
| | `--verbose` | `1` | Level of verbosity. Defaults to 1. |
| | `--epochs` | `5` | Number of epochs through the dataset to be used for training. |
| | `--optimizer` | `Adam` | Optimization algorithm to use for training. |
| | `--learning-rate` | `0.001` | Learning rate to use for optimization. |
| | `--batch-size` | `64` | Batch size used for training. |
| | `--train-split` | `[0.01, 0.025, 0.05]` | Fraction(s) to use for the training split when using the learned embeddings for downstream classification tasks. |
| | `--train-split-num` | `25` | Number of random training/test splits to use for evaluating performance. |
| | `--output-fname` | `None` | If not None, saves the hyperparameters and testing results to a .json file with filename given by the argument. |
| | `--node2vec-p` | `1.0` | Hyperparameter governing probability of returning to source node. |
| | `--node2vec-q` | `1.0` | Hyperparameter governing probability of moving to a node away from the source node. |
| | `--node2vec-walk-number` | `50` | Number of walks used to generate a sample for node2vec. |
| | `--node2vec-walk-length` | `5` | Walk length to use for node2vec. |
| | `--dgi-sampler` | `fullbatch` | Specifies either a fullbatch or a minibatch sampling scheme for DGI. |
| | `--gcn-activation` | `['relu']` | Determines the activations of each layer within a GCN. Defaults to a single layer with relu activation. |
| | `--graphSAGE-aggregator` | `mean` | Specifies the aggregation rule used in GraphSAGE. Defaults to mean pooling. |
| | `--graphSAGE-nbhd-sizes` | `[10, 5]` | Specify multiple neighbourhood sizes for sampling in GraphSAGE. Defaults to [10, 5]. |
| | `--tensorboard` | | If toggled, saves Tensorboard logs for debugging purposes. |
| | `--visualize-embeds` | `None` | If specified with a directory, saves an image of a TSNE 2D projection of the learned embeddings at the specified directory. |
| | `--save-spectrum` | `None` | If specified, saves the spectrum of the learned embeddings output by the algorithm. |
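The interaction between `--reg-weight` and `--norm-reg` described above can be sketched as follows. The function names `effective_reg_weight` and `l2_penalty` are hypothetical and used only for illustration. The exact form of the penalty in the training code is not stated here, so the plain sum of squared embedding entries below is an assumption.

```python
def effective_reg_weight(reg_weight, num_nodes, norm_reg=False):
    """Return the L2 weight after optional normalization by graph size.

    Mirrors the help text: with norm_reg set, reg_weight / num_of_nodes
    is used in place of reg_weight.
    """
    if norm_reg:
        return reg_weight / num_nodes
    return reg_weight


def l2_penalty(embeddings, weight):
    """Assumed penalty form: weight * sum of squared embedding entries."""
    return weight * sum(x * x for row in embeddings for x in row)


# Toy example: a hypothetical graph with 1000 nodes and two 2-d embeddings.
weight = effective_reg_weight(0.5, num_nodes=1000, norm_reg=True)
toy_embeddings = [[1.0, 2.0], [3.0, 0.0]]
print(weight, l2_penalty(toy_embeddings, weight))
```

With normalization switched on, the same `--reg-weight` value gives a weaker per-parameter penalty on larger graphs, which matters when comparing runs across datasets of different sizes.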

train_embed_link.py arguments

| short | long | default | help |
| --- | --- | --- | --- |
| `-h` | `--help` | | Show this help message and exit. |
| | `--dataset` | `Cora` | Dataset to perform training on. Available options: Cora, CiteSeer, PubMedDiabetes. |
| | `--emb-size` | `128` | Embedding dimension. Defaults to 128. |
| | `--reg-weight` | `0.0` | Weight to use for L2 regularization. If norm_reg is True, then reg_weight/num_of_nodes is used instead. |
| | `--norm-reg` | | Boolean for whether to normalize the L2 regularization weight by the number of nodes in the graph. Defaults to false. |
| | `--method` | `node2vec` | Algorithm to use for training. Available options: node2vec, GraphSAGE, GCN, DGI. |
| | `--verbose` | `1` | Level of verbosity. Defaults to 1. |
| | `--epochs` | `5` | Number of epochs through the dataset to be used for training. |
| | `--optimizer` | `Adam` | Optimization algorithm to use for training. |
| | `--learning-rate` | `0.001` | Learning rate to use for optimization. |
| | `--batch-size` | `64` | Batch size used for training. |
| | `--test-split` | `0.1` | Fraction of the edge/non-edge set to be used for testing. |
| | `--output-fname` | `None` | If not None, saves the hyperparameters and testing results to a .json file with filename given by the argument. |
| | `--node2vec-p` | `1.0` | Hyperparameter governing probability of returning to source node. |
| | `--node2vec-q` | `1.0` | Hyperparameter governing probability of moving to a node away from the source node. |
| | `--node2vec-walk-number` | `50` | Number of walks used to generate a sample for node2vec. |
| | `--node2vec-walk-length` | `5` | Walk length to use for node2vec. |
| | `--gcn-activation` | `['relu']` | Specifies layers in terms of their output activation (either relu or linear), with the number of arguments determining the number of layers in the GCN. Defaults to a single layer with relu activation. |
| | `--graphSAGE-aggregator` | `mean` | Specifies the aggregation rule used in GraphSAGE. Defaults to mean pooling. |
| | `--graphSAGE-nbhd-sizes` | `[10, 5]` | Specify multiple neighbourhood sizes for sampling in GraphSAGE. |
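To make the role of `--test-split` concrete, the sketch below holds out a fraction of the edges of a small graph and samples an equal number of non-edges as negative test examples. This is a pure-Python illustration of the general idea, not the splitting routine used by `train_embed_link.py`. The function `split_edges`, its arguments, and the choice of one negative per held-out edge are all assumptions.

```python
import random


def split_edges(edges, num_nodes, test_split=0.1, seed=0):
    """Hold out a fraction of edges and sample matching non-edges.

    Returns (train_edges, test_positive_edges, test_negative_pairs) for an
    undirected graph on nodes 0..num_nodes-1. Hypothetical helper.
    """
    rng = random.Random(seed)
    # Store each undirected edge with its smaller endpoint first.
    edges = [(min(u, v), max(u, v)) for u, v in edges]
    edge_set = set(edges)
    n_test = int(round(test_split * len(edges)))

    shuffled = edges[:]
    rng.shuffle(shuffled)
    test_pos = shuffled[:n_test]
    train_edges = shuffled[n_test:]

    # Rejection-sample node pairs that are not edges of the full graph.
    # Assumes the graph is sparse enough that non-edges are plentiful.
    test_neg = set()
    while len(test_neg) < n_test:
        u, v = rng.sample(range(num_nodes), 2)
        pair = (min(u, v), max(u, v))
        if pair not in edge_set:
            test_neg.add(pair)
    return train_edges, test_pos, sorted(test_neg)


# Toy example: a ring on 20 nodes, with 10% of edges held out.
ring = [(i, (i + 1) % 20) for i in range(20)]
train, pos, neg = split_edges(ring, num_nodes=20, test_split=0.1)
print(len(train), len(pos), len(neg))
```

Embeddings would then be trained on the graph containing only `train`, and link prediction scored on how well held-out edges are separated from the sampled non-edges.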


Owner

Andrew Davison
