WSDM 2022 Large-scale Temporal Graph Link Prediction - Baseline and Initial Test Set

WSDM Cup Website link

Link to this challenge

This branch offers:

- An initial test set with a small number of test examples for each dataset, together with their labels in the exist column. Note that this test set serves development purposes only, so:
  - The intermediate and final datasets will not contain the exist column.
  - This is not the intermediate dataset we will use for ranking solutions.
- A simple baseline that trains on both datasets.

Download links to initial test set: Dataset A Dataset B

Baseline description

The baseline is only a minimal working example for both datasets, and it is certainly not optimal. You are encouraged to tweak it or propose your own solutions from scratch!

In summary, the baseline is an RGCN-like GNN model trained on the entire graph. Event timestamps are encoded by decomposing each 10-digit decimal integer into a 10-dimensional vector, with each element representing one digit. We train the model as a binary classifier using a negative-sampling-like strategy. Given a ground-truth event (s, d, r, t) with source node s, destination node d, event type r, and timestamp t, we perturb t to obtain a new value t'. We label the quadruplet (s, d, r, t') with 1 if t' is larger than t, and 0 otherwise. The model is thus essentially trained to predict p(t < t' | s, d, r), i.e. the probability that an edge of type r from source s to destination d exists before timestamp t'.
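The timestamp encoding and the labeling rule described above can be sketched in a few lines of Python. This is only an illustration, not the code used in base_pipeline.py: the function names, the most-significant-digit-first ordering, and the way t is perturbed (a uniform random shift of up to max_shift seconds in either direction) are all assumptions made for this example.

```python
import random
from typing import List, Tuple


def encode_timestamp(ts: int, num_digits: int = 10) -> List[int]:
    """Split a timestamp into its decimal digits, most significant first.

    The digit order is an assumption; the description only says that each
    element of the vector represents one digit.
    """
    if ts < 0 or ts >= 10 ** num_digits:
        raise ValueError(f"timestamp must fit in {num_digits} decimal digits")
    return [int(ch) for ch in str(ts).zfill(num_digits)]


def make_training_example(
    s: int, d: int, r: int, t: int, max_shift: int = 86400, rng=random
) -> Tuple[Tuple[int, int, int, int], int]:
    """Perturb t to get t' and label (s, d, r, t') with 1 if t' > t, else 0.

    The perturbation scheme (hypothetical): shift t by 1..max_shift seconds,
    forward or backward with equal probability.
    """
    shift = rng.randint(1, max_shift)
    direction = rng.choice((-1, 1))
    t_new = t + direction * shift
    label = 1 if t_new > t else 0
    return (s, d, r, t_new), label


if __name__ == "__main__":
    print(encode_timestamp(1640995200))
    print(make_training_example(s=0, d=1, r=2, t=1640995200))
```

With this labeling, a shift forward in time yields a positive example and a shift backward yields a negative one, which matches the reading of the model output as p(t < t' | s, d, r).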

Baseline usage

To use the baseline you need to install DGL.

You also need at least 64 GB of CPU memory. A GPU is not required.

Convert the CSV files to DGL graph objects.

python csv2DGLgraph.py --dataset [A or B]

Training.

python base_pipeline.py --dataset [A or B]

Performance on Initial Test Set

The baseline achieves an AUC of 0.511 on Dataset A and 0.510 on Dataset B.
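To score your own predictions on the initial test set, you can compute AUC from the labels in the exist column and your predicted probabilities. The sketch below is a plain-Python, rank-based AUC (equivalent to the Mann-Whitney U statistic, with tied scores sharing an average rank). The function name is hypothetical, and loading the test CSV is left out because the column layout is not described here.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Rank-based AUC: probability that a positive outranks a negative."""
    n = len(labels)
    if n != len(scores):
        raise ValueError("labels and scores must have the same length")

    # Sort indices by score and assign 1-based ranks, averaging over ties.
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    n_pos = sum(1 for y in labels if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative")

    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


if __name__ == "__main__":
    # Toy labels (as in the exist column) and made-up predicted probabilities.
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

An AUC of 0.5 corresponds to random ranking, so the baseline scores above leave substantial room for improvement.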


Owner

Deep Graph Library
