TigerLily: Finding drug interactions in silico with the Graph.

Overview

Drug Interaction Prediction with Tigerlily

Documentation | Example Notebook | YouTube Video | Project Report

Tigerlily is a TigerGraph-based system designed to solve the drug interaction prediction task. In this machine learning task, we want to predict whether two drugs have an adverse interaction. Our framework solves this highly relevant real-world problem with graph mining techniques in the following steps:


(A) Creating and populating a Graph

As a first step, the basic Tigerlily tools are imported, and we load the example dataset that integrates the DrugBank DDI and BioSNAP datasets. We create a PersonalizedPageRankMachine and connect to the host that stores the Graph. The machine should be configured with the appropriate user credentials and connection details; a secret can be obtained in TigerGraph Graph Studio. We install the default Personalized PageRank query and upload the edges of the example dataset used in our demonstrations. This graph has drug and protein nodes with drug-protein and protein-protein interactions. Our goal is to predict the drug-drug interactions.

from tigerlily.dataset import ExampleDataset
from tigerlily.embedding import EmbeddingMachine
from tigerlily.operator import hadamard_operator
from tigerlily.pagerank import PersonalizedPageRankMachine

dataset = ExampleDataset()

# Read the edge list and the drug pair target labels of the example dataset.
edges = dataset.read_edges()
target = dataset.read_target()

# Credentials and connection details for the TigerGraph instance.
machine = PersonalizedPageRankMachine(host="host_name",
                                      graphname="graph_name",
                                      username="username_value",
                                      secret="secret_value",
                                      password="password_value")

machine.connect()
machine.install_query()

# Create a new graph and upload the drug-protein and protein-protein edges.
machine.upload_graph(new_graph=True, edges=edges)
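
After uploading, it is worth sanity-checking that the graph was populated. Below is a minimal sketch, assuming that machine.connection exposes the underlying pyTigerGraph connection (it is used this way in step (B)) and that the vertex types are named drug and protein:

# Hypothetical sanity check: count the uploaded vertices by type.
drug_count = machine.connection.getVertexCount("drug")
protein_count = machine.connection.getVertexCount("protein")

print(f"Number of drug vertices: {drug_count}")
print(f"Number of protein vertices: {protein_count}")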

(B) Computing the Approximate Personalized PageRank vectors

We are only interested in describing the neighbourhood of drug nodes in the biological graph. Because of this, we only retrieve the neighbourhoods of the drugs: for each drug we retrieve the top-k closest neighbors according to the Personalized PageRank scores. We are going to learn the drug embeddings from these scores.

drug_node_ids = machine.connection.getVertices("drug")

pagerank_scores = machine.get_personalized_pagerank(drug_node_ids)
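
To build intuition for what these scores capture, the toy snippet below computes Personalized PageRank around a single source node with networkx and keeps the top-k neighbors. This is only an illustration of the idea, not how Tigerlily computes the scores (that happens inside TigerGraph via the installed query):

# Illustrative only: Personalized PageRank around one source node on a toy graph.
import networkx as nx

toy_graph = nx.karate_club_graph()
source_node = 0
top_k = 5

# Personalization mass concentrated on the source node.
scores = nx.pagerank(toy_graph, alpha=0.85, personalization={source_node: 1.0})

# The source node itself ranks first; the rest are its closest neighbors.
closest_neighbors = sorted(scores.items(), key=lambda item: item[1], reverse=True)[:top_k]
print(closest_neighbors)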

(C) Learning the Drug Embeddings and Drug Pair Feature Generation

We create an embedding machine that generates the drug node representations. The embedding machine instance has a random seed, a dimensions hyperparameter (the number of factors), and a maximal iteration count for the factorization. An embedding is learned from the Personalized PageRank scores, and from the drug embeddings we create drug pair features with the operator function.

embedding_machine = EmbeddingMachine(seed=42,
                                     dimensions=32,
                                     max_iter=100)

embedding = embedding_machine.fit(pagerank_scores)

drug_pair_features = embedding_machine.create_features(target, hadamard_operator)
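
The Hadamard operator used above combines the embedding vectors of the two drugs in a pair by element-wise multiplication. A minimal numpy sketch of the idea (a stand-in for illustration, not Tigerlily's own implementation):

import numpy as np

def hadamard_pair_feature(left_embedding, right_embedding):
    # Element-wise product of the two drug embedding vectors.
    return np.multiply(left_embedding, right_embedding)

# Two toy 32-dimensional drug embeddings.
drug_a = np.random.normal(size=32)
drug_b = np.random.normal(size=32)

pair_feature = hadamard_pair_feature(drug_a, drug_b)
print(pair_feature.shape)  # (32,)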

(D) Predicting Drug Interactions and Inference

We load a gradient boosting-based classifier, an evaluation metric for binary classification, and a function to create train-test splits. We create a train and a test portion of the drug pairs, using 80% of the pairs for training. A gradient boosted tree model is trained and scored on the test set: we compute an AUROC score on the test portion of the dataset and print it.

from lightgbm import LGBMClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Use 80% of the drug pairs for training and the rest for testing.
X_train, X_test, y_train, y_test = train_test_split(drug_pair_features,
                                                    target,
                                                    train_size=0.8,
                                                    random_state=42)

model = LGBMClassifier(learning_rate=0.01,
                       n_estimators=100)

# Train the gradient boosted tree model and score the held-out pairs.
model.fit(X_train, y_train["label"])

predicted_label = model.predict_proba(X_test)

auroc_score_value = roc_auc_score(y_test["label"], predicted_label[:, 1])

print(f'AUROC score: {auroc_score_value:.4f}')
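
Beyond the aggregate AUROC, the predicted probabilities can be used to prioritize candidate pairs directly. The sketch below ranks the held-out test pairs by predicted interaction probability and only reuses the arrays computed above:

import numpy as np

# Rank the held-out drug pairs by predicted interaction probability (descending).
interaction_probabilities = predicted_label[:, 1]
ranking = np.argsort(-interaction_probabilities)

# Show the five pairs the model considers most likely to interact.
for pair_index in ranking[:5]:
    print(pair_index, interaction_probabilities[pair_index])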

Head over to the documentation to find out more about installation and a full API reference. For a quick start, check out the example notebook. If you notice anything unexpected, please open an issue.


Citing

If you find Tigerlily useful in your research, please consider adding the following citation:

@misc{tigerlily2022,
  author = {Benedek Rozemberczki},
  title = {TigerLily: Finding drug interactions in silico with the Graph},
  year = {2022},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/benedekrozemberczki/tigerlily}},
}

Installation

To install tigerlily, simply run:

pip install tigerlily

Running tests

To run the test suite, execute:

$ tox -e py

License


Credit

The TigerLily logo and the high-level machine learning workflow image are based on icons from the Noun Project. Benedek Rozemberczki has a yearly subscription to the Noun Project that allows the customization and commercial use of the icons.
