Implementation of ProteinBERT in Pytorch

Overview

ProteinBERT - Pytorch (wip)

Implementation of ProteinBERT in Pytorch.

Original Repository

Install

$ pip install protein-bert-pytorch

Usage

import torch
from protein_bert_pytorch import ProteinBERT

model = ProteinBERT(
    num_tokens = 21,
    num_annotation = 8943,
    dim = 512,
    dim_global = 256,
    depth = 6,
    narrow_conv_kernel = 9,
    wide_conv_kernel = 9,
    wide_conv_dilation = 5,
    attn_heads = 8,
    attn_dim_head = 64
)

seq = torch.randint(0, 21, (2, 2048))
mask = torch.ones(2, 2048).bool()
annotation = torch.randint(0, 1, (2, 8943)).float()

seq_logits, annotation_logits = model(seq, annotation, mask = mask) # (2, 2048, 21), (2, 8943)

Citations

@article {Brandes2021.05.24.445464,
    author      = {Brandes, Nadav and Ofer, Dan and Peleg, Yam and Rappoport, Nadav and Linial, Michal},
    title       = {ProteinBERT: A universal deep-learning model of protein sequence and function},
    year        = {2021},
    doi         = {10.1101/2021.05.24.445464},
    publisher   = {Cold Spring Harbor Laboratory},
    URL         = {https://www.biorxiv.org/content/early/2021/05/25/2021.05.24.445464},
    eprint      = {https://www.biorxiv.org/content/early/2021/05/25/2021.05.24.445464.full.pdf},
    journal     = {bioRxiv}
}
You might also like...
A PyTorch Extension: Tools for easy mixed precision and distributed training in Pytorch

This repository holds NVIDIA-maintained utilities to streamline mixed precision and distributed training in Pytorch. Some of the code here will be included in upstream Pytorch eventually. The intention of Apex is to make up-to-date utilities available to users as quickly as possible.

Objective of the repository is to learn and build machine learning models using Pytorch. 30DaysofML Using Pytorch
Objective of the repository is to learn and build machine learning models using Pytorch. 30DaysofML Using Pytorch

30 Days Of Machine Learning Using Pytorch Objective of the repository is to learn and build machine learning models using Pytorch. List of Algorithms

Pretrained SOTA Deep Learning models, callbacks and more for research and production with PyTorch Lightning and PyTorch
Pretrained SOTA Deep Learning models, callbacks and more for research and production with PyTorch Lightning and PyTorch

Pretrained SOTA Deep Learning models, callbacks and more for research and production with PyTorch Lightning and PyTorch

Amazon Forest Computer Vision: Satellite Image tagging code using PyTorch / Keras with lots of PyTorch tricks
Amazon Forest Computer Vision: Satellite Image tagging code using PyTorch / Keras with lots of PyTorch tricks

Amazon Forest Computer Vision Satellite Image tagging code using PyTorch / Keras Here is a sample of images we had to work with Source: https://www.ka

The Incredible PyTorch: a curated list of tutorials, papers, projects, communities and more relating to PyTorch.
The Incredible PyTorch: a curated list of tutorials, papers, projects, communities and more relating to PyTorch.

This is a curated list of tutorials, projects, libraries, videos, papers, books and anything related to the incredible PyTorch. Feel free to make a pu

Amazon Forest Computer Vision: Satellite Image tagging code using PyTorch / Keras with lots of PyTorch tricks
Amazon Forest Computer Vision: Satellite Image tagging code using PyTorch / Keras with lots of PyTorch tricks

Amazon Forest Computer Vision Satellite Image tagging code using PyTorch / Keras Here is a sample of images we had to work with Source: https://www.ka

A bunch of random PyTorch models using PyTorch's C++ frontend
A bunch of random PyTorch models using PyTorch's C++ frontend

PyTorch Deep Learning Models using the C++ frontend Gettting started Clone the repo 1. https://github.com/mrdvince/pytorchcpp 2. cd fashionmnist or

PyTorch Autoencoders - Implementing a Variational Autoencoder (VAE) Series in Pytorch.

PyTorch Autoencoders Implementing a Variational Autoencoder (VAE) Series in Pytorch. Inspired by this repository Model List check model paper conferen

PyTorch-LIT is the Lite Inference Toolkit (LIT) for PyTorch which focuses on easy and fast inference of large models on end-devices.

PyTorch-LIT PyTorch-LIT is the Lite Inference Toolkit (LIT) for PyTorch which focuses on easy and fast inference of large models on end-devices. With

Comments
  • bugFix: x and y not on the same device when Learner is trained on GPU

    bugFix: x and y not on the same device when Learner is trained on GPU

    When

    seq        = torch.randint(0, 21, (2, 2048)).cuda()
    annotation = torch.randint(0, 1, (2, 8943)).float().cuda()
    mask       = torch.ones(2, 2048).bool().cuda()
    
    learner.cuda()
    
    loss = learner(seq, annotation, mask = mask) # (2, 2048, 21), (2, 8943)
    
    

    OUTPUT

    ---------------------------------------------------------------------------
    RuntimeError                              Traceback (most recent call last)
    <ipython-input-2-60892e498570> in <module>
          4 learner.cuda()
          5 
    ----> 6 loss = learner(seq, annotation, mask = mask) # (2, 2048, 21), (2, 8943)
    
    ~/data/.conda/envs/torch/lib/python3.8/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
        887             result = self._slow_forward(*input, **kwargs)
        888         else:
    --> 889             result = self.forward(*input, **kwargs)
        890         for hook in itertools.chain(
        891                 _global_forward_hooks.values(),
    
    /mnt/5280b/wwang/proteinbert/protein_bert_pytorch.py in forward(self, seq, annotation, mask)
        365 
        366         for token_id in self.exclude_token_ids:
    --> 367             random_replace_token_prob_mask = random_replace_token_prob_mask & (random_tokens != token_id)  # make sure you never substitute a token with an excluded token type (pad, start, end)
        368 
        369         # noise sequence
    
    RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
    
    opened by wilmerwang 0
  • How to use this bert version to use the pretrianed model?

    How to use this bert version to use the pretrianed model?

    Hi guys, thanks for great work. I'm trying to use this pytorch version protein-bert to use the pre-trained model 'ftp://ftp.cs.huji.ac.il/users/nadavb/protein_bert/epoch_92400_sample_23500000.pkl', but have no clues at all. Could you please give some suggestions? Thank you so much!

    opened by Y-H-Joe 1
Owner
Phil Wang
Working with Attention
Phil Wang
[AI6101] Introduction to AI & AI Ethics is a core course of MSAI, SCSE, NTU, Singapore

[AI6101] Introduction to AI & AI Ethics is a core course of MSAI, SCSE, NTU, Singapore. The repository corresponds to the AI6101 of Semester 1, AY2021-2022, starting from 08/2021. The instructors of

AccSrd 1 Sep 22, 2022
Stroke-predictions-ml-model - Machine learning model to predict individuals chances of having a stroke

stroke-predictions-ml-model machine learning model to predict individuals chance

Alex Volchek 1 Jan 03, 2022
Learning Off-Policy with Online Planning, CoRL 2021

LOOP: Learning Off-Policy with Online Planning Accepted in Conference of Robot Learning (CoRL) 2021. Harshit Sikchi, Wenxuan Zhou, David Held Paper In

Harshit Sikchi 24 Nov 22, 2022
This is the replication package for paper submission: Towards Training Reproducible Deep Learning Models.

This is the replication package for paper submission: Towards Training Reproducible Deep Learning Models.

0 Feb 02, 2022
Code to reproduce experiments in the paper "Explainability Requires Interactivity".

Explainability Requires Interactivity This repository contains the code to train all custom models used in the paper Explainability Requires Interacti

Digital Health & Machine Learning 5 Apr 07, 2022
Repository for benchmarking graph neural networks

Benchmarking Graph Neural Networks Updates Nov 2, 2020 Project based on DGL 0.4.2. See the relevant dependencies defined in the environment yml files

NTU Graph Deep Learning Lab 2k Jan 03, 2023
Data, model training, and evaluation code for "PubTables-1M: Towards a universal dataset and metrics for training and evaluating table extraction models".

PubTables-1M This repository contains training and evaluation code for the paper "PubTables-1M: Towards a universal dataset and metrics for training a

Microsoft 365 Jan 04, 2023
Cancer Drug Response Prediction via a Hybrid Graph Convolutional Network

DeepCDR Cancer Drug Response Prediction via a Hybrid Graph Convolutional Network This work has been accepted to ECCB2020 and was also published in the

Qiao Liu 50 Dec 18, 2022
aka "Bayesian Methods for Hackers": An introduction to Bayesian methods + probabilistic programming with a computation/understanding-first, mathematics-second point of view. All in pure Python ;)

Bayesian Methods for Hackers Using Python and PyMC The Bayesian method is the natural approach to inference, yet it is hidden from readers behind chap

Cameron Davidson-Pilon 25.1k Jan 02, 2023
A quick recipe to learn all about Transformers

Transformers have accelerated the development of new techniques and models for natural language processing (NLP) tasks.

DAIR.AI 772 Dec 31, 2022
Pocsploit is a lightweight, flexible and novel open source poc verification framework

Pocsploit is a lightweight, flexible and novel open source poc verification framework

cckuailong 208 Dec 24, 2022
Python scripts performing class agnostic object localization using the Object Localization Network model in ONNX.

ONNX Object Localization Network Python scripts performing class agnostic object localization using the Object Localization Network model in ONNX. Ori

Ibai Gorordo 15 Oct 14, 2022
Simulating Sycamore quantum circuits classically using tensor network algorithm.

Simulating the Sycamore quantum supremacy circuit This repo contains data we have obtained in simulating the Sycamore quantum supremacy circuits with

Feng Pan 46 Nov 17, 2022
[CVPR 2021] Scan2Cap: Context-aware Dense Captioning in RGB-D Scans

Scan2Cap: Context-aware Dense Captioning in RGB-D Scans Introduction We introduce the task of dense captioning in 3D scans from commodity RGB-D sensor

Dave Z. Chen 79 Nov 07, 2022
1st-in-MICCAI2020-CPM - Combined Radiology and Pathology Classification

Combined Radiology and Pathology Classification MICCAI 2020 Combined Radiology a

22 Dec 08, 2022
Back to Event Basics: SSL of Image Reconstruction for Event Cameras

Back to Event Basics: SSL of Image Reconstruction for Event Cameras Minimal code for Back to Event Basics: Self-Supervised Learning of Image Reconstru

TU Delft 42 Dec 26, 2022
For holding anime-related object classification and detection models

Animesion An end-to-end framework for anime-related object classification, detection, segmentation, and other models. Update: 01/22/2020. Due to time-

Edwin Arkel Rios 72 Nov 30, 2022
Dynamic Graph Event Detection

DyGED Dynamic Graph Event Detection Get Started pip install -r requirements.txt TODO Paper link to arxiv, and how to cite. Twitter Weather dataset tra

Mert Koşan 3 May 09, 2022
Protect against subdomain takeover

domain-protect scans Amazon Route53 across an AWS Organization for domain records vulnerable to takeover deploy to security audit account scan your en

OVO Technology 0 Nov 17, 2022
Medical Image Segmentation using Squeeze-and-Expansion Transformers

Medical Image Segmentation using Squeeze-and-Expansion Transformers Introduction This repository contains the code of the IJCAI'2021 paper 'Medical Im

askerlee 172 Dec 20, 2022