Implementation of ProteinBERT in Pytorch

Last update: Dec 25, 2022

Overview

ProteinBERT - Pytorch (wip)

An implementation of ProteinBERT, a deep-learning model of protein sequence and function, in PyTorch.

Install

$ pip install protein-bert-pytorch

Usage

import torch
from protein_bert_pytorch import ProteinBERT

model = ProteinBERT(
    num_tokens = 21,
    num_annotation = 8943,
    dim = 512,
    dim_global = 256,
    depth = 6,
    narrow_conv_kernel = 9,
    wide_conv_kernel = 9,
    wide_conv_dilation = 5,
    attn_heads = 8,
    attn_dim_head = 64
)

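seq = torch.randint(0, 21, (2, 2048))                # random token ids in [0, 21)
mask = torch.ones(2, 2048).bool()                    # True at every position
annotation = torch.randint(0, 2, (2, 8943)).float()  # random binary annotations; the upper bound is exclusive, so (0, 1) would give all zeros

seq_logits, annotation_logits = model(seq, annotation, mask = mask) # (2, 2048, 21), (2, 8943)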
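
The usage above feeds random token ids and an all-ones mask. For real sequences of different lengths, the ids need padding and the mask should mark which positions are real. The sketch below shows one way to build both as plain Python lists. The vocabulary (index 0 for padding, 1 to 20 for the standard amino acids) and the helper `encode_batch` are assumptions made for illustration and are not part of the package; the sketch also assumes that `True` in the mask marks a real residue, as the all-ones mask above suggests.

```python
# Hypothetical vocabulary: 0 is padding, 1-20 are the 20 standard amino acids.
# This mapping is an assumption for illustration, not the package's tokenizer.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAD_ID = 0
TOKEN_IDS = {aa: i + 1 for i, aa in enumerate(AMINO_ACIDS)}


def encode_batch(sequences, max_len):
    """Return (ids, mask) as nested lists, padded or truncated to max_len."""
    ids, mask = [], []
    for s in sequences:
        tokens = [TOKEN_IDS[aa] for aa in s[:max_len]]
        pad = max_len - len(tokens)
        ids.append(tokens + [PAD_ID] * pad)                 # pad on the right
        mask.append([True] * len(tokens) + [False] * pad)   # True = real residue
    return ids, mask


ids, mask = encode_batch(["MKT", "ACDEFG"], max_len=8)
```

Wrapping the results as `torch.tensor(ids)` and `torch.tensor(mask)` gives an integer tensor and a boolean tensor of shape `(batch, max_len)`, matching the `seq` and `mask` arguments used above.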

Citations

@article {Brandes2021.05.24.445464,
    author      = {Brandes, Nadav and Ofer, Dan and Peleg, Yam and Rappoport, Nadav and Linial, Michal},
    title       = {ProteinBERT: A universal deep-learning model of protein sequence and function},
    year        = {2021},
    doi         = {10.1101/2021.05.24.445464},
    publisher   = {Cold Spring Harbor Laboratory},
    URL         = {https://www.biorxiv.org/content/early/2021/05/25/2021.05.24.445464},
    eprint      = {https://www.biorxiv.org/content/early/2021/05/25/2021.05.24.445464.full.pdf},
    journal     = {bioRxiv}
}

Comments

bugFix: x and y not on the same device when Learner is trained on GPU

When the inputs and the learner are moved to the GPU:

seq        = torch.randint(0, 21, (2, 2048)).cuda()
annotation = torch.randint(0, 1, (2, 8943)).float().cuda()
mask       = torch.ones(2, 2048).bool().cuda()

learner.cuda()

loss = learner(seq, annotation, mask = mask) # (2, 2048, 21), (2, 8943)

the forward call fails with:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-2-60892e498570> in <module>
      4 learner.cuda()
      5 
----> 6 loss = learner(seq, annotation, mask = mask) # (2, 2048, 21), (2, 8943)

~/data/.conda/envs/torch/lib/python3.8/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    887             result = self._slow_forward(*input, **kwargs)
    888         else:
--> 889             result = self.forward(*input, **kwargs)
    890         for hook in itertools.chain(
    891                 _global_forward_hooks.values(),

/mnt/5280b/wwang/proteinbert/protein_bert_pytorch.py in forward(self, seq, annotation, mask)
    365 
    366         for token_id in self.exclude_token_ids:
--> 367             random_replace_token_prob_mask = random_replace_token_prob_mask & (random_tokens != token_id)  # make sure you never substitute a token with an excluded token type (pad, start, end)
    368 
    369         # noise sequence

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

opened by wilmerwang 0
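
The traceback points at a comparison between a mask and `random_tokens`, which suggests that one of these tensors was created on the CPU while the input lives on `cuda:0`. A common way to avoid this kind of mismatch is to create every auxiliary tensor on the input's device. The sketch below illustrates that pattern; the helper `random_replace_mask` and its `prob` parameter are hypothetical and are not the package's code.

```python
import torch


def random_replace_mask(seq, exclude_token_ids, num_tokens, prob=0.05):
    """Hypothetical helper: pick positions to overwrite with random tokens."""
    # Create auxiliary tensors on the same device as the input, so that the
    # comparisons below never mix CPU and GPU tensors.
    device = seq.device
    random_tokens = torch.randint(0, num_tokens, seq.shape, device=device)
    replace_mask = torch.rand(seq.shape, device=device) < prob

    for token_id in exclude_token_ids:
        # never substitute a token with an excluded token type (pad, start, end)
        replace_mask = replace_mask & (random_tokens != token_id)

    return random_tokens, replace_mask


# Falls back to the CPU when no GPU is available.
device = "cuda" if torch.cuda.is_available() else "cpu"
seq = torch.randint(0, 21, (2, 2048), device=device)
random_tokens, replace_mask = random_replace_mask(
    seq, exclude_token_ids=(0,), num_tokens=21
)
```

Because both tensors inherit `seq.device`, the same code runs unchanged on the CPU and on the GPU.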

How can I use this BERT version with the pretrained model?

Hi, thanks for the great work. I'm trying to load the pre-trained model 'ftp://ftp.cs.huji.ac.il/users/nadavb/protein_bert/epoch_92400_sample_23500000.pkl' with this PyTorch version of ProteinBERT, but I don't know where to start. Could you please give some suggestions? Thank you so much!

opened by Y-H-Joe 1

Releases (0.1.0)

0.1.0 (Aug 10, 2021)

0.0.11 (Aug 6, 2021)

0.0.10 (Jun 11, 2021)

0.0.9 (Jun 11, 2021)

0.0.8 (Jun 11, 2021)

0.0.7 (Jun 10, 2021)

0.0.6 (May 29, 2021)

0.0.5 (May 28, 2021)

0.0.4 (May 28, 2021)

0.0.3a (May 28, 2021)

0.0.2 (May 28, 2021)

0.0.1 (May 28, 2021)

Owner

Phil Wang
