Implementation and replication of ProGen, Language Modeling for Protein Generation, in Jax

Last update: Dec 01, 2022

You might also like...

Implementation of the GVP-Transformer, which was used in the paper "Learning inverse folding from millions of predicted structures" for de novo protein design alongside Alphafold2

GVP Transformer (wip) Implementation of the GVP-Transformer, which was used in the paper Learning inverse folding from millions of predicted structure

19 May 6, 2022

A pytorch-version implementation codes of paper: "BSN++: Complementary Boundary Regressor with Scale-Balanced Relation Modeling for Temporal Action Proposal Generation"

BSN++: Complementary Boundary Regressor with Scale-Balanced Relation Modeling for Temporal Action Proposal Generation A pytorch-version implementation

11 Oct 8, 2022

🤗 Transformers: State-of-the-art Natural Language Processing for Pytorch, TensorFlow, and JAX.

English | 简体中文 | 繁體中文 State-of-the-art Natural Language Processing for Jax, PyTorch and TensorFlow 🤗 Transformers provides thousands of pretrained mo

77.2k Jan 2, 2023

Predicting lncRNA–protein interactions based on graph autoencoders and collaborative training

Predicting lncRNA–protein interactions based on graph autoencoders and collaborative training Code for our paper "Predicting lncRNA–protein interactio

1 Nov 29, 2022

Codes and models for the paper "Learning Unknown from Correlations: Graph Neural Network for Inter-novel-protein Interaction Prediction".

GNN_PPI Codes and models for the paper "Learning Unknown from Correlations: Graph Neural Network for Inter-novel-protein Interaction Prediction". Lear

2 Dec 14, 2022

RITA is a family of autoregressive protein models, developed by LightOn in collaboration with the OATML group at Oxford and the Debora Marks Lab at Harvard.

RITA: a Study on Scaling Up Generative Protein Sequence Models RITA is a family of autoregressive protein models, developed by a collaboration of Ligh

69 Dec 22, 2022

Generative Models for Graph-Based Protein Design

Graph-Based Protein Design This repo contains code for Generative Models for Graph-Based Protein Design by John Ingraham, Vikas Garg, Regina Barzilay

159 Dec 15, 2022

7th place solution of Human Protein Atlas - Single Cell Classification on Kaggle

kaggle-hpa-2021-7th-place-solution Code for 7th place solution of Human Protein Atlas - Single Cell Classification on Kaggle. A description of the met

8 Jul 9, 2021

Graph-based community clustering approach to extract protein domains from a predicted aligned error matrix

Using a predicted aligned error matrix corresponding to an AlphaFold2 model , returns a series of lists of residue indices, where each list corresponds to a set of residues clustering together into a pseudo-rigid domain.

24 Nov 23, 2022

Comments

protein bert uniref90 dataset
(discussed in discord)

After running the first step (create_uniref_db) of https://github.com/nadavbra/protein_bert, I got a 24GB file, "uniref_proteins_and_annotations.db". It looks like it could be useful for generating sequences for this project, so I am sharing the links here:

https://gitlab.com/rom1504/uniref data

Colab to get the db and run a few queries: https://colab.research.google.com/drive/1BGYEBDmD0yToLNou2T-t-QbJV5wCtIBz#scrollTo=21U3PpCp-pxr

There are 135301051 records in the db, in a table that looks like this:

CREATE TABLE "protein_annotations" (
    "index" INTEGER,
    "tax_id" REAL,
    "uniprot_name" TEXT,
    "go_annotations" TEXT,
    "flat_go_annotations" TEXT,
    "n_go_annotations" INTEGER,
    "complete_go_annotation_indices" TEXT,
    "n_complete_go_annotations" INTEGER
);
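For anyone who wants to poke at this table without the Colab, here is a minimal sketch using Python's standard `sqlite3` module. It builds a tiny in-memory stand-in with the same schema and three of the sample rows, then runs two count queries. The helper names (`open_demo_db`, `count_records`, `count_annotated`) are made up for illustration; for the real file you would presumably pass the path to "uniref_proteins_and_annotations.db" to `sqlite3.connect` instead.

```python
import sqlite3

# Schema as reported in the issue.
SCHEMA = """
CREATE TABLE "protein_annotations" (
    "index" INTEGER,
    "tax_id" REAL,
    "uniprot_name" TEXT,
    "go_annotations" TEXT,
    "flat_go_annotations" TEXT,
    "n_go_annotations" INTEGER,
    "complete_go_annotation_indices" TEXT,
    "n_complete_go_annotations" INTEGER
);
"""

EMPTY_GO = ('{"GO Molecular Function": [], "GO Biological Process": [], '
            '"GO Cellular Component": []}')


def open_demo_db():
    """Hypothetical helper: in-memory stand-in holding three of the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    rows = [
        (1, 648755.0, "UPI0016133188", EMPTY_GO, "[]", 0, "[]", 0),
        (4, 72004.0, "A0A6B0RPA5_9CETA",
         '{"GO Molecular Function": ["GO:0005524", "GO:0004672"], '
         '"GO Biological Process": [], "GO Cellular Component": []}',
         '["GO:0004672", "GO:0005524"]', 2, "[3561, 4205]", 2),
        (5, 375764.0, "A0A672ZWI7_9TELE", EMPTY_GO, "[]", 0, "[]", 0),
    ]
    conn.executemany(
        "INSERT INTO protein_annotations VALUES (?, ?, ?, ?, ?, ?, ?, ?)", rows
    )
    return conn


def count_records(conn):
    """Total number of rows in the table."""
    return conn.execute("SELECT COUNT(*) FROM protein_annotations").fetchone()[0]


def count_annotated(conn):
    """Number of rows that carry at least one GO annotation."""
    query = "SELECT COUNT(*) FROM protein_annotations WHERE n_go_annotations > 0"
    return conn.execute(query).fetchone()[0]


if __name__ == "__main__":
    conn = open_demo_db()
    print(count_records(conn), count_annotated(conn))
```

On the full 24GB file the same queries should work unchanged, though a `COUNT(*)` over that many rows may take a while.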

Sample rows look like this:

| | index | tax_id | uniprot_name | go_annotations | flat_go_annotations | n_go_annotations | complete_go_annotation_indices | n_complete_go_annotations |
|---:|---:|---:|:---|:---|:---|---:|:---|---:|
| 0 | 0 | 1.57204e+06 | A0A5A9P0L4_9TELE | {"GO Molecular Function": ["GO:0003755", "GO:0005524", "GO:0004672", "GO:0005509"], "GO Biological Process": [], "GO Cellular Component": []} | ["GO:0003755", "GO:0004672", "GO:0005509", "GO:0005524"] | 4 | [2761, 3561, 4193, 4205] | 4 |
| 1 | 1 | 648755 | UPI0016133188 | {"GO Molecular Function": [], "GO Biological Process": [], "GO Cellular Component": []} | [] | 0 | [] | 0 |
| 2 | 2 | 1.93059e+06 | A0A410P257_9BACT | {"GO Molecular Function": [], "GO Biological Process": [], "GO Cellular Component": []} | [] | 0 | [] | 0 |
| 3 | 3 | 519421 | UPI0019403D63 | {"GO Molecular Function": [], "GO Biological Process": [], "GO Cellular Component": []} | [] | 0 | [] | 0 |
| 4 | 4 | 72004 | A0A6B0RPA5_9CETA | {"GO Molecular Function": ["GO:0005524", "GO:0004672"], "GO Biological Process": [], "GO Cellular Component": []} | ["GO:0004672", "GO:0005524"] | 2 | [3561, 4205] | 2 |
| 5 | 5 | 375764 | A0A672ZWI7_9TELE | {"GO Molecular Function": [], "GO Biological Process": [], "GO Cellular Component": []} | [] | 0 | [] | 0 |
| 6 | 6 | 1.41558e+06 | A0A6P7YNV3_9AMPH | {"GO Molecular Function": ["GO:0005524", "GO:0004672"], "GO Biological Process": [], "GO Cellular Component": ["GO:0005886"]} | ["GO:0004672", "GO:0005524", "GO:0005886"] | 3 | [3561, 4205, 4526] | 3 |
| 7 | 7 | 240159 | A0A4U5TZD8_COLLU | {"GO Molecular Function": ["GO:0005524", "GO:0004672"], "GO Biological Process": [], "GO Cellular Component": ["GO:0016021", "GO:0005886"]} | ["GO:0004672", "GO:0005524", "GO:0005886", "GO:0016021"] | 4 | [3561, 4205, 4526, 10019] | 4 |
| 8 | 8 | 146911 | UPI00074FFD9C | {"GO Molecular Function": [], "GO Biological Process": [], "GO Cellular Component": []} | [] | 0 | [] | 0 |
| 9 | 9 | 260995 | A0A6P8RG40_GEOSA | {"GO Molecular Function": ["GO:0005524", "GO:0004672"], "GO Biological Process": [], "GO Cellular Component": ["GO:0005886"]} | ["GO:0004672", "GO:0005524", "GO:0005886"] | 3 | [3561, 4205, 4526] | 3 |
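Since the three annotation columns are stored as TEXT, they need decoding before use. A rough sketch of that step, assuming the values are always valid JSON as in the sample rows (the function names `decode_annotations` and `flatten_sorted` are hypothetical):

```python
import json


def decode_annotations(go_annotations, flat_go_annotations,
                       complete_go_annotation_indices):
    """Decode the three JSON-encoded TEXT columns of one row into Python objects."""
    return {
        "by_category": json.loads(go_annotations),        # dict: category -> GO ids
        "flat": json.loads(flat_go_annotations),          # list of GO ids
        "indices": json.loads(complete_go_annotation_indices),  # list of ints
    }


def flatten_sorted(by_category):
    """Sorted union of the GO ids across all categories.

    In the sample rows this matches flat_go_annotations; whether that holds
    for every record in the db is an assumption.
    """
    return sorted({term for terms in by_category.values() for term in terms})


if __name__ == "__main__":
    # Values copied from the sample row with uniprot_name A0A6B0RPA5_9CETA.
    row = decode_annotations(
        '{"GO Molecular Function": ["GO:0005524", "GO:0004672"], '
        '"GO Biological Process": [], "GO Cellular Component": []}',
        '["GO:0004672", "GO:0005524"]',
        "[3561, 4205]",
    )
    print(row["flat"], row["indices"])
    print(flatten_sorted(row["by_category"]) == row["flat"])
```

The per-category dict keeps the Molecular Function / Biological Process / Cellular Component split, while the flat list and the index list are probably the more convenient form if the annotations end up as conditioning tags.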
Opened by rom1504

Releases (0.0.36)

0.0.36 (Aug 16, 2021)
0.0.35 (Aug 9, 2021)
0.0.34 (Jul 7, 2021)
0.0.33 (Jul 6, 2021)
0.0.32 (Jul 6, 2021)
0.0.29 (Jul 4, 2021)
0.0.28a (Jul 4, 2021)
0.0.27 (Jul 4, 2021)
0.0.26 (Jul 4, 2021)
0.0.25 (Jul 4, 2021)
0.0.24 (Jul 4, 2021)
0.0.23 (Jul 4, 2021)
0.0.21 (Jul 3, 2021)
0.0.20 (Jul 3, 2021)
0.0.19 (Jul 3, 2021)
0.0.18 (Jul 2, 2021)
0.0.17 (Jul 2, 2021)
0.0.16 (Jul 2, 2021)
0.0.14 (Jul 2, 2021)
0.0.12 (Jul 2, 2021)
0.0.11 (Jul 1, 2021)
0.0.10 (Jul 1, 2021)
0.0.9a (Jun 30, 2021)
0.0.8 (Jun 30, 2021)
0.0.7 (Jun 29, 2021)
0.0.6 (Jun 29, 2021)
0.0.5a (Jun 28, 2021)
0.0.5 (Jun 25, 2021)
0.0.3a (Jun 25, 2021)
0.0.2a (Jun 25, 2021)

Owner

Phil Wang

Working with Attention
