Graph-based community clustering approach to extract protein domains from a predicted aligned error matrix

Last update: Nov 23, 2022

pae_to_domains

Overview

Given a predicted aligned error (PAE) matrix corresponding to an AlphaFold2 model (e.g. as downloaded from https://alphafold.ebi.ac.uk/), pae_to_domains returns a series of lists of residue indices, where each list corresponds to a set of residues that cluster together into a pseudo-rigid domain.
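
To make the input concrete, here is a minimal sketch of turning PAE JSON text into a square matrix using only the standard library. It assumes the file holds a `predicted_aligned_error` key containing a list of rows, optionally wrapped in a single-element list; that layout is an assumption and other layouts are not handled. The function name `pae_matrix_from_json` is hypothetical and is not taken from pae_to_domains.py.

```python
import json


def pae_matrix_from_json(text):
    """Parse PAE JSON text into a square list-of-lists matrix.

    Assumed layout: {"predicted_aligned_error": [[...], ...]}, possibly
    wrapped in a one-element list. Anything else raises an error.
    """
    data = json.loads(text)
    # Unwrap the record if the file stores it inside a list.
    record = data[0] if isinstance(data, list) else data
    matrix = [[float(v) for v in row] for row in record["predicted_aligned_error"]]
    n = len(matrix)
    if any(len(row) != n for row in matrix):
        raise ValueError("PAE matrix is not square")
    return matrix


# Tiny made-up example, not real AlphaFold output.
example = '[{"predicted_aligned_error": [[0, 2, 9], [3, 0, 8], [9, 7, 0]]}]'
pae = pae_matrix_from_json(example)
print(len(pae), pae[0][1])
```

Note that the matrix is generally not symmetric: `pae[i][j]` and `pae[j][i]` can differ, so any code that builds an undirected graph from it has to decide how to combine the two values.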

Requirements

Python >= 3.7
NetworkX >= 2.6.2

Known Issues

Due to an internal implementation issue in NetworkX (Issue #4992), some combinations of PAE matrix and resolution can lead to a KeyError. Fixes are being explored, and the problem will hopefully be resolved in the next NetworkX release.

Usage

While primarily intended as a code snippet to be incorporated into larger projects, this can also be called from the command line. At its simplest:

python pae_to_domains.py pae_file.json

... will yield a .csv file with each line providing the indices for one residue cluster. Full help for the command-line version:

positional arguments:
  pae_file              Name of the PAE JSON file.

optional arguments:
  -h, --help            show this help message and exit
  --output_file OUTPUT_FILE
                        Name of output file (comma-delimited text format).
                        Default: clusters.csv
  --pae_power PAE_POWER
                        Graph edges will be weighted as 1/pae**pae_power.
                        Default: 1.0
  --pae_cutoff PAE_CUTOFF
                        Graph edges will only be created for residue pairs
                        with pae
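
The edge weighting described in the help text can be sketched as follows. This is an illustration under stated assumptions, not the script's own code: the function name `cluster_residues` is hypothetical, the help text above is cut off so the direction of the cutoff test (PAE below the cutoff) is assumed, and `greedy_modularity_communities` is simply one community-detection routine that NetworkX provides with a `resolution` argument. Averaging the two directions of the PAE matrix and flooring the value at 0.1 to avoid division by zero are choices made for this sketch only.

```python
import networkx as nx
from networkx.algorithms import community


def cluster_residues(pae, pae_cutoff, pae_power=1.0, resolution=1.0):
    """Group residue indices into communities from a square PAE matrix."""
    n = len(pae)
    graph = nx.Graph()
    graph.add_nodes_from(range(n))
    for i in range(n):
        for j in range(i + 1, n):
            # PAE is not symmetric; averaging is a simplification made here.
            value = 0.5 * (pae[i][j] + pae[j][i])
            if value < pae_cutoff:  # assumed direction of the cutoff test
                # Weight edges as 1/pae**pae_power; the floor avoids 1/0.
                weight = 1.0 / max(value, 0.1) ** pae_power
                graph.add_edge(i, j, weight=weight)
    found = community.greedy_modularity_communities(
        graph, weight="weight", resolution=resolution
    )
    return [sorted(c) for c in found]


# Made-up 6-residue matrix: two blocks with low PAE inside, high PAE between.
pae = [
    [0.0 if i == j else (1.0 if (i < 3) == (j < 3) else 20.0) for j in range(6)]
    for i in range(6)
]
clusters = cluster_residues(pae, pae_cutoff=5.0)  # cutoff value chosen arbitrarily
for cluster in clusters:
    # One comma-separated line per cluster, mirroring the described output.
    print(",".join(str(index) for index in cluster))
```

Residue pairs above the cutoff get no edge at all, so well-separated domains end up as disconnected parts of the graph; raising `pae_power` makes low-PAE pairs dominate the weights more strongly.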



Example
Using https://alphafold.ebi.ac.uk/entry/Q9HBA0 as an example case, the original page shows the resulting domain assignments at resolution=0.5, resolution=1.0 and resolution=2.0 (figures not available).

Owner

Tristan Croll
