Implementation of Neural Distance Embeddings for Biological Sequences (NeuroSEED) in PyTorch

Last update: Dec 23, 2022

Neural Distance Embeddings for Biological Sequences

Official implementation of Neural Distance Embeddings for Biological Sequences (NeuroSEED) in PyTorch. NeuroSEED is a novel framework to embed biological sequences in geometric vector spaces. A preprint will be published soon.

Overview

The repository is organised into four main folders, one for each of the tasks analysed, plus two supporting folders (util and tests). Each task folder contains the scripts and models used for the task, instructions on how to run them, and the tuned hyperparameters.

edit_distance for the edit distance approximation task
closest_string for the closest string retrieval task
hierarchical_clustering for the hierarchical clustering task, further divided into relaxed and unsupervised for the two approaches explored
multiple_alignment for the multiple sequence alignment task, further divided into guide_tree and steiner_string
util contains a series of utility routines shared between all the tasks
tests contains a wide range of tests for the various components of the repository
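
For readers unfamiliar with the first two tasks listed above, the following is a minimal sketch of the exact quantities they refer to: the edit (Levenshtein) distance between two sequences, and brute-force closest string retrieval built on top of it. It is not code from this repository; the function names edit_distance and closest_string are hypothetical and chosen for illustration only, and the learned embedding models themselves are not shown.

```python
from typing import Sequence


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a two-row dynamic programme."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (char_a != char_b)
            insertion = current[j - 1] + 1
            deletion = previous[j] + 1
            current.append(min(substitution, insertion, deletion))
        previous = current
    return previous[-1]


def closest_string(query: str, references: Sequence[str]) -> str:
    """Hypothetical brute-force baseline: reference with the smallest edit distance."""
    return min(references, key=lambda ref: edit_distance(query, ref))


if __name__ == "__main__":
    print(edit_distance("ACGT", "ACGA"))
    print(closest_string("ACGT", ["TTTT", "ACGA", "GGGGGG"]))
```

The exact computation is quadratic in the sequence length for each pair, which is one reason a cheaper approximation in a vector space can be attractive on large datasets.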

Installation

Create a virtual (or conda) environment and install the dependencies:

python3 -m venv neuroseed
source neuroseed/bin/activate
pip install -r requirements.txt

Then install the mst and unionfind packages used for the hierarchical clustering:

cd hierarchical_clustering/relaxed/mst; python setup.py build_ext --inplace; cd ../../..
cd hierarchical_clustering/relaxed/unionfind; python setup.py build_ext --inplace; cd ../../..

License

MIT

Owner

Gabriele Corso
