Python implementation of MULTIseq barcode alignment using fuzzy string matching and GMM barcode assignment

Last update: Feb 11, 2022

Overview

py-multi-seq

Python implementation of MULTIseq barcode alignment using fuzzy string matching and GMM barcode assignment.

The scripts in this repository are roughly analogous to deMULTIplex, the R package provided for MULTI-seq at https://github.com/chris-mcginnis-ucsf/MULTI-seq. The pipeline loads paired-end reads, fuzzy-matches them against the provided MULTIseq barcode file, and counts the reads mapping to each barcode. Expectation Maximization is then used to fit a Gaussian Mixture Model for each barcode, and each cell is assigned its most likely barcode, no barcode, or multiple barcodes (a doublet).
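The matching and counting step described above can be illustrated with a small sketch. This is not the code from BarcodeFuzzyMatching.py: it swaps in a plain Hamming-distance match (standard library only) in place of fuzzywuzzy scoring, and the function names, the one-mismatch tolerance, and the example sequences are all assumptions made for illustration. It also assumes the cell and sample barcode sequences have already been extracted from the reads.

```python
from collections import Counter


def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def closest(query, whitelist, max_mismatches=1):
    """Return the single whitelist entry within max_mismatches of query.

    Returns None when nothing matches or when the match is ambiguous.
    (Hypothetical helper; a stand-in for fuzzy string scoring.)
    """
    if query in whitelist:
        return query
    hits = [
        w for w in whitelist
        if len(w) == len(query) and hamming(query, w) <= max_mismatches
    ]
    return hits[0] if len(hits) == 1 else None


def count_barcodes(pairs, cell_whitelist, sample_whitelist):
    """Count reads per (cell barcode, sample barcode) pair.

    pairs: iterable of (cell_seq, sample_seq) strings taken from the reads.
    Reads whose cell or sample sequence cannot be matched are dropped.
    """
    counts = Counter()
    for cell_seq, sample_seq in pairs:
        cell = closest(cell_seq, cell_whitelist)
        sample = closest(sample_seq, sample_whitelist)
        if cell is not None and sample is not None:
            counts[(cell, sample)] += 1
    return counts


# Made-up sequences, for illustration only.
cell_whitelist = ["AAACCCAAGT", "TTTGGGTTCA"]
sample_whitelist = ["GGAGAAGA", "CCACAATG"]
pairs = [
    ("AAACCCAAGT", "GGAGAAGA"),  # exact match
    ("AAACCCAAGA", "GGAGAAGA"),  # one mismatch in the cell barcode
    ("TTTGGGTTCA", "CCACAATC"),  # one mismatch in the sample barcode
    ("TTTGGGTTCA", "ACGTACGT"),  # sample barcode matches nothing, dropped
]
counts = count_barcodes(pairs, cell_whitelist, sample_whitelist)
```

A linear scan over the whitelist is fine for a toy example but would be slow for a full cell barcode list; the repository's dependency on sparse_dot_topn suggests it uses a faster matching strategy.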

Installation

Clone this repository. The scripts require Python >= 3.7 and the following packages, which can be installed with:

pip install pandas numpy scipy fuzzywuzzy tqdm sparse_dot_topn scanpy natsort

You will need the cellranger cell barcodes file (barcodes.tsv.gz) before running. To use custom barcodes, it should in principle be possible to modify MultiseqIndices.txt along with the read length parameters.

Usage example for 10X scRNAseq or Multiome + MULTIseq:

python BarcodeFuzzyMatching.py /path/to/this/repo/MultiseqSamplesExample.txt /path/to/this/repo/MultiseqIndices.txt /path/to/sampleMULTIseq_R1.fastq /path/to/cellranger/outs/filtered_feature_bc_matrix/barcodes.tsv.gz /path/to/output/dir/ 16 8 0

python RunDemuxEM.py /path/to/output/dir/ /path/to/cellranger/outs/filtered_feature_bc_matrix/

Running this pipeline outputs a matrix of read counts by barcode, as well as a CSV file listing each cell barcode and its assigned MULTIseq barcode(s).
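To illustrate how a count matrix can be turned into per-cell assignments, here is a minimal sketch of mixture-model calling. It is an assumption-laden toy, not the model in RunDemuxEM.py: it fits a one-dimensional, two-component Gaussian mixture to the log-transformed counts of each barcode by Expectation Maximization, using only the standard library, and calls a barcode positive for a cell when the posterior of the high component exceeds 0.5. The function names, the threshold, and the example counts are hypothetical.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit_two_gaussians(values, n_iter=50, min_var=1e-3):
    """Fit a 1-D two-component Gaussian mixture by EM.

    Returns, for each value, the posterior probability of belonging to
    the component with the higher mean.
    """
    lo, hi = min(values), max(values)
    means = [lo, hi]
    start_var = max((hi - lo) ** 2 / 16.0, min_var)
    variances = [start_var, start_var]
    weights = [0.5, 0.5]
    post = [0.5] * len(values)
    for _ in range(n_iter):
        # E-step: posterior of the high component for every value.
        post = []
        for x in values:
            p_low = weights[0] * normal_pdf(x, means[0], variances[0])
            p_high = weights[1] * normal_pdf(x, means[1], variances[1])
            total = p_low + p_high
            if total > 0:
                post.append(p_high / total)
            else:
                # Both densities underflowed: fall back to the nearest mean.
                post.append(1.0 if abs(x - means[1]) < abs(x - means[0]) else 0.0)
        # M-step: update each component from its responsibilities.
        for k in (0, 1):
            resp = [p if k == 1 else 1.0 - p for p in post]
            n_k = sum(resp)
            if n_k < 1e-9:
                continue
            means[k] = sum(r * x for r, x in zip(resp, values)) / n_k
            var = sum(r * (x - means[k]) ** 2 for r, x in zip(resp, values)) / n_k
            variances[k] = max(var, min_var)
            weights[k] = n_k / len(values)
    if means[1] < means[0]:
        # Components swapped during fitting: flip so "high" stays high.
        post = [1.0 - p for p in post]
    return post


def assign_barcodes(counts, threshold=0.5):
    """Map {cell: {barcode: read_count}} to {cell: [positive barcodes]}.

    An empty list means no barcode, one entry a singlet, two or more a doublet.
    """
    cells = sorted(counts)
    barcodes = sorted({bc for per_cell in counts.values() for bc in per_cell})
    calls = {cell: [] for cell in cells}
    for bc in barcodes:
        values = [math.log1p(counts[cell].get(bc, 0)) for cell in cells]
        for cell, p in zip(cells, fit_two_gaussians(values)):
            if p > threshold:
                calls[cell].append(bc)
    return calls


# Made-up read counts for six cells and two sample barcodes.
example_counts = {
    "c1": {"A": 500, "B": 3},
    "c2": {"A": 450, "B": 5},
    "c3": {"A": 2, "B": 600},
    "c4": {"A": 4, "B": 550},
    "c5": {"A": 480, "B": 520},
    "c6": {"A": 3, "B": 2},
}
calls = assign_barcodes(example_counts)
```

In this toy data the background and signal counts are far apart, so the calls do not depend much on the threshold; real data is noisier, and a library implementation of Gaussian mixtures would usually be preferable to a hand-written EM loop.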

Owner

MT Schmitz
