Cash in on Expressed Barcode Tags (EBTs) from NGS Sequencing Data with Python

Overview

Cashier is a tool developed by Russell Durrett for the analysis and extraction of expressed barcode tags.

This Python implementation offers the same flexibility and simple command-line operation.

Like its predecessor, it is a wrapper for the tools cutadapt, fastx-toolkit, and starcode.

Dependencies

  • cutadapt (sequence extraction)
  • starcode (sequence clustering)
  • fastx-toolkit (Phred score filtering)
  • pear (paired-end read merging)
  • pysam (sam file conversion to fastq)

Recommended Installation Procedure

It's recommended to use conda to install and manage the dependencies for this package.

conda env create -f https://raw.githubusercontent.com/brocklab/pycashier/main/environment.yml # or mamba env create -f ....
conda activate cashierenv
pycashier --help

Alternatively, you may install with pip, though it will be up to you to ensure that all the non-Python dependencies are installed correctly and on the path.

pip install pycashier
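After a pip install, a quick way to see whether the external tools are reachable is to look them up on the path. The following is a minimal sketch, not part of pycashier. The executable names are assumptions: fastq_quality_filter is one of the programs shipped with fastx-toolkit, and the exact set pycashier calls may differ.

```python
import shutil

# Executable names are assumptions; adjust the tuple to match your setup.
REQUIRED_TOOLS = ("cutadapt", "starcode", "pear", "fastq_quality_filter")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools that cannot be found on the current PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All external tools were found on PATH.")
```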

Usage

Pycashier has one required argument: the directory containing the fastq or sam files you wish to process.

conda activate cashierenv
pycashier ./fastqs

For additional parameters see pycashier -h.

As the files are processed, two additional directories will be created: pipeline and outs.

Currently, all intermediary files generated by the program are found in pipeline, while the final processed files are found in the outs directory.

Merging Files

Pycashier can now take paired-end reads and merge them to produce a fastq, which can then be used with pycashier's default barcode extraction.

pycashier ./fastqs -m

Processing Barcodes from 10X bam files

Pycashier can also extract gRNA barcodes along with 10X cell and UMI barcodes.

First, only the unmapped reads are of interest. You can obtain these reads from the cellranger bam output using samtools.

samtools view -f 4 possorted_genome_bam.bam > unmapped.sam
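For reference, the -f 4 filter keeps reads whose SAM FLAG field (the second tab-separated column) has the 0x4 "segment unmapped" bit set. The sketch below reproduces that check in plain Python to show what the filter does. It is illustrative only; the function names are hypothetical and samtools remains the practical choice for real bam files.

```python
UNMAPPED = 0x4  # SAM flag bit: the read itself is unmapped


def is_unmapped(sam_line):
    """True if an alignment line (not a header) has the 0x4 flag bit set."""
    flag = int(sam_line.split("\t")[1])
    return bool(flag & UNMAPPED)


def unmapped_lines(lines):
    """Yield the alignment lines that a 0x4 flag filter would keep."""
    for line in lines:
        if line.startswith("@"):  # header lines carry no flag
            continue
        if is_unmapped(line):
            yield line


# Made-up records: flag 4 and flag 77 (64 + 8 + 4 + 1) are unmapped, flag 0 is mapped.
records = [
    "@HD\tVN:1.6",
    "read1\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tFFFF",
    "read2\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tFFFF",
    "read3\t77\t*\t0\t0\t*\t*\t0\t0\tACGT\tFFFF",
]
kept = [line.split("\t")[0] for line in unmapped_lines(records)]
```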

Then, similar to normal barcode extraction, you can pass a directory of these unmapped sam files to pycashier and extract barcodes. You can still specify extraction parameters, which will be passed to cutadapt as usual.

Note: The default parameters passed to cutadapt are unlinked adapters and a minimum barcode length of 10 bp.

pycashier ./unmapped_sams -sc

When finished, the outs directory will have a .tsv containing the following columns: Illumina Read Info, UMI Barcode, Cell Barcode, gRNA Barcode.
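One way to summarize such a table is to count the distinct UMIs supporting each cell barcode and gRNA barcode pair. This is a minimal sketch under two assumptions that should be checked against a real output file: the columns appear in the order listed above, and whether the file starts with a header row is unknown (hence the has_header switch). The function name and the example rows are hypothetical.

```python
import csv
import io
from collections import defaultdict


def umis_per_cell_barcode(handle, has_header=False):
    """Count distinct UMIs for each (cell barcode, gRNA barcode) pair."""
    reader = csv.reader(handle, delimiter="\t")
    if has_header:
        next(reader, None)  # skip the header row if the file has one
    umis = defaultdict(set)
    for row in reader:
        if len(row) < 4:  # ignore blank or truncated lines
            continue
        _read_info, umi, cell, grna = row[:4]
        umis[(cell, grna)].add(umi)
    return {pair: len(seen) for pair, seen in umis.items()}


# Made-up rows standing in for a real .tsv from the outs directory.
example = io.StringIO(
    "r1\tUMI1\tCELL1\tBC1\n"
    "r2\tUMI1\tCELL1\tBC1\n"
    "r3\tUMI2\tCELL1\tBC1\n"
    "r4\tUMI1\tCELL2\tBC2\n"
)
counts = umis_per_cell_barcode(example)
```

With a real file, pass an open file handle instead of the in-memory example, for instance open(path, newline="").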

Usage notes

Pycashier will NOT overwrite intermediary files. If there is an issue in the process, please delete either the pipeline directory or the requisite intermediary files for the sample you wish to reprocess. Because existing files are left alone, you can place new fastqs within the source directory or a project folder without reprocessing all samples each time.
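Clearing one sample's intermediary files could be sketched as follows. This assumes the files under pipeline keep the sample name as the first period-delimited part of their file name, in line with the naming conventions in this section; the actual layout of pipeline is not documented here, so the search is recursive. The helper names and the demo file names are hypothetical, and the helper only reports files unless dry_run is turned off.

```python
import tempfile
from pathlib import Path


def intermediates_for(sample, pipeline_dir="pipeline"):
    """List files under pipeline_dir whose name begins with '<sample>.'."""
    root = Path(pipeline_dir)
    if not root.is_dir():
        return []
    return sorted(p for p in root.rglob(f"{sample}.*") if p.is_file())


def clear_sample(sample, pipeline_dir="pipeline", dry_run=True):
    """Delete (or, by default, only report) one sample's intermediary files."""
    files = intermediates_for(sample, pipeline_dir)
    if not dry_run:
        for path in files:
            path.unlink()
    return files


# Demo in a throwaway directory so no real data is touched.
with tempfile.TemporaryDirectory() as tmp:
    pipeline = Path(tmp) / "pipeline"
    pipeline.mkdir()
    for name in ("s1.q30.fastq", "s1.barcodes.tsv", "s2.q30.fastq"):
        (pipeline / name).touch()
    would_delete = [p.name for p in clear_sample("s1", pipeline)]
    deleted = [p.name for p in clear_sample("s1", pipeline, dry_run=False)]
    remaining = sorted(p.name for p in pipeline.iterdir())
```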

  • Currently, pycashier expects to find .fastq.gz files when merging and .fastq files when extracting barcodes. This behavior may change in the future.
  • If there are reads from multiple lanes, they should first be concatenated, e.g. with cat sample*R1*.fastq.gz > sample.R1.fastq.gz
  • Naming conventions:
    • Sample names are extracted from files using the first string delimited with a period. Please take this into account when naming sam or fastq files.
    • Each processing step will append information to the input file name to indicate changes, again delimited with periods.
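The sample-name rule above can be illustrated with a short sketch. This is not pycashier's own code, only a reading of the stated convention, and the file names are made up. It also shows the pitfall the convention implies: a period inside the intended sample name cuts the name short.

```python
from pathlib import Path


def sample_name(path):
    """Return the text before the first period in the file name."""
    return Path(path).name.split(".", 1)[0]


names = {
    "fastqs/sampleA.R1.fastq.gz": sample_name("fastqs/sampleA.R1.fastq.gz"),
    "fastqs/sampleA.q30.fastq": sample_name("fastqs/sampleA.q30.fastq"),
    # Pitfall: "rep.1" was meant as part of the name, but only "rep" survives.
    "fastqs/rep.1.fastq": sample_name("fastqs/rep.1.fastq"),
}
```

Using underscores or hyphens instead of periods inside sample names avoids the truncation shown in the last entry.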