A variant caller for the GBA gene using WGS data

Related tags

MiscellaneousGauchian
Overview

Gauchian: WGS-based GBA variant caller

Gauchian is a targeted variant caller for the GBA gene based on a whole-genome sequencing (WGS) BAM file. Gauchian uses a novel method to solve the problems caused by the high sequence similarity with the pseudogene paralog GBAP1 and is able to detect variants accurately in the Exons 9-11 homology region, such as large deletions or duplications between GBA and GBAP1, and GBAP1-like variants in GBA, including p.A495P, p.L483P, p.D448H, c.1263del, RecNciI, RecTL and c.1263del+RecTL. In addition to these challenging variants, Gauchian also calls known pathogenic or likely pathogenic GBA variants classified in ClinVar. Please refer to our preprint for more details about the method.

Running the program

This Python3 program can be run as follows:

python -m gauchian --manifest MANIFEST_FILE \
                   --genome [19/37/38] \
                   --prefix OUTPUT_FILE_PREFIX \
                   --outDir OUTPUT_DIRECTORY \
                   --threads NUMBER_THREADS

The manifest is a text file in which each line should list the absolute path to an input BAM/CRAM file. For CRAM input, it’s suggested to provide the path to the reference fasta file with --reference in the command.

Interpreting the output

The program produces a .tsv file in the directory specified by --outDir. The fields are explained below:

Fields in tsv Explanation
Sample Sample name
is_biallelic_GBAP1-like_variant_exon9-11 Whether the sample is called as biallelic for GBAP1-like variants in exon9-11
is_carrier_GBAP1-like_variant_exon9-11 Whether the sample is called as a carrier for GBAP1-like variants in exon9-11
total_CN Total copy number of GBA+GBAP1
deletion_breakpoint_in_GBA_gene Whether the deletion breakpoint is in GBA gene if a deletion exists
GBAP1-like_variant_exon9-11 GBAP1-like variants called in exon9-11, two alleles separated by /
other_variants Other variants called (non-GBAP1-like variants or variants outside of exon9-11)

A .json file is also produced that contains more information about each sample.

Fields in json Explanation
Coverage_MAD Median absolute deviation of depth, measure of sample quality
Median_depth Sample median depth
deletion_CN CN of the unique region between GBA and GBAP1. This value plus 2 is the total CN
deletion_CN_raw Raw normalized depth of the unique region between GBA and GBAP1
variant_raw_count Supporting reads for each variant
snp_call GBA copy number call at GBA/GBAP1 differentiating sites
snp_raw Raw GBA copy number at GBA/GBAP1 differentiating sites
haplotypes Summary of haplotypes assembled across GBA/GBAP1 differentiating sites in Exon9-11
You might also like...
Data Structures and Algorithms Python - Practice data structures and algorithms in python with few small projects

Data Structures and Algorithms All the essential resources and template code nee

Adansons Base is a data management tool that organizes metadata of unstructured data and creates and organizes datasets.

Adansons Base is a data management tool that organizes metadata of unstructured data and creates and organizes datasets. It makes dataset creation more effective and helps find essential insights from training results and improves AI performance.

Open-source data observability for modern data teams
Open-source data observability for modern data teams

Use cases Monitor your data warehouse in minutes: Data anomalies monitoring as dbt tests Data lineage made simple, reliable, and automated dbt operati

A demo of a data science project using Kedro

iris Overview This is your new Kedro project, which was generated using Kedro 0.17.4. Take a look at the Kedro documentation to get started. Rules and

Data Poisoning based on Adversarial Attacks using Non-Robust Features

Data Poisoning based on Adversarial Attacks using Non-Robust Features Usage python main.py [-h] [--gpu | -g GPU] [--eps |-e EPSILON] [--pert | -p PER

Cisco IOS-XE Operations Program. Shows operational data using restconf and yang
Cisco IOS-XE Operations Program. Shows operational data using restconf and yang

XE-Ops View operational and config data from devices running Cisco IOS-XE software. NoteS The build folder is the latest build. All other files are fo

Run python scripts and pass data between multiple python and node processes using this npm module

Run python scripts and pass data between multiple python and node processes using this npm module. process-communication has a event based architecture for interacting with python data and errors inside nodejs.

ARRU seismic backprojection - Earthquake waveform detection and P/S arrivals picking on continuous data using ARRU phase picker Download and process GOES-16 and GOES-17 data from NOAA's archive on AWS using Python.
Download and process GOES-16 and GOES-17 data from NOAA's archive on AWS using Python.

Download and display GOES-East and GOES-West data GOES-East and GOES-West satellite data are made available on Amazon Web Services through NOAA's Big

Comments
  • UserWarning: multiple_iterators not implemented for CRAM

    UserWarning: multiple_iterators not implemented for CRAM

    When running with .cram file, got the following warnings /gauchian/depth_calling/snp_count.py:131: UserWarning: multiple_iterators not implemented for CRAM ignore_orphan=False /gauchian/depth_calling/haplotype.py:189: UserWarning: multiple_iterators not implemented for CRAM min_base_quality=13

    Will these warnings affect the quality of calls?

    opened by LNGDingj 1
Releases(v1.0.2)
Owner
Illumina
Illumina Open Source Software
Illumina
Block fingerprinting for the beacon chain, for client identification & client diversity metrics

blockprint This is a repository for discussion and development of tools for Ethereum block fingerprinting. The primary aim is to measure beacon chain

Sigma Prime 49 Dec 08, 2022
End-to-End text sumarization, QAs generation using flask.

Help-Me-Read A web application created with Flask + BootStrap + HuggingFace 🤗 to generate summary and question-answer from given input text. It uses

Ankush Kuwar 12 Nov 13, 2022
Morth - Stack Based Programming Language

Morth WARNING! THIS LANGUAGE IS A WORKING PROGRESS. THIS IS JUST A HOBBY PROJECT

Dominik Danner 2 Mar 05, 2022
CarolinaCon CTF Online

CarolinaCon Online CTF CTF challenges from CarolinaCon Online April 23 through April 25, 2021. All challenges from the CTF will eventually be here. Co

49th Security Division 6 May 04, 2022
BasicVSR++ function for VapourSynth

BasicVSR++ BasicVSR++: Improving Video Super-Resolution with Enhanced Propagation and Alignment Ported from https://github.com/open-mmlab/mmediting De

Holy Wu 34 Nov 28, 2022
Modern API wrapper for Genshin Impact built on asyncio and pydantic.

genshin.py Modern API wrapper for Genshin Impact built on asyncio and pydantic.

sadru 212 Jan 06, 2023
:snake: Complete C99 parser in pure Python

pycparser v2.20 Contents 1 Introduction 1.1 What is pycparser? 1.2 What is it good for? 1.3 Which version of C does pycparser support? 1.4 What gramma

Eli Bendersky 2.8k Dec 29, 2022
AMTIO aka All My Tools in One

AMTIO AMTIO aka All My Tools In One. I plan to put a bunch of my tools in this one repo since im too lazy to make one big tool. Installation git clone

osintcat 3 Jul 29, 2021
IPO Checker for NEPSE

IPO Checker Checks more than one account for an IPO. Usage: ipo_checker.py [-h] --file FILE IPO Checker for a list. optional arguments: -h, --help

Sagar Tamang 4 Sep 20, 2022
Sequence clustering and database creation using mmseqs, from local fasta files

Sequence clustering and database creation using mmseqs, from local fasta files

Ana Julia Velez Rueda 3 Oct 27, 2022
msgqywx 使用企业微信的应用消息推送实时信息

msgqywx 使用企业微信的应用消息推送实时信息

Demon Finch 8 Dec 18, 2022
Transform your boring distro into a hacking powerhouse.

Pentizer Transform your boring distro into a hacking powerhouse. Pentizer is a personal project that imports Kali and Parrot repositories in any Debia

Michail Tsimpliarakis 2 Nov 05, 2021
Reload all Blender add-on modules

Reload-Addon This add-on creates a list of the modules that the add-on selected in the drop-down menu contains and reloads them with the keyboard shor

2 Dec 02, 2021
CupScript is a simple programing language made with python

CupScript CupScript is a simple programming language made with python It includes some basic functions, variables, loops, and some other built in func

FUSEN 23 Dec 29, 2022
A tool converting rpk (记乎) to apkg (Anki Package)

RpkConverter This tool is used to convert rpk file to Anki apkg. 如果遇到任何问题,请发起issue,并描述情况。如果转换rpk出现问题,请将文件发到邮箱 ssqyang [AT] outlook.com,我会debug并修复问题。 下

9 Nov 01, 2021
a simple proof system I made to learn math without any mistakes

math_up a simple proof system I made to learn math without any mistakes 0. Short Introduction test yourself, enjoy your math! math_up is an NBG-based,

양현우 5 Jun 04, 2021
Funchacks - Fun module which is a small set of utilities

funchacks 👋 Introduction Funchacks is a fun module that provides a small packag

DenyS 6 Aug 04, 2022
Port of the OpenCascade library to JavaScript / WebAssembly using Emscripten

OpenCascade.js A port of the OpenCascade CAD library to JavaScript and WebAssembly via Emscripten. Explore the docs » Examples · Issues · Discuss Proj

Sebastian Alff 347 Jan 08, 2023
Some Python scripts that fx(hash) users might find useful.

fx_hash_utils Some Python scripts that fx(hash) users might find useful. get_images This script downloads all the static images of the tokens generate

30 Oct 05, 2022
Experiments with Tox plugin system

The project is an attempt to add to the tox some missing out of the box functionality. Basically it is just an extension for the tool that will be loa

Volodymyr Vitvitskyi 30 Nov 26, 2022