A variant caller for the GBA gene using WGS data


Gauchian: WGS-based GBA variant caller

Gauchian is a targeted variant caller for the GBA gene that works from a whole-genome sequencing (WGS) BAM file. Gauchian uses a novel method to resolve the problems caused by the high sequence similarity between GBA and its pseudogene paralog GBAP1, and it accurately detects variants in the exon 9-11 homology region. These include large deletions or duplications between GBA and GBAP1, and GBAP1-like variants in GBA, including p.A495P, p.L483P, p.D448H, c.1263del, RecNciI, RecTL and c.1263del+RecTL. In addition to these challenging variants, Gauchian also calls known pathogenic or likely pathogenic GBA variants classified in ClinVar. Please refer to our preprint for more details about the method.

Running the program

This Python3 program can be run as follows:

python -m gauchian --manifest MANIFEST_FILE \
                   --genome [19/37/38] \
                   --prefix OUTPUT_FILE_PREFIX \
                   --outDir OUTPUT_DIRECTORY \
                   --threads NUMBER_THREADS

The manifest is a text file in which each line lists the absolute path to one input BAM/CRAM file. For CRAM input, we suggest also passing the path to the reference FASTA file with --reference in the command.
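As an illustration of the manifest format described above, the following sketch collects BAM/CRAM files from a directory, writes one absolute path per line, and assembles the command line as an argument list. The helper names `write_manifest` and `build_command` and the directory layout are assumptions made for this example and are not part of Gauchian; only the command-line flags are taken from this README.

```python
from pathlib import Path

# File extensions treated as alignment files (an assumption for this sketch).
ALIGNMENT_SUFFIXES = {".bam", ".cram"}


def write_manifest(input_dir, manifest_path):
    """Write one absolute BAM/CRAM path per line and return the paths written."""
    paths = sorted(
        str(p.resolve())
        for p in Path(input_dir).iterdir()
        if p.is_file() and p.suffix.lower() in ALIGNMENT_SUFFIXES
    )
    Path(manifest_path).write_text("".join(f"{p}\n" for p in paths))
    return paths


def build_command(manifest_path, genome, prefix, out_dir, threads, reference=None):
    """Assemble the Gauchian command as an argument list (it is not executed here)."""
    cmd = [
        "python", "-m", "gauchian",
        "--manifest", str(manifest_path),
        "--genome", str(genome),
        "--prefix", str(prefix),
        "--outDir", str(out_dir),
        "--threads", str(threads),
    ]
    # --reference is the option suggested above for CRAM input.
    if reference is not None:
        cmd += ["--reference", str(reference)]
    return cmd
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` once Gauchian is installed; returning a list rather than a shell string avoids quoting problems with paths that contain spaces.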

Interpreting the output

The program produces a .tsv file in the directory specified by --outDir. The fields are explained below:

| Fields in tsv | Explanation |
| --- | --- |
| Sample | Sample name |
| is_biallelic_GBAP1-like_variant_exon9-11 | Whether the sample is called as biallelic for GBAP1-like variants in exon9-11 |
| is_carrier_GBAP1-like_variant_exon9-11 | Whether the sample is called as a carrier for GBAP1-like variants in exon9-11 |
| total_CN | Total copy number of GBA+GBAP1 |
| deletion_breakpoint_in_GBA_gene | Whether the deletion breakpoint is in GBA gene if a deletion exists |
| GBAP1-like_variant_exon9-11 | GBAP1-like variants called in exon9-11, two alleles separated by / |
| other_variants | Other variants called (non-GBAP1-like variants or variants outside of exon9-11) |
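A minimal sketch of loading the .tsv output for downstream use, assuming the file starts with a header row containing the field names listed above. The function name `read_gauchian_tsv` is hypothetical, and all values are kept as strings because their exact encoding is not specified here.

```python
import csv


def read_gauchian_tsv(tsv_path):
    """Return a dict mapping each Sample name to its row (field name -> string value)."""
    with open(tsv_path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        # Assumes a header row that includes a "Sample" column.
        return {row["Sample"]: row for row in reader}
```

Individual fields can then be looked up by name, for example `read_gauchian_tsv(path)[sample]["total_CN"]`.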

A .json file is also produced that contains more information about each sample.

| Fields in json | Explanation |
| --- | --- |
| Coverage_MAD | Median absolute deviation of depth, measure of sample quality |
| Median_depth | Sample median depth |
| deletion_CN | CN of the unique region between GBA and GBAP1. This value plus 2 is the total CN |
| deletion_CN_raw | Raw normalized depth of the unique region between GBA and GBAP1 |
| variant_raw_count | Supporting reads for each variant |
| snp_call | GBA copy number call at GBA/GBAP1 differentiating sites |
| snp_raw | Raw GBA copy number at GBA/GBAP1 differentiating sites |
| haplotypes | Summary of haplotypes assembled across GBA/GBAP1 differentiating sites in Exon9-11 |
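The relation between deletion_CN and the total CN stated above (total CN = deletion_CN + 2) can be sketched as follows. The JSON layout is an assumption for this example: a top-level object keyed by sample name, with each value holding the fields listed above. The function names are hypothetical, and the real file may be structured differently.

```python
import json


def total_cn_from_deletion_cn(deletion_cn):
    """Apply the rule 'deletion_CN plus 2 is the total CN'; pass through a missing call."""
    if deletion_cn is None:
        return None
    return deletion_cn + 2


def total_cn_per_sample(json_path):
    """Map each sample to its derived total CN (assumes a {sample: {field: value}} layout)."""
    with open(json_path) as handle:
        data = json.load(handle)
    return {
        sample: total_cn_from_deletion_cn(info.get("deletion_CN"))
        for sample, info in data.items()
    }
```

For instance, a sample with a deletion_CN of 2 has a total CN of 4 under this rule.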
Comments
  • UserWarning: multiple_iterators not implemented for CRAM

    When running with a .cram file, I got the following warnings:
    /gauchian/depth_calling/snp_count.py:131: UserWarning: multiple_iterators not implemented for CRAM
      ignore_orphan=False
    /gauchian/depth_calling/haplotype.py:189: UserWarning: multiple_iterators not implemented for CRAM
      min_base_quality=13

    Will these warnings affect the quality of calls?

    opened by LNGDingj 1
Releases(v1.0.2)
Owner
Illumina
Illumina Open Source Software