Snakemake workflow to process and filter long read data from Oxford Nanopore Technologies.

Overview

Nanopore-Workflow

Snakemake workflow to process and filter long read data from Oxford Nanopore Technologies. It is designed to compare whole human genome tumor/normal pairs, but can also run individual samples. Reports and plots are generated for de novo genome assembly, differentially methylated regions, copy number variants, and structural variants. Filtering heuristics typically reduce the reported translocations to the break points. The suggested minimum is 15x to 20x coverage and a median read length of 5 kbp to 6 kbp.
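One way to compare a run against the suggested coverage and read length is to compute both from the FASTQ before starting. The sketch below is not part of the workflow: the function name fastq_stats is hypothetical, it assumes an uncompressed FASTQ with four lines per record, and the genome size is supplied by the caller (roughly 3.1e9 for a human genome).

```shell
# Sketch: estimate coverage and median read length from an uncompressed FASTQ.
# Assumes 4-line FASTQ records; genome_size (in bases) is supplied by the caller.
fastq_stats() {
    local fastq="$1" genome_size="$2"
    # The second line of every 4-line record is the read sequence.
    awk 'NR % 4 == 2 { print length($0) }' "$fastq" \
        | sort -n \
        | awk -v g="$genome_size" '
            { len[NR] = $1; total += $1 }
            END {
                if (NR == 0) { print "no reads"; exit }
                # Median of the sorted lengths (mean of the middle pair if even).
                median = (NR % 2) ? len[(NR + 1) / 2] : (len[NR / 2] + len[NR / 2 + 1]) / 2
                # %d truncates a fractional median to a whole number of bases.
                printf "reads=%d coverage=%.2fx median_length=%d\n", NR, total / g, median
            }'
}

# Example usage (the path is a placeholder):
# fastq_stats /path/to/samples/sample_name/fastq/reads.fastq 3.1e9
```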


Installation instructions

Download the latest code from GitHub:

git clone https://github.com/mike-molnar/nanopore-workflow.git

Before running the workflow, you will need to download the reference genome. The download is not part of the workflow because the workflow is designed to run on a cluster that may not have internet access. You can use a local copy of GRCh38 if you have one, but the chromosomes must be named chr1, chr2, ... , and the reference can only contain the autosomes and sex chromosomes. To download the reference genome and index it, change to the reference directory of the workflow and run the script:

cd /path/to/nanopore-workflow/reference
chmod u+x download_reference.sh
./download_reference.sh
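If you use a local copy of GRCh38 instead, a quick check of the sequence names can catch unplaced contigs or a mitochondrial sequence before the workflow starts. This is a minimal sketch, not part of the workflow: the function name unexpected_contigs is hypothetical, and it assumes an uncompressed FASTA.

```shell
# Sketch: print FASTA sequence names that are not chr1..chr22, chrX, or chrY.
# Empty output means every sequence name matches the expected pattern.
unexpected_contigs() {
    local fasta="$1"
    # Header lines start with '>'; the name is the first word after it.
    grep '^>' "$fasta" \
        | sed -e 's/^>//' -e 's/[[:space:]].*$//' \
        | grep -Ev '^chr([1-9]|1[0-9]|2[0-2]|X|Y)$' || true
}

# Example usage (the file name is a placeholder):
# unexpected_contigs /path/to/nanopore-workflow/reference/my_reference.fa
```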

To run the workflow, copy the Snakefile and config.yaml files to the directory where you want to run it:

cp /path/to/nanopore-workflow/Snakefile /path/to/nanopore-workflow/config.yaml /path/to/samples

Modify the config.yaml file so that it describes the files and directories of your sample(s). The workflow currently expects a single FASTQ file and a single sequencing summary file, both in a folder named fastq inside a folder named after the sample. The config.yaml file provides an example of how to format the initial files and directories before running the workflow.
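The expected layout can be checked with a few lines of shell before launching anything. The sketch below rests on assumptions that do not come from the workflow: FASTQ files end in .fastq or .fastq.gz, the summary file name contains "sequencing_summary", and the function name check_sample_dir is hypothetical. Adjust the patterns to match your own file names.

```shell
# Sketch: verify that <sample>/fastq holds one FASTQ and one sequencing summary.
# Assumed naming (not from the workflow): *.fastq or *.fastq.gz for reads,
# and *sequencing_summary* for the summary file.
check_sample_dir() {
    local sample_dir="$1"
    local fastq_dir="$sample_dir/fastq"
    [ -d "$fastq_dir" ] || { echo "missing directory: $fastq_dir"; return 1; }
    local n_fastq n_summary
    n_fastq=$(find "$fastq_dir" -maxdepth 1 -type f \( -name '*.fastq' -o -name '*.fastq.gz' \) | wc -l | tr -d '[:space:]')
    n_summary=$(find "$fastq_dir" -maxdepth 1 -type f -name '*sequencing_summary*' | wc -l | tr -d '[:space:]')
    if [ "$n_fastq" -eq 1 ] && [ "$n_summary" -eq 1 ]; then
        echo "ok: $sample_dir"
    else
        echo "expected 1 FASTQ and 1 summary in $fastq_dir, found $n_fastq and $n_summary"
        return 1
    fi
}

# Example usage (the path is a placeholder):
# check_sample_dir /path/to/samples/sample_name
```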

To run on a grid engine

There are a few different grid engines, so the exact format may be different for your particular grid engine. To run everything except the de novo assembly on a Univa grid engine:

snakemake --jobs 500 --rerun-incomplete --keep-going --latency-wait 30 --cluster "qsub -cwd -V -o snakemake.output.log -e snakemake.error.log -q queue_name -P project_name -pe smp {threads} -l h_vmem={params.memory_per_thread} -l h_rt={params.run_time} -b y" all_but_assembly

You will have to replace queue_name and project_name with the necessary values to run on your grid.
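To avoid editing the long command by hand, the queue and project can be substituted by a small helper. This is only a sketch: the function name build_cluster_cmd and the values my_queue and my_project are placeholders, and the qsub options are copied unchanged from the command above.

```shell
# Sketch: fill in the queue and project in the qsub string used by --cluster.
# {threads} and {params.*} are left as-is for Snakemake to expand per job.
build_cluster_cmd() {
    local queue="$1" project="$2"
    printf 'qsub -cwd -V -o snakemake.output.log -e snakemake.error.log -q %s -P %s -pe smp {threads} -l h_vmem={params.memory_per_thread} -l h_rt={params.run_time} -b y' "$queue" "$project"
}

# Example usage: a dry run (-n) lists the jobs without submitting them.
# snakemake -n --jobs 500 --cluster "$(build_cluster_cmd my_queue my_project)" all_but_assembly
```

Running with -n first is a cheap way to confirm that config.yaml resolves to the expected jobs before anything reaches the grid.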

Dependencies

There are many dependencies, so it is best to create new Conda environments using the YAML files in the env directory. There are YAML files for the workflow, for Medaka, and for the R packages (R_env). You will also need a separate environment for QUAST if you are going to run the de novo assembly portion of the workflow. Change to the env directory and create the environments with Conda:

cd /path/to/nanopore-workflow/env
conda env create -n nanopore-workflow -f nanopore-workflow_env.yml
conda env create -n medaka -f medaka_env.yml
conda env create -n quast -f quast_env.yml
conda env create -n R_env -f R_env.yml
conda activate nanopore-workflow

Before running the workflow, you will need to add the bin directories of the four environments to your PATH variable:

export PATH="/path/to/conda/envs/nanopore-workflow/bin:$PATH"
export PATH="/path/to/conda/envs/medaka/bin:$PATH"
export PATH="/path/to/conda/envs/quast/bin:$PATH"
export PATH="/path/to/conda/envs/R_env/bin:$PATH"
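After exporting the paths, it can be worth confirming that the shell actually finds the tools. The sketch below uses a hypothetical function name, missing_tools, and takes the executable names as arguments, because the exact command name can differ from the Conda package name.

```shell
# Sketch: print each given executable that cannot be found on PATH.
# Empty output means every name resolved.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Example usage with a few of the tools listed in this README:
# missing_tools snakemake bcftools bedtools nanopolish whatshap
```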

nanopore-workflow dependencies:

  • bcftools
  • bedtools
  • cutesv
  • flye
  • longshot
  • nanofilt v2.8.0
  • nanoplot v1.20.0
  • nanopolish
  • seaborn v0.10.0
  • snakemake
  • sniffles
  • survivor
  • svim
  • whatshap
  • winnowmap

R_env dependencies:

  • bioconductor-karyoploter
  • bioconductor-txdb.hsapiens.ucsc.hg38.knowngene
  • bioconductor-org.hs.eg.db
  • bioconductor-dss
  • r-tidyverse