HiFi DeepVariant + WhatsHap workflowHiFi DeepVariant + WhatsHap workflow

Overview

HiFi DeepVariant + WhatsHap workflow

Workflow steps

  1. align HiFi reads to reference with pbmm2
  2. call small variants with DeepVariant, using two-pass method (DeepVariant ➡️ WhatsHap phase ➡️ WhatsHap haplotag ➡️ DeepVariant)
  3. phase small variants with WhatsHap
  4. haplotag aligned BAMs with WhatsHap and merge

Directory structure within basedir

.
├── cluster_logs  # slurm stderr/stdout logs
├── reference
│   ├── reference.chr_lengths.txt  # cut -f1,2 reference.fasta > reference.chr_lengths.txt
│   ├── reference.fasta
│   └── reference.fasta.fai
├── samples
│   └── 
   
      # sample_id regex: r'[A-Za-z0-9_-]+'
│       ├── whatshap/  # phased small variants; merged haplotagged alignments
│       ├── logs/  # per-rule stdout/stderr logs
│       ├── aligned/  # intermediate
│       ├── deepvariant/  # intermediate
│       ├── deepvariant_intermediate/  # intermediate
│       └── whatshap_intermediate/  # intermediate
├── smrtcells
│   ├── done  # move folders from smrtcells/ready to smrtcells/done to prevent re-processing
│   └── ready
│       └── 
    
       # uBAMs or FASTQs per sample
│                        # filename regex: r'm\d{5}[Ue]?_\d{6}_\d{6}).(ccs|hifi_reads).bam' or r'm\d{5}[Ue]?_\d{6}_\d{6}).fastq.gz'
└── workflow  # clone of this repo

    
   

To run the pipeline

$ conda create \
    --channel bioconda \
    --channel conda-forge \
    --prefix ./conda_env \
    python=3 snakemake mamba lockfile

$ conda activate ./conda_env

$ sbatch workflow/run_snakemake.sh <sample_id>
Owner
William Rowell
Geneticist & 1x programmer on a good day. BFX applications @PacificBiosciences.
William Rowell
Chinese Pre-Trained Language Models (CPM-LM) Version-I

CPM-Generate 为了促进中文自然语言处理研究的发展,本项目提供了 CPM-LM (2.6B) 模型的文本生成代码,可用于文本生成的本地测试,并以此为基础进一步研究零次学习/少次学习等场景。[项目首页] [模型下载] [技术报告] 若您想使用CPM-1进行推理,我们建议使用高效推理工具BMI

Tsinghua AI 1.4k Jan 03, 2023
PUA Programming Language written in Python.

pua-lang PUA Programming Language written in Python. Installation git clone https://github.com/zhaoyang97/pua-lang.git cd pua-lang pip install . Try

zy 4 Feb 19, 2022
Subtitle Workshop (subshop): tools to download and synchronize subtitles

SUBSHOP Tools to download, remove ads, and synchronize subtitles. SUBSHOP Purpose Limitations Required Web Credentials Installation, Configuration, an

Joe D 4 Feb 13, 2022
ElasticBERT: A pre-trained model with multi-exit transformer architecture.

This repository contains finetuning code and checkpoints for ElasticBERT. Towards Efficient NLP: A Standard Evaluation and A Strong Baseli

fastNLP 48 Dec 14, 2022
Official code repository of the paper Linear Transformers Are Secretly Fast Weight Programmers.

Linear Transformers Are Secretly Fast Weight Programmers This repository contains the code accompanying the paper Linear Transformers Are Secretly Fas

Imanol Schlag 77 Dec 19, 2022
A Plover python dictionary allowing for consistent symbol input with specification of attachment and capitalisation in one stroke.

Emily's Symbol Dictionary Design This dictionary was created with the following goals in mind: Have a consistent method to type (pretty much) every sy

Emily 68 Jan 07, 2023
Code for Findings at EMNLP 2021 paper: "Learn Continually, Generalize Rapidly: Lifelong Knowledge Accumulation for Few-shot Learning"

Learn Continually, Generalize Rapidly: Lifelong Knowledge Accumulation for Few-shot Learning This repo is for Findings at EMNLP 2021 paper: Learn Cont

INK Lab @ USC 6 Sep 02, 2022
Code for the paper "Flexible Generation of Natural Language Deductions"

Code for the paper "Flexible Generation of Natural Language Deductions"

Kaj Bostrom 12 Nov 11, 2022
A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)

MMF is a modular framework for vision and language multimodal research from Facebook AI Research. MMF contains reference implementations of state-of-t

Facebook Research 5.1k Dec 26, 2022
Yodatranslator is a simple translator English to Yoda-language

yodatranslator Overview yodatranslator is a simple translator English to Yoda-language. Project is created for educational purposes. It is intended to

1 Nov 11, 2021
Code for paper "Role-oriented Network Embedding Based on Adversarial Learning between Higher-order and Local Features"

Role-oriented Network Embedding Based on Adversarial Learning between Higher-order and Local Features Train python main.py --dataset brazil-flights C

wang zhang 0 Jun 28, 2022
💬 Open source machine learning framework to automate text- and voice-based conversations: NLU, dialogue management, connect to Slack, Facebook, and more - Create chatbots and voice assistants

Rasa Open Source Rasa is an open source machine learning framework to automate text-and voice-based conversations. With Rasa, you can build contextual

Rasa 15.3k Jan 03, 2023
A list of NLP(Natural Language Processing) tutorials built on Tensorflow 2.0.

A list of NLP(Natural Language Processing) tutorials built on Tensorflow 2.0.

Won Joon Yoo 335 Jan 04, 2023
Source code of paper "BP-Transformer: Modelling Long-Range Context via Binary Partitioning"

BP-Transformer This repo contains the code for our paper BP-Transformer: Modeling Long-Range Context via Binary Partition Zihao Ye, Qipeng Guo, Quan G

Zihao Ye 119 Nov 14, 2022
⛵️The official PyTorch implementation for "BERT-of-Theseus: Compressing BERT by Progressive Module Replacing" (EMNLP 2020).

BERT-of-Theseus Code for paper "BERT-of-Theseus: Compressing BERT by Progressive Module Replacing". BERT-of-Theseus is a new compressed BERT by progre

Kevin Canwen Xu 284 Nov 25, 2022
Implementation for paper BLEU: a Method for Automatic Evaluation of Machine Translation

BLEU Score Implementation for paper: BLEU: a Method for Automatic Evaluation of Machine Translation Author: Ba Ngoc from ProtonX BLEU score is a popul

Ngoc Nguyen Ba 6 Oct 07, 2021
ChatterBot is a machine learning, conversational dialog engine for creating chat bots

ChatterBot ChatterBot is a machine-learning based conversational dialog engine build in Python which makes it possible to generate responses based on

Gunther Cox 12.8k Jan 03, 2023
HAIS_2GNN: 3D Visual Grounding with Graph and Attention

HAIS_2GNN: 3D Visual Grounding with Graph and Attention This repository is for the HAIS_2GNN research project. Tao Gu, Yue Chen Introduction The motiv

Yue Chen 1 Nov 26, 2022
Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code

simple_diarizer Simplified diarization pipeline using some pretrained models. Made to be a simple as possible to go from an input audio file to diariz

Chau 65 Dec 30, 2022
Paddlespeech Streaming ASR GUI

Paddlespeech-Streaming-ASR-GUI Introduction A paddlespeech Streaming ASR GUI. Us

Niek Zhen 3 Jan 05, 2022