Data and analysis code for an MS on SK VOC genomes phenotyping/neutralisation assays

Overview

Description

image

Summary of phylogenomic methods and analyses used in "Immunogenicity of convalescent and vaccinated sera against clinical isolates of ancestral SARS-CoV-2, Beta, Delta, and Omicron variants"

Methods

Raw reads underwent adapter/quality trimming (trim-galore v0.6.5 [citation: https://github.com/FelixKrueger/TrimGalore]), host filtering and read mapping to reference (bwa v0.7.17 [citation: arXiv:1303.3997v2 ], samtools v.1.7 [citation: 10.1093/bioinformatics/btp352]) trimming of primers (iVar v1.3 [citation:10.1186/s13059-018-1618-7]) and variant/consensus calling (freebayes v1.3.2 [citation: arXiv:1207.3907]) using the SIGNAL workflow (https://github.com/jaleezyy/covid-19-signal) v1.4.4dev (#60dd466) [citation: doi.org/10.3390/v12080895] with the ARTICv4 amplicon scheme (from https://github.com/artic-network/artic-ncov2019) and the MN908947.3 SARS-CoV-2 reference genome and annotations. Additional quality control and variant effect annotation (SnpEff v5.0-0 [citation:0.4161/fly.19695]) was performed using the ncov-tools v1.8.0 (https://github.com/jts/ncov-tools/). Finally, PANGO lineages were assigned to consensus sequences using pangolin v3.1.17 (with the PangoLEARN v2021-12-06 models) [citation:10.1093/ve/veab064], scorpio v0.3.16 (with constellations v0.1.1) [citation: https://github.com/cov-lineages/scorpio], and PANGO-designations v1.2.117 [citation:10.1038/s41564-020-0770-5]. Variants were summarised using PyVCF v0.6.8 [citation:https://github.com/jamescasbon/PyVCF] and pandas v1.2.4 [citation:10.25080/Majora-92bf1922-00a]. Phylogenetic analysis was performed using augur v13.1.0 [citation: 10.21105/joss.02906] with IQTree (v2.2.0beta) [citation:10.1093/molbev/msaa015] and the resulting phylogenetic figure generated using ETE v3.1.2 [citation: 10.1093/molbev/msw046]. Contexual sequences were incorporated into the phylogenetic analysis by using Nexstrain's ingested GISAID metadata and pandas to randomly sample a representative subset of sequences (jointly deposited in NCBI and GISAID) that belonged to lineages observed in Canada (see sequences_used_in_tree_with_acknowledgements.tsv for metadata and acknowledgements).

File Description

  • 20220101_MN01513_WGS114_DEC31SRI_CK_summary_valid_negative_pass_only.tsv ncov-tools generate QC summary

  • sk_variant_summary.ipynb notebook containing code to summarise variants (tables/variant_percentage_read_support_protein_nonsynonymous_only.tsv and graphic figures/intermediate/spike_mutation_table_styled.png) and subsample representative genomes phlyogeny/seqs/open_context_genomes.fasta from GISAID (nextstrain ingested fasta and metadata from 2021-12-31: metadata_2021-12-31_17-29.tsv.gz and sequences_fasta_2022_01_03.tar.xz)

  • genomes/ Consensus sequences generated by FreeBayes via SIGNAL.

  • variants/ ncov-tools SnpEff annotated SIGNAL FreeBayes VCFs

  • phylogeny data used to generate annotated phylogeny with augur

  • phylogeny/tree.sh script used to generate phylogeny

  • phylogeny/seqs sequences used for phlyogeny

  • phylogeny/data reference data for phylogeny

  • phylogeny/augur phylogeny and intermediate files

  • phlyogeny/viz_tree.py ete3 based script to generate phylogeny figure (tree.svg)

  • figure files for generating result plot

  • figure/phylo_variant_figure.* final figure combining tree.svg and spike_mutation_table_styled.png

  • figure/intermediate/tree.svg rendered SVG of phylogeny

  • figure/intermediate/spike_mutation_table_styled.png rendered summary of variants

  • tables set of tables for manuscript

  • tables/sequences_used_in_tree_with_acknowledgements.tsv ncov-ingest metadata with acknowledgements

  • tables/variant_percentage_read_support_protein_nonsynonymous_only.tsv summary of variants

You might also like...
Vertical Federated Principal Component Analysis and Its Kernel Extension on Feature-wise Distributed Data based on Pytorch Framework
Vertical Federated Principal Component Analysis and Its Kernel Extension on Feature-wise Distributed Data based on Pytorch Framework

VFedPCA+VFedAKPCA This is the official source code for the Paper: Vertical Federated Principal Component Analysis and Its Kernel Extension on Feature-

BisQue is a web-based platform designed to provide researchers with organizational and quantitative analysis tools for 5D image data. Users can extend BisQue by implementing containerized ML workflows.
BisQue is a web-based platform designed to provide researchers with organizational and quantitative analysis tools for 5D image data. Users can extend BisQue by implementing containerized ML workflows.

Overview BisQue is a web-based platform specifically designed to provide researchers with organizational and quantitative analysis tools for up to 5D

Easily pull telemetry data and create beautiful visualizations for analysis.
Easily pull telemetry data and create beautiful visualizations for analysis.

This repository is a work in progress. Anything and everything is subject to change. Porpo Table of Contents Porpo Table of Contents General Informati

Code for reproducing our analysis in the paper titled: Image Cropping on Twitter: Fairness Metrics, their Limitations, and the Importance of Representation, Design, and Agency
Code for reproducing our analysis in the paper titled: Image Cropping on Twitter: Fairness Metrics, their Limitations, and the Importance of Representation, Design, and Agency

Image Crop Analysis This is a repo for the code used for reproducing our Image Crop Analysis paper as shared on our blog post. If you plan to use this

Request execution of Galaxy SARS-CoV-2 variation analysis workflows on input data you provide.
Request execution of Galaxy SARS-CoV-2 variation analysis workflows on input data you provide.

SARS-CoV-2 processing requests Request execution of Galaxy SARS-CoV-2 variation analysis workflows on input data you provide. Prerequisites This autom

 TagLab: an image segmentation tool oriented to marine data analysis
TagLab: an image segmentation tool oriented to marine data analysis

TagLab: an image segmentation tool oriented to marine data analysis TagLab was created to support the activity of annotation and extraction of statist

Deep Learning applied to Integral data analysis

DeepIntegralCompton Deep Learning applied to Integral data analysis Module installation Move to the root directory of the project and execute : pip in

The source code for the Cutoff data augmentation approach proposed in this paper: "A Simple but Tough-to-Beat Data Augmentation Approach for Natural Language Understanding and Generation".

Cutoff: A Simple Data Augmentation Approach for Natural Language This repository contains source code necessary to reproduce the results presented in

Code for the paper A Theoretical Analysis of the Repetition Problem in Text Generation
Code for the paper A Theoretical Analysis of the Repetition Problem in Text Generation

A Theoretical Analysis of the Repetition Problem in Text Generation This repository share the code for the paper "A Theoretical Analysis of the Repeti

Releases(v0.1.1)
Owner
Finlay Maguire
Assistant Professor (Computer Science & Epidemiology). Working on infectious disease genomic epidemiology & data-driven solutions to social crises
Finlay Maguire
Official Code for "Non-deep Networks"

Non-deep Networks arXiv:2110.07641 Ankit Goyal, Alexey Bochkovskiy, Jia Deng, Vladlen Koltun Overview: Depth is the hallmark of DNNs. But more depth m

Ankit Goyal 567 Dec 12, 2022
Python script for performing depth completion from sparse depth and rgb images using the msg_chn_wacv20. model in ONNX

ONNX msg_chn_wacv20 depth completion Python script for performing depth completion from sparse depth and rgb images using the msg_chn_wacv20 model in

Ibai Gorordo 19 Oct 22, 2022
AdaNet is a lightweight TensorFlow-based framework for automatically learning high-quality models with minimal expert intervention

AdaNet is a lightweight TensorFlow-based framework for automatically learning high-quality models with minimal expert intervention. AdaNet buil

3.4k Jan 07, 2023
This Repo is the official CUDA implementation of ICCV 2019 Oral paper for CARAFE: Content-Aware ReAssembly of FEatures

Introduction This Repo is the official CUDA implementation of ICCV 2019 Oral paper for CARAFE: Content-Aware ReAssembly of FEatures. @inproceedings{Wa

Jiaqi Wang 42 Jan 07, 2023
Utility tools for the "Divide and Remaster" dataset, introduced as part of the Cocktail Fork problem paper

Divide and Remaster Utility Tools Utility tools for the "Divide and Remaster" dataset, introduced as part of the Cocktail Fork problem paper The DnR d

Darius Petermann 46 Dec 11, 2022
Chatbot in 200 lines of code using TensorLayer

Seq2Seq Chatbot This is a 200 lines implementation of Twitter/Cornell-Movie Chatbot, please read the following references before you read the code: Pr

TensorLayer Community 820 Dec 17, 2022
Shape-aware Semi-supervised 3D Semantic Segmentation for Medical Images

SASSnet Code for paper: Shape-aware Semi-supervised 3D Semantic Segmentation for Medical Images(MICCAI 2020) Our code is origin from UA-MT You can fin

klein 125 Jan 03, 2023
The official repository for our paper "The Neural Data Router: Adaptive Control Flow in Transformers Improves Systematic Generalization".

Codebase for learning control flow in transformers The official repository for our paper "The Neural Data Router: Adaptive Control Flow in Transformer

Csordás Róbert 24 Oct 15, 2022
Happywhale - Whale and Dolphin Identification Silver🥈 Solution (26/1588)

Kaggle-Happywhale Happywhale - Whale and Dolphin Identification Silver 🥈 Solution (26/1588) 竞赛方案思路 图像数据预处理-标志性特征图片裁剪:首先根据开源的标注数据训练YOLOv5x6目标检测模型,将训练集

Franxx 20 Nov 14, 2022
Medical-Image-Triage-and-Classification-System-Based-on-COVID-19-CT-and-X-ray-Scan-Dataset

Medical-Image-Triage-and-Classification-System-Based-on-COVID-19-CT-and-X-ray-Sc

2 Dec 26, 2021
A repository for the paper "Improved Adversarial Systems for 3D Object Generation and Reconstruction".

Improved Adversarial Systems for 3D Object Generation and Reconstruction: This is a repository for the paper "Improved Adversarial Systems for 3D Obje

Edward Smith 188 Dec 25, 2022
[NeurIPS 2021] Garment4D: Garment Reconstruction from Point Cloud Sequences

Garment4D [PDF] | [OpenReview] | [Project Page] Overview This is the codebase for our NeurIPS 2021 paper Garment4D: Garment Reconstruction from Point

Fangzhou Hong 112 Dec 23, 2022
This repository contains source code for the Situated Interactive Language Grounding (SILG) benchmark

SILG This repository contains source code for the Situated Interactive Language Grounding (SILG) benchmark. If you find this work helpful, please cons

Victor Zhong 17 Nov 27, 2022
Our VMAgent is a platform for exploiting Reinforcement Learning (RL) on Virtual Machine (VM) scheduling tasks.

VMAgent is a platform for exploiting Reinforcement Learning (RL) on Virtual Machine (VM) scheduling tasks. VMAgent is constructed based on one month r

56 Dec 12, 2022
[CVPR'21] Projecting Your View Attentively: Monocular Road Scene Layout Estimation via Cross-view Transformation

Projecting Your View Attentively: Monocular Road Scene Layout Estimation via Cross-view Transformation Weixiang Yang, Qi Li, Wenxi Liu, Yuanlong Yu, Y

118 Dec 26, 2022
VarCLR: Variable Semantic Representation Pre-training via Contrastive Learning

    VarCLR: Variable Representation Pre-training via Contrastive Learning New: Paper accepted by ICSE 2022. Preprint at arXiv! This repository contain

squaresLab 32 Oct 24, 2022
CVAT is free, online, interactive video and image annotation tool for computer vision

Computer Vision Annotation Tool (CVAT) CVAT is free, online, interactive video and image annotation tool for computer vision. It is being used by our

OpenVINO Toolkit 8.6k Jan 04, 2023
A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning

CLEVR Dataset Generation This is the code used to generate the CLEVR dataset as described in the paper: CLEVR: A Diagnostic Dataset for Compositional

Facebook Research 503 Jan 04, 2023
Lipstick ain't enough: Beyond Color-Matching for In-the-Wild Makeup Transfer (CVPR 2021)

Table of Content Introduction Datasets Getting Started Requirements Usage Example Training & Evaluation CPM: Color-Pattern Makeup Transfer CPM is a ho

VinAI Research 248 Dec 13, 2022
On-device speech-to-intent engine powered by deep learning

Rhino Made in Vancouver, Canada by Picovoice Rhino is Picovoice's Speech-to-Intent engine. It directly infers intent from spoken commands within a giv

Picovoice 510 Dec 30, 2022