A spatial genome aligner for analyzing multiplexed DNA-FISH imaging data.

Related tags

Deep Learningjie
Overview

jie

jie is a spatial genome aligner. This package parses true chromatin imaging signal from noise by aligning signals to a reference DNA polymer model.

The codename is a tribute to the Chinese homophones:

  • 结 (jié) : a knot, a nod to the mysterious and often entangled structures of DNA
  • 解 (jiĕ) : to solve, to untie, our bid to uncover these structures amid noise and uncertainty
  • 姐 (jiĕ) : sister, our ability to resolve tightly paired replicated chromatids

Installation

Step 1 - Clone this repository:

git clone https://github.com/b2jia/jie.git
cd jie

Step 2 - Create a new conda environment and install dependencies:

conda create --name jie -f environment.yml
conda activate jie

Step 3 - Install jie:

pip install -e .

To test, run:

python -W ignore test/test_jie.py

Usage

jie is an exposition of chromatin tracing using polymer physics. The main function of this package is to illustrate the utility and power of spatial genome alignment.

jie is NOT an all-purpose spatial genome aligner. Chromatin imaging is a nascent field and data collection is still being standardized. This aligner may not be compatible with different imaging protocols and data formats, among other variables.

We provide a vignette under jie/jupyter/, with emphasis on inspectability. This walks through the intuition of our spatial genome alignment and polymer fiber karyotyping routines:

00-spatial-genome-alignment-walk-thru.ipynb

We also provide a series of Jupyter notebooks (jie/jupyter/), with emphasis on reproducibility. This reproduces figures from our accompanying manuscript:

01-seqFISH-plus-mouse-ESC-spatial-genome-alignment.ipynb
02-seqFISH-plus-mouse-ESC-polymer-fiber-karyotyping.ipynb
03-seqFISH-plus-mouse-brain-spatial-genome-alignment.ipynb
04-seqFISH-plus-mouse-brain-polymer-fiber-karyotyping.ipynb
05-bench-mark-spatial-genome-agignment-against-chromatin-tracing-algorithm.ipynb

A command-line tool forthcoming.

Motivation

Multiplexed DNA-FISH is a powerful imaging technology that enables us to peer directly at the spatial location of genes inside the nucleus. Each gene appears as tiny dot under imaging.

Pivotally, figuring out which dots are physically linked would trace out the structure of chromosomes. Unfortunately, imaging is noisy, and single-cell biology is extremely variable. The two confound each other, making chromatin tracing prohibitively difficult!

For instance, in a diploid cell line with two copies of a gene we expect to see two spots. But what happens when we see:

  • Extra signals:
    • Is it noise?
      • Off-target labeling: The FISH probes might inadvertently label an off-target gene
    • Or is it biological variation?
      • Aneuploidy: A cell (ie. cancerous cell) may have more than one copy of a gene
      • Cell cycle: When a cell gets ready to divide, it duplicates its genes
  • Missing signals:
    • Is it noise?
      • Poor probe labeling: The FISH probes never labeled the intended target gene
    • Or is it biological variation?
      • Copy Number Variation: A cell may have a gene deletion

If true signal and noise are indistinguishable, how do we know we are selecting true signals during chromatin tracing? It is not obvious which spots should be connected as part of a chromatin fiber. This dilemma was first aptly characterized by Ross et al. (https://journals.aps.org/pre/abstract/10.1103/PhysRevE.86.011918), which is nothing short of prescient...!

jie is, conceptually, a spatial genome aligner that disambiguates spot selection by checking each imaged signal against a reference polymer physics model of chromatin. It relies on the key insight that the spatial separation between two genes should be congruent with its genomic separation.

It makes no assumptions about the expected copy number of a gene, and when it traces chromatin it does so instead by evaluating the physical likelihood of the chromatin fiber. In doing so, we can uncover copy number variations and even sister chromatids from multiplexed DNA-FISH imaging data.

Citation

Contact

Author: Bojing (Blair) Jia
Email: b2jia at eng dot ucsd dot edu
Position: MD-PhD Student, Ren Lab

For other work related to single-cell biology, 3D genome, and chromatin imaging, please visit Prof. Bing Ren's website: http://renlab.sdsc.edu/

Owner
Bojing Jia
How do we better describe the world around us?
Bojing Jia
Newt - a Gaussian process library in JAX.

Newt __ \/_ (' \`\ _\, \ \\/ /`\/\ \\ \ \\

AaltoML 0 Nov 02, 2021
Simple tutorials on Pytorch DDP training

pytorch-distributed-training Distribute Dataparallel (DDP) Training on Pytorch Features Easy to study DDP training You can directly copy this code for

Ren Tianhe 188 Jan 06, 2023
Pmapper is a super-resolution and deconvolution toolkit for python 3.6+

pmapper pmapper is a super-resolution and deconvolution toolkit for python 3.6+. PMAP stands for Poisson Maximum A-Posteriori, a highly flexible and a

NASA Jet Propulsion Laboratory 8 Nov 06, 2022
Code base for NeurIPS 2021 publication titled Kernel Functional Optimisation (KFO)

KernelFunctionalOptimisation Code base for NeurIPS 2021 publication titled Kernel Functional Optimisation (KFO) We have conducted all our experiments

2 Jun 29, 2022
This is the official code for the paper "Learning with Nested Scene Modeling and Cooperative Architecture Search for Low-Light Vision"

RUAS This is the official code for the paper "Learning with Nested Scene Modeling and Cooperative Architecture Search for Low-Light Vision" A prelimin

Vision & Optimization Group (VOG) 2 May 05, 2022
Repository for the paper "From global to local MDI variable importances for random forests and when they are Shapley values"

From global to local MDI variable importances for random forests and when they are Shapley values Antonio Sutera ( Antonio Sutera 3 Feb 23, 2022

Data Augmentation with Variational Autoencoders

Documentation Pyraug This library provides a way to perform Data Augmentation using Variational Autoencoders in a reliable way even in challenging con

112 Nov 30, 2022
CMP 414/765 course repository for Spring 2022 semester

CMP414/765: Artificial Intelligence Spring2021 This is the GitHub repository for course CMP 414/765: Artificial Intelligence taught at The City Univer

ch00226855 4 May 16, 2022
An University Project of Quera Web Crawling.

WebCrawlerProject An University Project of Quera Web Crawling. خزشگر اینستاگرام در این پروژه شما باید با استفاده از کتابخانه های زیر یک خزشگر اینستاگر

Mahdi 3 Aug 12, 2022
[ICCV 2021] Our work presents a novel neural rendering approach that can efficiently reconstruct geometric and neural radiance fields for view synthesis.

MVSNeRF Project page | Paper This repository contains a pytorch lightning implementation for the ICCV 2021 paper: MVSNeRF: Fast Generalizable Radiance

Anpei Chen 529 Dec 30, 2022
Simple-Neural-Network From Scratch in Python

Simple-Neural-Network From Scratch in Python This is a simple Neural Network created without any Machine Learning Libraries. The only dependencies are

Aum Shah 1 Dec 28, 2021
Pointer networks Tensorflow2

Pointer networks Tensorflow2 原文:https://arxiv.org/abs/1506.03134 仅供参考与学习,内含代码备注 环境 tensorflow==2.6.0 tqdm matplotlib numpy 《pointer networks》阅读笔记 应用场景

HUANG HAO 7 Oct 27, 2022
As-ViT: Auto-scaling Vision Transformers without Training

As-ViT: Auto-scaling Vision Transformers without Training [PDF] Wuyang Chen, Wei Huang, Xianzhi Du, Xiaodan Song, Zhangyang Wang, Denny Zhou In ICLR 2

VITA 68 Sep 05, 2022
NLMpy - A Python package to create neutral landscape models

NLMpy is a Python package for the creation of neutral landscape models that are widely used by landscape ecologists to model ecological patterns

Manaaki Whenua – Landcare Research 1 Oct 08, 2022
WORD: Revisiting Organs Segmentation in the Whole Abdominal Region

WORD: Revisiting Organs Segmentation in the Whole Abdominal Region (Paper and DataSet). [New] Note that all the emails about the download permission o

Healthcare Intelligence Laboratory 71 Dec 22, 2022
Efficient Deep Learning Systems course

Efficient Deep Learning Systems This repository contains materials for the Efficient Deep Learning Systems course taught at the Faculty of Computer Sc

Max Ryabinin 173 Dec 29, 2022
Code for the paper 'A High Performance CRF Model for Clothes Parsing'.

Clothes Parsing Overview This code provides an implementation of the research paper: A High Performance CRF Model for Clothes Parsing Edgar Simo-S

Edgar Simo-Serra 119 Nov 21, 2022
[ICLR'21] Counterfactual Generative Networks

This repository contains the code for the ICLR 2021 paper "Counterfactual Generative Networks" by Axel Sauer and Andreas Geiger. If you want to take the CGN for a spin and generate counterfactual ima

88 Jan 02, 2023
Everything you need to know about NumPy( Creating Arrays, Indexing, Math,Statistics,Reshaping).

Everything you need to know about NumPy( Creating Arrays, Indexing, Math,Statistics,Reshaping).

1 Feb 14, 2022
Implementation of PyTorch-based multi-task pre-trained models

mtdp Library containing implementation related to the research paper "Multi-task pre-training of deep neural networks for digital pathology" (Mormont

Romain Mormont 27 Oct 14, 2022