produces PCA on genotypes from fasta files (popPhyl's ID format)

Overview

popPhyl_PCA

Performs PCA of genotypes.
Works in two steps.

1. Input file

A single fasta file containing different loci, in different populations/species. Not necessarily sorted.
The ID (the line starting by >) of each sequence has to respect the following format:
`

E24_99631_p1|arabidopsis|E15|Allele_1 NNNNNNNNNNNAAAGAAGATGGCGTCGGCAGTTTCAGTATCGTTTATTGTGGTGAATATT TTGCTTCTCCTGGTTCAGGTCTTTGCTGGGAGAGACTTTTACAAAATATTGGGAGTTCCC AGAAACGCCGATTTGAAACAAATCAAGCGATCCTATCGAAAGCTGGCCAAAGAACTCCAC CCAGATAAGAACAAAGATGATCCTGAAGCAGAACAAAGATTTCAAGACTTAGGTGCTGCT ` Four different fields separated by a pipe (|), where:

  1. first field is the locus name (E24_99631_p1).
  2. second field is the species name (arabidopsis).
  3. third field is the name of the sampled diploid individual (E15).
  4. fourth field is the name of the allele (two alleles per individual, named either Allele_1 or Allele_2)

1. PCA

Single python command line (popphyl2PCA.py).
Before, you need to have these python dependencies available:

  1. pandas
  2. sklearn
  3. biopython

python3 ~/Programmes/popPhyl_PCA/popphyl2PCA.py [name of the subdirectory created by the script where output files will be written] [name of the input fasta file]

Example:
python3 ~/Programmes/popPhyl_PCA/popphyl2PCA.py ~/Documents/PCA/testPCA ~/Programmes/popPhyl_PCA/test.fas
Can takes between 10 minutes and 2 hours, depending on the number of SNPs and individuals.

2. vizualisation

Little Shiny interface (plotPCA.R).
Before, you need to have these R dependencies available:

  1. shiny
  2. plotly
  3. tidyverse
  4. shinycssloaders

Then, in R:

  1. source(~/Programmes/popPhyl_PCA/plotPCA.R)
  2. shinyApp(ui=ui, server=server)
  3. upload the files with coordinates (table_coord_PCA_genotypes.txt) and eigen values (table_eigen_PCA_genotypes.txt)
Owner
camille roux
PostDoc in Population Genomics; Speciation; Hybridization; Evolution of sex chromosomes; Backward+forward simulations.
camille roux
Keval allows you to call arbitrary Windows kernel-mode functions from user mode, even (and primarily) on another machine.

Keval Keval allows you to call arbitrary Windows kernel-mode functions from user mode, even (and primarily) on another machine. The user mode portion

42 Dec 17, 2022
Python code to remove empty folders from Windows/Android.

Empty Folder Cleaner is a program that deletes empty folders from your computer or device and removes clutter to improve performance. It supports only windows and android for now.

Dark Coder Cat | Vansh 4 Sep 27, 2022
Animation retargeting tool for Autodesk Maya. Retargets mocap to a custom rig with a few clicks.

Animation Retargeting Tool for Maya A tool for transferring animation data and mocap from a skeleton to a custom rig in Autodesk Maya. Installation: A

Joaen 63 Jan 06, 2023
Check the basic quality of any dataset

Data Quality Checker in Python Check the basic quality of any dataset. Sneak Peek Read full tutorial at Medium. Explore the app Requirements python 3.

MalaDeep 8 Feb 23, 2022
Dice Rolling Simulator using Python-random

Dice Rolling Simulator As the name of the program suggests, we will be imitating a rolling dice. This is one of the interesting python projects and wi

PyLaboratory 1 Feb 02, 2022
A utility that makes it easy to work with Python projects containing lots of packages, of which you only want to develop some.

Mixed development source packages on top of stable constraints using pip mxdev [mɪks dɛv] is a utility that makes it easy to work with Python projects

BlueDynamics Alliance 6 Jun 08, 2022
New time-based UUID formats which are suited for use as a database key

uuid6 New time-based UUID formats which are suited for use as a database key. This module extends immutable UUID objects (the UUID class) with the fun

26 Dec 30, 2022
✨ Un générateur de mot de passe aléatoire totalement fait en Python par moi, et en français.

Password Generator ❗ Un générateur de mot de passe aléatoire totalement fait en Python par moi, et en français. 🔮 Grâce a une au module random et str

MrGabin 3 Jul 29, 2021
A BlackJack simulator in Python to simulate thousands or millions of hands using different strategies.

BlackJack Simulator (in Python) A BlackJack simulator to play any number of hands using different strategies The Rules To keep the code relatively sim

Hamid 4 Jun 24, 2022
✨ Un bot Twitter totalement fait en Python par moi, et en français.

Twitter Bot ❗ Un bot Twitter totalement fait en Python par moi, et en français. Il faut remplacer auth = tweepy.OAuthHandler(consumer_key, consumer_se

MrGabin 3 Jun 06, 2021
A repo for working with and building daos

DAO Mix DAO Mix About How to DAO No Code Tools Getting Started Prerequisites Installation Usage On-Chain Governance Example Off-Chain governance Examp

Brownie Mixes 86 Dec 19, 2022
pydsinternals - A Python native library containing necessary classes, functions and structures to interact with Windows Active Directory.

pydsinternals - Directory Services Internals Library A Python native library containing necessary classes, functions and structures to interact with W

Podalirius 36 Dec 14, 2022
A tool to create the basics of a project

Project-Scheduler Instalação Para instalar o Project Maker, você necessita está em um ambiente de desenvolvimento Linux ou wsl com alguma distro debia

2 Dec 17, 2021
A simple tool to extract python code from a Jupyter notebook, and then run pylint on it for static analysis.

Jupyter Pylinter A simple tool to extract python code from a Jupyter notebook, and then run pylint on it for static analysis. If you find this tool us

Edmund Goodman 10 Oct 13, 2022
Course-parsing - Parsing Course Info for NIT Kurukshetra

Parsing Course Info for NIT Kurukshetra Overview This repository houses code for

Saksham Mittal 3 Feb 03, 2022
aws ec2.py companion script to generate sshconfigs with auto bastion host discovery

ec2-bastion-sshconfig This script will interate over instances found by ec2.py and if those instances are not publically accessible it will search the

Steve Melo 1 Sep 11, 2022
Tools for binary data on cassette

Micro Manchester Tape Storage Tools for storing binary data on cassette Includes: Python script for encoding Arduino sketch for decoding Eagle CAD fil

Zack Nelson 28 Dec 25, 2022
RapidFuzz is a fast string matching library for Python and C++

RapidFuzz is a fast string matching library for Python and C++, which is using the string similarity calculations from FuzzyWuzzy

Max Bachmann 1.7k Jan 04, 2023
It is a tool that looks for a specific username in social networks

It is a tool that looks for a specific username in social networks

MasterBurnt 6 Oct 07, 2022
This is a package that allows you to create a key-value vault for storing variables in a global context

This is a package that allows you to create a key-value vault for storing variables in a global context. It allows you to set up a keyring with pre-defined constants which act as keys for the vault.

Data Ductus 2 Dec 14, 2022