Python3 command-line tool for the inference of Boolean rules and pathway analysis on omics data

Overview

BONITA-Python3

BONITA was originally written in Python 2 and tested with Python 2-compatible packages. This version of the package ports BONITA to Python 3. Functionality remains the same. However, we refer users to the original release to reproduce figures from the BONITA paper.

BONITA (Boolean Omics Network Invariant-Time Analysis) is a package for the inference of Boolean rules and pathway analysis on omics data. It can be applied to help uncover underlying relationships in biological data. Please see our publication for more information.

Authors: Rohith Palli (https://www.github.com/rpalli), Mukta G. Palshikar and Juilee Thakar

**BONITA ported to Python 3 by Mukta G. Palshikar (https://www.github.com/mgp13) and Jiayue Meng (https://www.github.com/jiayuemeng)**

For a demonstration of the BONITA pipeline, see the tutorial in Tutorials/BONITA_pipeline_tutorial.md. The instructions in the current README file cover all anticipated use cases.

Maintainer: Please contact Juilee Thakar at [email protected]

Citation

We would appreciate the citation of our manuscript describing the original BONITA release, below, for any use of our code.

Palli R, Palshikar MG, Thakar J (2019) Executable pathway analysis using ensemble discrete-state modeling for large-scale data. PLoS Comput Biol 15(9): e1007317. (https://doi.org/10.1371/journal.pcbi.1007317)

Installation

BONITA is designed for use with distributed computing systems. Necessary SLURM commands are included. If users are having trouble translating to PBS or other queueing standards for their computing environment, please contact Juilee Thakar at [email protected]

Create a conda environment to run BONITA

Use a terminal, or an Anaconda Prompt for the following:

  1. Create a conda environment using the provided YML file

conda env create --name BONITA --file platform_BONITA.yaml

  2. Activate the BONITA environment

conda activate BONITA

  3. Check that the BONITA environment is available and correctly installed:

conda info --envs

Install BONITA

You can download and use BONITA in one of two ways:

  1. Download a zipped folder containing all the files you need (github download link in green box above and to the right)
  2. Clone this git repository in the folder of your choice using the command

git clone https://github.com/YOUR-USERNAME/YOUR-REPOSITORY

Next, the C code must be compiled using the makefile. Run make from within the BONITA folder:

make

Now you have a fully functional distribution of BONITA! Time to gather your data and get started.

Usage

You will need the following files to run BONITA:

  • omics data as a plaintext table (csv, tsv, or similar). The first row contains a placeholder for the gene symbol column followed by the sample names. Each subsequent row contains a gene symbol in the first column and column-normalized abundance measures (rpm or rpkm in transcriptomics) in the remaining columns.
  • gmt file with the list of KEGG pathways to be considered (can be downloaded from MSigDB)
  • matrix of conditions with each line representing a sample. The first column contains the sample names, and each subsequent column contains 1 if the sample is part of that condition and 0 if it is not.
  • list of contrasts you would like to run with each contrast on a single line
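The layout described above can be checked before starting a run. The following is a minimal sketch, not part of BONITA: the function names `read_table` and `check_inputs` are hypothetical, and it assumes that the condition matrix has a header row naming the conditions, which the description above does not state.

```python
import csv
import io


def read_table(text, sep=","):
    """Parse a delimited table into a header row and a list of data rows."""
    rows = [row for row in csv.reader(io.StringIO(text), delimiter=sep) if row]
    return rows[0], rows[1:]


def check_inputs(omics_text, conditions_text, sep=","):
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    header, gene_rows = read_table(omics_text, sep)
    samples = header[1:]  # the first header cell only holds the gene symbol column

    for row in gene_rows:
        if len(row) != len(header):
            problems.append(f"{row[0]}: expected {len(header)} columns, found {len(row)}")
            continue
        try:
            [float(value) for value in row[1:]]
        except ValueError:
            problems.append(f"{row[0]}: non-numeric abundance value")

    # Assumption: the first line of the condition matrix is a header row.
    _, sample_rows = read_table(conditions_text, sep)
    for row in sample_rows:
        name, flags = row[0], row[1:]
        if name not in samples:
            problems.append(f"{name}: sample missing from the omics data header")
        if any(flag.strip() not in ("0", "1") for flag in flags):
            problems.append(f"{name}: condition entries must be 0 or 1")
    return problems


# Hypothetical example data, for illustration only.
omics = "gene,s1,s2\nTP53,10.5,3.2\nEGFR,0.0,7.1\n"
conditions = "sample,treated,control\ns1,1,0\ns2,0,1\n"
print(check_inputs(omics, conditions))  # prints []
```

The sketch only checks structure (column counts, numeric values, matching sample names, 1/0 entries). It does not check normalization or the contents of the contrasts file.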

There are three main steps in BONITA: pathway preparation for rule inference, rule inference, and pathway analysis. All necessary files for an example run are provided in the Tutorials folder. The preparation step requires internet access to reach the KEGG API.

Step 1: Pathway preparation

See the bash script pathwayPreparation.sh for examples

This step requires internet access.

There are three ways to complete this process:

  1. on a gmt of human pathways
  2. on all KEGG pathways for any organism, or
  3. on a list of KEGG pathways for any organism

Only Option 1 was used and tested in our manuscript. Caution should be exercised in interpreting results of the other two methods. At a minimum, the graphml files with impact scores and relative abundance should be examined before drawing conclusions about pathway differences.

Option 1: On a gmt of human pathways

BONITA needs omics data, a gmt file, and an indication of what character is used to separate columns in the file. For example, a traditional comma-separated value (csv) file would need the BONITA input "-sep ,". Since a tab can't be passed in as easily, the -t flag automatically sets tab as the separator. The commands are below:

comma separated: python pathway_analysis_setup.py -gmt Your_gmt_file -sep , Your_omics_data

tab separated: python pathway_analysis_setup.py -t -gmt Your_gmt_file Your_omics_data
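The choice between "-sep ," and -t can be made from the header row of the omics table. The sketch below is one possible way to do this and is not part of BONITA: `setup_command` is a hypothetical helper, the file names are placeholders, and only the flags shown in the commands above are used. It builds the argument list without running it.

```python
import csv
import shlex


def setup_command(omics_file, gmt_file, header_line):
    """Build the Option 1 command for a comma- or tab-separated omics table."""
    # Guess the delimiter from the header row, considering only comma and tab.
    delimiter = csv.Sniffer().sniff(header_line, delimiters=",\t").delimiter
    command = ["python", "pathway_analysis_setup.py"]
    if delimiter == "\t":
        command += ["-t", "-gmt", gmt_file]
    else:
        command += ["-gmt", gmt_file, "-sep", delimiter]
    return command + [omics_file]


# Hypothetical file names, for illustration only.
print(shlex.join(setup_command("data.csv", "pathways.gmt", "gene,s1,s2")))
print(shlex.join(setup_command("data.tsv", "pathways.gmt", "gene\ts1\ts2")))
```

The returned list can be passed to `subprocess.run` from the BONITA folder once the file names point at real data.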

Option 2: On all KEGG pathways for any organism

BONITA needs omics data, an organism code, and an indication of what character is used to separate columns in the file. For example, a traditional comma-separated value (csv) file would need the BONITA input "-sep ,". Since a tab can't be passed in as easily, the -t flag automatically sets tab as the separator. A three-letter organism code from KEGG must be provided in lower case. Example codes include mmu for mouse and hsa for human. The commands are below:

comma separated: python pathway_analysis_setup.py -org Your_org_code -sep , Your_omics_data

comma separated, human: python pathway_analysis_setup.py -org hsa -sep , Your_omics_data

comma separated, mouse: python pathway_analysis_setup.py -org mmu -sep , Your_omics_data

tab separated: python pathway_analysis_setup.py -t -org Your_org_code Your_omics_data

Option 3: On a list of KEGG pathways for any organism

BONITA needs omics data, an organism code, the list of pathways, and an indication of what character is used to separate columns in the file. For example, a traditional comma-separated value (csv) file would need the BONITA input "-sep ,". Since a tab can't be passed in as easily, the -t flag automatically sets tab as the separator. A three-letter organism code from KEGG must be provided in lower case. Example codes include mmu for mouse and hsa for human. The list of pathways must include the 5-digit pathway identifiers, must be separated by commas, and must not include any other numbers. An example paths.txt is included in the inputData folder. The commands are below:

comma separated: python pathway_analysis_setup.py -org Your_org_code -sep , -paths Your_pathway_list Your_omics_data

comma separated, human: python pathway_analysis_setup.py -org hsa -sep , -paths Your_pathway_list Your_omics_data

comma separated, mouse: python pathway_analysis_setup.py -org mmu -sep , -paths Your_pathway_list Your_omics_data

tab separated: python pathway_analysis_setup.py -t -org Your_org_code -paths Your_pathway_list Your_omics_data
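The rules for the pathway list in Option 3 (5-digit identifiers, comma separated, no other numbers) can be checked with a short script before running the setup step. This is a sketch under assumptions, not BONITA's own parser: `parse_pathway_list` is a hypothetical name, and tolerating non-digit text around an identifier is an assumption made here for illustration.

```python
import re


def parse_pathway_list(text):
    """Return the 5-digit pathway identifiers found in a comma-separated list."""
    identifiers = []
    for entry in text.split(","):
        entry = entry.strip()
        if not entry:
            continue  # ignore empty entries such as a trailing comma
        digit_runs = re.findall(r"\d+", entry)
        # Each entry must contain exactly one number, and it must have 5 digits.
        if len(digit_runs) != 1 or len(digit_runs[0]) != 5:
            raise ValueError(f"expected exactly one 5-digit identifier in {entry!r}")
        identifiers.append(digit_runs[0])
    return identifiers


# Hypothetical list contents, for illustration only.
print(parse_pathway_list("04010,04151, 04630\n"))  # prints ['04010', '04151', '04630']
```

A `ValueError` points at the first entry that breaks the rules, which is easier to act on than a failure partway through pathway preparation.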

Step 2: Rule inference

Simply run the script find_rules_pathway_analysis.sh, which automatically submits the appropriate jobs to the SLURM queue:

bash find_rules_pathway_analysis.sh

Step 3: Pathway Analysis

To accomplish this, the proper inputs must be provided to pathway_analysis_score_pathways.py. First, run the cleanup.sh script, which automatically moves the output of the rule inference step into the correct folders.

bash cleanup.sh

Then run the pathway analysis script:

python pathway_analysis_score_pathways.py Your_omics_data Your_condition_matrix Your_desired_contrasts -sep Separator_used_in_gmt_and_omics_data

If your files are tab separated, then the following command can be used:

python pathway_analysis_score_pathways.py -t Your_omics_data Your_condition_matrix Your_desired_contrasts

Owner
Thakar lab uses AI and systems biology approaches to identify immune signatures that can predict outcome of an immune response to infections or vaccinations.