Clean and reusable data-sciency notebooks.

Last update: Jan 28, 2022

Related tags

Data Analysis kpacubo

Overview

KPACUBO

KPACUBO is a set Jupyter notebooks focused on the best practices in both software development and data science, namely,

code reuse,
explicit data gathering procedures,
efficient data management,
clear analysis validation criteria.

SETUP

Create kpacubo environment:

conda env create -f environment.yml

Add kpacubo environment to Jupyter:

python -m ipykernel install --user --name=kpacubo

ENVIRONMENT EXPORT

Windows: conda env export --no-builds | findstr -v "prefix" > environment.yml

Linux: conda env export --no-builds | grep -v "prefix" > environment.yml

Owner

Matvey Morozov

GitHub Repository

Implementation in Python of the reliability measures such as Omega.

OmegaPy Summary Simple implementation in Python of the reliability measures: Omega Total, Omega Hierarchical and Omega Hierarchical Total. Name Link O

2 Apr 27, 2022

CRISP: Critical Path Analysis of Microservice Traces

CRISP: Critical Path Analysis of Microservice Traces This repo contains code to compute and present critical path summary from Jaeger microservice tra

110 Jan 06, 2023

signac-flow - manage workflows with signac

signac-flow - manage workflows with signac The signac framework helps users manage and scale file-based workflows, facilitating data reuse, sharing, a

44 Oct 14, 2022

Extract Thailand COVID-19 Cluster data from daily briefing pdf.

Thailand COVID-19 Cluster Data Extraction About Extract Clusters from Thailand Daily COVID-19 briefing PDF Download latest data Here. Data will be upd

5 Sep 27, 2021

Detecting Underwater Objects (DUO)

Underwater object detection for robot picking has attracted a lot of interest. However, it is still an unsolved problem due to several challenges. We take steps towards making it more realistic by ad

27 Dec 12, 2022

apricot implements submodular optimization for the purpose of selecting subsets of massive data sets to train machine learning models quickly.

Please consider citing the manuscript if you use apricot in your academic work! You can find more thorough documentation here. apricot implements subm

457 Dec 20, 2022

MS in Data Science capstone project. Studying attacks on autonomous vehicles.

Surveying Attack Models for CAVs Guide to Installing CARLA and Collecting Data Our project focuses on surveying attack models for Connveced Autonomous

1 Dec 09, 2021

This project is the implementation template for HW 0 and HW 1 for both the programming and non-programming tracks

4 Mar 06, 2022

Larch: Applications and Python Library for Data Analysis of X-ray Absorption Spectroscopy (XAS, XANES, XAFS, EXAFS), X-ray Fluorescence (XRF) Spectroscopy and Imaging

Larch: Data Analysis Tools for X-ray Spectroscopy and More Documentation: http://xraypy.github.io/xraylarch Code: http://github.com/xraypy/xraylarch L

95 Dec 13, 2022

Helper tools to construct probability distributions built from expert elicited data for use in monte carlo simulations.

Elicited Helper tools to construct probability distributions built from expert elicited data for use in monte carlo simulations. Credit to Brett Hoove

3 Nov 04, 2022

Uses MIT/MEDSL, New York Times, and US Census datasources to analyze per-county COVID-19 deaths.

Covid County Executive summary Setup Install miniconda, then in the command line, run conda create -n covid-county conda activate covid-county conda i

1 Dec 22, 2021

Cold Brew: Distilling Graph Node Representations with Incomplete or Missing Neighborhoods

Cold Brew: Distilling Graph Node Representations with Incomplete or Missing Neighborhoods Introduction Graph Neural Networks (GNNs) have demonstrated

37 Dec 15, 2022

Using Python to derive insights on particular Pokemon, Types, Generations, and Stats

Pokémon Analysis Andreas Nikolaidis February 2022 Introduction Exploratory Analysis Correlations & Descriptive Statistics Principal Component Analysis

1 Feb 18, 2022

MapReader: A computer vision pipeline for the semantic exploration of maps at scale

MapReader A computer vision pipeline for the semantic exploration of maps at scale MapReader is an end-to-end computer vision (CV) pipeline designed b

25 Dec 26, 2022

Python Package for DataHerb: create, search, and load datasets.

The Python Package for DataHerb A DataHerb Core Service to Create and Load Datasets.

4 Feb 11, 2022

Approximate Nearest Neighbor Search for Sparse Data in Python!

Approximate Nearest Neighbor Search for Sparse Data in Python! This library is well suited to finding nearest neighbors in sparse, high dimensional spaces (like text documents).

906 Jan 01, 2023

Create HTML profiling reports from pandas DataFrame objects

Pandas Profiling Documentation | Slack | Stack Overflow Generates profile reports from a pandas DataFrame. The pandas df.describe() function is great

10k Jan 01, 2023

Techdegree Data Analysis Project 2

Basketball Team Stats Tool In this project you will be writing a program that reads from the "constants" data (PLAYERS and TEAMS) in constants.py. Thi

2 Oct 23, 2021

Hatchet is a Python-based library that allows Pandas dataframes to be indexed by structured tree and graph data.

Hatchet Hatchet is a Python-based library that allows Pandas dataframes to be indexed by structured tree and graph data. It is intended for analyzing

14 Aug 19, 2022

Lale is a Python library for semi-automated data science.

Lale is a Python library for semi-automated data science. Lale makes it easy to automatically select algorithms and tune hyperparameters of pipelines that are compatible with scikit-learn, in a type-

293 Dec 29, 2022