Scikit-Learn useful pre-defined Pipelines Hub

Last update: Apr 26, 2022

Overview

Scikit-Pipes

Scikit-Learn useful pre-defined Pipelines Hub

Usage:

Install scikit-pipes

It's advised to install scikit-pipes in a virtual environment. Inside the environment, run:

pip install scikit-pipes

Example: Simple Preprocessing

import pandas as pd
import numpy as np
from skpipes.pipeline import SkPipeline

data = [{"x1": 1, "x2": 400, "x3": np.nan},
        {"x1": 4.8, "x2": 250, "x3": 50},
        {"x1": 3, "x2": 140, "x3": 43},
        {"x1": 1.4, "x2": 357, "x3": 75},
        {"x1": 2.4, "x2": np.nan, "x3": 42},
        {"x1": 4, "x2": 287, "x3": 21}]

df = pd.DataFrame(data)

pipe = SkPipeline(name='imputer_median-minmax',
                  data_type="numerical")
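# Inspect the steps of the pre-defined pipeline
print(pipe.steps)
print(pipe)

# Fit and transform separately, or in a single call
pipe.fit(df)
pipe.transform(df)
pipe.fit_transform(df)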
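
For comparison, the name 'imputer_median-minmax' suggests median imputation followed by min-max scaling. The sketch below builds such a pipeline by hand with plain scikit-learn. It is only an assumption about what the pre-defined pipeline contains, not a description of the scikit-pipes internals; the step names "imputer" and "scaler" and the variable names are illustrative.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Same toy data as in the example above
data = [{"x1": 1, "x2": 400, "x3": np.nan},
        {"x1": 4.8, "x2": 250, "x3": 50},
        {"x1": 3, "x2": 140, "x3": 43},
        {"x1": 1.4, "x2": 357, "x3": 75},
        {"x1": 2.4, "x2": np.nan, "x3": 42},
        {"x1": 4, "x2": 287, "x3": 21}]
df = pd.DataFrame(data)

# Assumed equivalent: fill missing values with the column median,
# then rescale every column to the [0, 1] range
manual_pipe = Pipeline(steps=[
    ("imputer", SimpleImputer(strategy="median")),
    ("scaler", MinMaxScaler()),
])

transformed = manual_pipe.fit_transform(df)
print(transformed.shape)
```

If the assumption holds, a pre-defined pipeline saves writing out these steps for each project; compare pipe.steps against the steps above to confirm what the named pipeline actually does.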

Changelog

See the changelog for notes on the changes of scikit-pipes

Important links

Official source code repo: https://github.com/rodrigo-arenas/scikit-pipes/
Download releases: https://pypi.org/project/scikit-pipes/
Issue tracker: https://github.com/rodrigo-arenas/scikit-pipes/issues
Stable documentation: https://scikit-pipes.readthedocs.io/en/stable/

Source code

You can check out the latest development version with the command:

git clone https://github.com/rodrigo-arenas/scikit-pipes.git

Install the development dependencies:

pip install -r dev-requirements.txt

Check the latest in-development documentation: https://scikit-pipes.readthedocs.io/en/latest/

Testing

After installation, you can launch the test suite from outside the source directory:

pytest skpipes

Owner

Rodrigo Arenas
