onelearn: Online learning in Python

Overview

Build Status Documentation Status PyPI - Python Version PyPI - Wheel GitHub stars GitHub issues GitHub license Coverage Status

onelearn: Online learning in Python

Documentation | Reproduce experiments |

onelearn stands for ONE-shot LEARNning. It is a small python package for online learning with Python. It provides :

  • online (or one-shot) learning algorithms: each sample is processed once, only a single pass is performed on the data
  • including multi-class classification and regression algorithms
  • For now, only ensemble methods, namely Random Forests

Installation

The easiest way to install onelearn is using pip

pip install onelearn

But you can also use the latest development from github directly with

pip install git+https://github.com/onelearn/onelearn.git

References

@article{mourtada2019amf,
  title={AMF: Aggregated Mondrian Forests for Online Learning},
  author={Mourtada, Jaouad and Ga{\"\i}ffas, St{\'e}phane and Scornet, Erwan},
  journal={arXiv preprint arXiv:1906.10529},
  year={2019}
}
Comments
  • Unable to pickle AMFClassifier.

    Unable to pickle AMFClassifier.

    I would like to save the AMFClassifier, but am unable to pickle it. I have also tried to use dill or joblib, but they also don't seem to work.

    Is there maybe another way to somehow export the AMFClassifier in any way, such that I can save it and load it in another kernel?

    Below I added a snippet of code which reproduces the error. Note that only after the partial_fit method an error occurs when pickling. When the AMFClassifier has not been fit yet, pickling happens without problems, however, exporting an empty model is pretty useless.

    Any help or tips is much appreciated.

    from onelearn import AMFClassifier
    import dill as pickle
    from sklearn import datasets
    
    
    iris = datasets.load_iris()
    X = iris.data
    y = iris.target
    
    amf = AMFClassifier(n_classes=3)
    
    dump = pickle.dumps(amf)
    amf = pickle.loads(dump)
    
    amf.partial_fit(X,y)
    
    dump = pickle.dumps(amf)
    amf = pickle.loads(dump)
    
    opened by w-feijen 1
  • Move experiments of the paper in a experiments folder

    Move experiments of the paper in a experiments folder

    • Update the documentation
    • Explain that we must clone the repo

    Move also the short experiments to a examples folder and build a sphinx gallery with it

    enhancement 
    opened by stephanegaiffas 1
  • Add some extra tests

    Add some extra tests

    • Test that batch versus online training leads to the exact same forest
    • Test the behavior of reserve_samples, with several calls to partial_fit to check that memory is correctly allocated and
    tests 
    opened by stephanegaiffas 1
  • What if predict_proba receives a single sample

    What if predict_proba receives a single sample

    get_amf_decision_online amf.partial_fit(X_train[iteration - 1], y_train[iteration - 1]) File "/Users/stephanegaiffas/Code/onelearn/onelearn/forest.py", line 259, in partial_fit n_samples, n_features = X.shape

    opened by stephanegaiffas 1
  • Improve coverage

    Improve coverage

    A problem is that @jit functions don't work with coverage... a workaround is to disable using the NUMBA_DISABLE_JIT environment variable, but breaks the code that use @jitclass and .class_type.instance_type attributes

    enhancement bug fix 
    opened by stephanegaiffas 1
Releases(v0.3)
  • v0.3(Sep 29, 2021)

    This release adds the following improvements

    • AMFClassifier and AMFRegressor can be serialized to files (using internally pickle) using the save and load methods
    Source code(tar.gz)
    Source code(zip)
  • v0.2.0(Apr 6, 2020)

    This release adds the following improvements

    • SampleCollection pre-allocates more samples instead of the bare minimum for faster computation
    • The playground can be launched from the library
    • A documentation on readthedocs
    • Faster computations and a lot of code cleaning
    • Unittests for python 3.6-3.8
    Source code(tar.gz)
    Source code(zip)
Machine Learning Model to predict the payment date of an invoice when it gets created in the system.

Payment-Date-Prediction Machine Learning Model to predict the payment date of an invoice when it gets created in the system.

15 Sep 09, 2022
icepickle is to allow a safe way to serialize and deserialize linear scikit-learn models

icepickle It's a cooler way to store simple linear models. The goal of icepickle is to allow a safe way to serialize and deserialize linear scikit-lea

vincent d warmerdam 24 Dec 09, 2022
Temporal Alignment Prediction for Supervised Representation Learning and Few-Shot Sequence Classification

Temporal Alignment Prediction for Supervised Representation Learning and Few-Shot Sequence Classification Introduction. This package includes the pyth

5 Dec 06, 2022
TIANCHI Purchase Redemption Forecast Challenge

TIANCHI Purchase Redemption Forecast Challenge

Haorui HE 4 Aug 26, 2022
Bodywork deploys machine learning projects developed in Python, to Kubernetes.

Bodywork deploys machine learning projects developed in Python, to Kubernetes. It helps you to: serve models as microservices execute batch jobs run r

Bodywork Machine Learning 409 Jan 01, 2023
A high-performance topological machine learning toolbox in Python

giotto-tda is a high-performance topological machine learning toolbox in Python built on top of scikit-learn and is distributed under the G

giotto.ai 632 Dec 29, 2022
Warren - Stock Price Predictor

Web app to predict closing stock prices in real time using Facebook's Prophet time series algorithm with a multi-variate, single-step time series forecasting strategy.

Kumar Nityan Suman 153 Jan 03, 2023
Transpile trained scikit-learn estimators to C, Java, JavaScript and others.

sklearn-porter Transpile trained scikit-learn estimators to C, Java, JavaScript and others. It's recommended for limited embedded systems and critical

Darius Morawiec 1.2k Jan 05, 2023
GRaNDPapA: Generator of Rad Names from Decent Paper Acronyms

Generator of Rad Names from Decent Paper Acronyms

264 Nov 08, 2022
Bonsai: Gradient Boosted Trees + Bayesian Optimization

Bonsai is a wrapper for the XGBoost and Catboost model training pipelines that leverages Bayesian optimization for computationally efficient hyperparameter tuning.

24 Oct 27, 2022
A game theoretic approach to explain the output of any machine learning model.

SHAP (SHapley Additive exPlanations) is a game theoretic approach to explain the output of any machine learning model. It connects optimal credit allo

Scott Lundberg 18.2k Jan 02, 2023
Firebase + Cloudrun + Machine learning

A simple end to end consumer lending decision engine powered by Google Cloud Platform (firebase hosting and cloudrun)

Emmanuel Ogunwede 8 Aug 16, 2022
Sequence learning toolkit for Python

seqlearn seqlearn is a sequence classification toolkit for Python. It is designed to extend scikit-learn and offer as similar as possible an API. Comp

Lars 653 Dec 27, 2022
A pure-python implementation of the UpSet suite of visualisation methods by Lex, Gehlenborg et al.

pyUpSet A pure-python implementation of the UpSet suite of visualisation methods by Lex, Gehlenborg et al. Contents Purpose How to install How it work

288 Jan 04, 2023
ETNA – time series forecasting framework

ETNA Time Series Library Predict your time series the easiest way Homepage | Documentation | Tutorials | Contribution Guide | Release Notes ETNA is an

Tinkoff.AI 675 Jan 08, 2023
Class-imbalanced / Long-tailed ensemble learning in Python. Modular, flexible, and extensible

IMBENS: Class-imbalanced Ensemble Learning in Python Language: English | Chinese/中文 Links: Documentation | Gallery | PyPI | Changelog | Source | Downl

Zhining Liu 176 Jan 04, 2023
LiuAlgoTrader is a scalable, multi-process ML-ready framework for effective algorithmic trading

LiuAlgoTrader is a scalable, multi-process ML-ready framework for effective algorithmic trading. The framework simplify development, testing, deployment, analysis and training algo trading strategies

Amichay Oren 458 Dec 24, 2022
Factorization machines in python

Factorization Machines in Python This is a python implementation of Factorization Machines [1]. This uses stochastic gradient descent with adaptive re

Corey Lynch 892 Jan 03, 2023
A Software Framework for Neuromorphic Computing

A Software Framework for Neuromorphic Computing

Lava 338 Dec 26, 2022
A Powerful Serverless Analysis Toolkit That Takes Trial And Error Out of Machine Learning Projects

KXY: A Seemless API to 10x The Productivity of Machine Learning Engineers Documentation https://www.kxy.ai/reference/ Installation From PyPi: pip inst

KXY Technologies, Inc. 35 Jan 02, 2023