A simple and lightweight genetic algorithm for optimization of any machine learning model

Last update: Aug 10, 2022

Overview

geneticml

This package contains a simple and lightweight genetic algorithm for optimization of any machine learning model.

Installation

Use pip to install the package from PyPI:

pip install geneticml

Usage

This package provides an easy way to create estimators and optimize them with genetic algorithms. The example below describes in detail how to set up a simulation that uses an evolutionary approach to train a sklearn.neural_network.MLPClassifier. A full list of examples can be found in the project repository.

from geneticml.optimizers import GeneticOptimizer
from geneticml.strategy import EvolutionaryStrategy
from geneticml.algorithms import EstimatorBuilder
from metrics import metric_accuracy
from sklearn.neural_network import MLPClassifier
from sklearn.datasets import load_iris

# Creates a custom fit method
def fit(model, x, y):
    return model.fit(x, y)

# Creates a custom predict method
def predict(model, x):
    return model.predict(x)

if __name__ == "__main__":

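    seed = 11412

    # NOTE: `parameters` (the hyperparameter search space), `generations`, and
    # `population` are used below but are not defined in this excerpt; define
    # them before running the example.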

    # Creates an estimator
    estimator = EstimatorBuilder()\
        .of(model_type=MLPClassifier)\
        .fit_with(func=fit)\
        .predict_with(func=predict)\
        .build()

    # Defines a strategy for the optimization
    strategy = EvolutionaryStrategy(
        estimator_type=estimator,
        parameters=parameters,
        retain=0.4,
        random_select=0.1,
        mutate_chance=0.2,
        max_children=2,
        random_state=seed
    )

    # Creates the optimizer
    optimizer = GeneticOptimizer(strategy=strategy)

    # Loads the data
    data = load_iris()

    # Defines the metric
    metric = metric_accuracy
    greater_is_better = True

    # Create the simulation using the optimizer and the strategy
    models = optimizer.simulate(
        data=data.data, 
        target=data.target,
        generations=generations,
        population=population,
        evaluation_function=metric,
        greater_is_better=greater_is_better,
        verbose=True
    )

The estimator defines which algorithm or class is used to instantiate models:

estimator = EstimatorBuilder().of(model_type=MLPClassifier).fit_with(func=fit).predict_with(func=predict).build()

You need to specify custom fit and predict functions. These functions must use the same signatures as the ones below. This is required because the algorithm is generic and needs to be told how to fit a model and how to obtain predictions from it.

# Creates a custom fit method
def fit(model, x, y):
    return model.fit(x, y)

# Creates a custom predict method
def predict(model, x):
    return model.predict(x)
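
As an illustration of why the signatures matter, the sketch below applies the same fit(model, x, y) and predict(model, x) contract to a model whose native methods have different names. MajorityClassModel, train, and infer are hypothetical names invented for this example; they are not part of geneticml.

```python
from collections import Counter


class MajorityClassModel:
    """Hypothetical toy model whose methods are not named fit/predict."""

    def __init__(self):
        self.majority_ = None

    def train(self, features, labels):
        # Remember the most frequent label seen during training.
        self.majority_ = Counter(labels).most_common(1)[0][0]
        return self

    def infer(self, features):
        # Predict the remembered label for every input row.
        return [self.majority_ for _ in features]


# Adapters with the signatures shown above: fit(model, x, y) and predict(model, x).
def fit(model, x, y):
    return model.train(x, y)


def predict(model, x):
    return model.infer(x)


model = fit(MajorityClassModel(), [[0], [1], [2]], ["a", "b", "b"])
predictions = predict(model, [[3], [4]])
```

Because the adapters hide the model's own method names, any class can in principle be plugged in as long as the two functions follow this shape.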

Custom strategy

You can create custom strategies for the optimizers by extending geneticml.strategy.BaseStrategy and implementing the execute(...) method.

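from typing import List, Type, TypeVar
from geneticml.strategy import BaseStrategy
from geneticml.algorithms import BaseEstimator  # assumed location of BaseEstimator
T = TypeVar("T")  # assumed stand-in for the package's estimator type variable

class MyCustomStrategy(BaseStrategy):
    def __init__(self, estimator_type: Type[BaseEstimator]) -> None:
        super().__init__(estimator_type)

    def execute(self, population: List[Type[T]]) -> List[T]:
        # Identity strategy: returns the population unchanged
        return population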

Custom strategies let you create optimization strategies that fit your goals. The package currently ships with the evolutionary strategy, but you can define your own :)
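
To make the idea of a strategy more concrete, here is a minimal, self-contained sketch of one generation of a typical evolutionary step. It reuses the parameter names retain, random_select, and mutate_chance from the usage example, but their meaning here is an assumption based on common genetic algorithm practice, not a description of the package's internals. The evolve function, the dictionary representation of an individual, and the choices search space are all hypothetical, and max_children is left out for brevity.

```python
import random


def evolve(scored, choices, retain=0.4, random_select=0.1,
           mutate_chance=0.2, rng=None):
    """Return the next generation from (score, params) pairs (higher is better)."""
    rng = rng or random.Random()
    # Rank on the score only, so the parameter dicts are never compared.
    ordered = sorted(scored, key=lambda pair: pair[0], reverse=True)
    ranked = [params for _, params in ordered]

    # Keep the best `retain` fraction as parents (at least two, for breeding).
    keep = max(2, int(len(ranked) * retain))
    parents = ranked[:keep]

    # Occasionally keep a weaker individual to preserve diversity.
    for individual in ranked[keep:]:
        if rng.random() < random_select:
            parents.append(individual)

    # Refill the population with children bred from two distinct parents.
    children = []
    while len(parents) + len(children) < len(ranked):
        mother, father = rng.sample(parents, 2)
        child = {key: rng.choice([mother[key], father[key]]) for key in choices}
        # Mutation: resample one hyperparameter from its allowed values.
        if rng.random() < mutate_chance:
            key = rng.choice(list(choices))
            child[key] = rng.choice(choices[key])
        children.append(child)
    return parents + children


# Hypothetical search space and scored population, for illustration only.
choices = {"alpha": [0.0001, 0.001, 0.01], "activation": ["relu", "tanh"]}
scored = [
    (0.91, {"alpha": 0.001, "activation": "relu"}),
    (0.85, {"alpha": 0.01, "activation": "tanh"}),
    (0.80, {"alpha": 0.0001, "activation": "relu"}),
    (0.62, {"alpha": 0.01, "activation": "relu"}),
    (0.55, {"alpha": 0.0001, "activation": "tanh"}),
]
next_generation = evolve(scored, choices, rng=random.Random(11412))
```

The population size stays constant across generations and the best individual always survives, which is one common design choice; a custom strategy is free to do otherwise.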

Custom optimizer

You can create custom optimizers by extending geneticml.optimizers.BaseOptimizer and implementing the simulate(...) method.

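from typing import List, Type, TypeVar

from geneticml.optimizers import BaseOptimizer
from geneticml.strategy import BaseStrategy

T = TypeVar("T")  # assumed stand-in for the package's estimator type variable

class MyCustomOptimizer(BaseOptimizer):
    def __init__(self, strategy: Type[BaseStrategy]) -> None:
        super().__init__(strategy)

    def simulate(self, data, target, verbose: bool = True) -> List[T]:
        """
        Generate a network with the genetic algorithm.

        Parameters:
            data (?): The data used to train the algorithm
            target (?): The targets used to train the algorithm
            verbose (bool): True to print progress, False otherwise

        Returns:
            (List[BaseEstimator]): A list with the final population sorted by their loss
        """
        estimators = self._strategy.create_population()
        for x in estimators:
            x.fit(data, target)
            # Predict from the features, not from the targets
            y_pred = x.predict(data)
        # Scoring and sorting by loss are left to the implementer
        return estimators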

Custom optimizers let you define how the algorithm optimizes with the selected strategy. You can also combine custom strategies and optimizers to achieve your desired objective.
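
The skeleton above stops short of scoring the population. The following standalone sketch shows one way the missing step could look: fit every candidate, score its predictions, and return the candidates best-first. It does not use geneticml; ConstantModel and accuracy are hypothetical stand-ins, and the (y_true, y_pred) argument order of the evaluation function is an assumption.

```python
def simulate(estimators, fit, predict, data, target,
             evaluation_function, greater_is_better=True):
    """Fit and score every candidate, then return them best-first."""
    scored = []
    for estimator in estimators:
        fit(estimator, data, target)
        y_pred = predict(estimator, data)  # predictions come from the features
        scored.append((evaluation_function(target, y_pred), estimator))
    # Sort on the score only, so estimators never need to be comparable.
    scored.sort(key=lambda pair: pair[0], reverse=greater_is_better)
    return [estimator for _, estimator in scored]


class ConstantModel:
    """Hypothetical stand-in model that always predicts one label."""

    def __init__(self, label):
        self.label = label

    def fit(self, x, y):
        return self

    def predict(self, x):
        return [self.label for _ in x]


def accuracy(y_true, y_pred):
    # Fraction of predictions that match the true labels.
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


ranked = simulate(
    estimators=[ConstantModel(0), ConstantModel(1)],
    fit=lambda model, x, y: model.fit(x, y),
    predict=lambda model, x: model.predict(x),
    data=[[0.1], [0.2], [0.3]],
    target=[1, 1, 0],
    evaluation_function=accuracy,
)
```

Passing greater_is_better=False reverses the ordering, which suits loss-style metrics where smaller values are better.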

Testing

The steps below create a virtual environment in a folder named "venv" and install the requirements.

# Create virtualenv
python3 -m venv venv
# activate virtualenv
source venv/bin/activate
# update packages
pip install --upgrade pip setuptools wheel
# install requirements
python setup.py install

Tests can be run with python setup.py test when the virtualenv is active.

Contributing

All contributions, bug reports, bug fixes, documentation improvements, enhancements, and ideas are welcome.

A detailed overview on how to contribute can be found in the contributing guide. There is also an overview on GitHub.

If you are simply looking to start working with the geneticml codebase, navigate to the GitHub "issues" tab and start looking through interesting issues. Or maybe through using geneticml you have an idea of your own or are looking for something in the documentation and thinking ‘this can be improved’...you can do something about it!

Feel free to ask questions by emailing the contributors.

Changelog

1.0.3 - Included pytorch example

1.0.2 - Minor fixes on naming

1.0.1 - README fixes

1.0.0 - First release

Comments

feature/data_sampling

We added support for running your own data sampling (e.g., imblearn.SMOTE) and using the genetic algorithms to find the best set of parameters for it. You can also search for the best parameters of your machine learning model and, at the same time, for the minority class size that maximizes the model score.

Opened by albarsil

Releases (1.0.8)

1.0.8 (Mar 2, 2022)

Full Changelog: https://github.com/albarsil/geneticml/compare/1.0.7...1.0.8

1.0.7 (Mar 2, 2022)

1.0.6 (Feb 25, 2022)

Full Changelog: https://github.com/albarsil/geneticml/compare/1.0.5...1.0.6

1.0.5 (Feb 25, 2022)

What's Changed

feature/data_sampling by @albarsil in https://github.com/albarsil/geneticml/pull/5

New Contributors

@albarsil made their first contribution in https://github.com/albarsil/geneticml/pull/5

Full Changelog: https://github.com/albarsil/geneticml/compare/1.0.4...1.0.5

1.0.4 (Feb 18, 2022)

Full Changelog: https://github.com/albarsil/geneticml/commits/1.0.4

1.0.3 (Dec 7, 2021)

Included pytorch example

1.0.2 (Dec 7, 2021)

Full Changelog: https://github.com/albarsil/geneticml/compare/1.0.1...1.0.2

1.0.1 (Dec 7, 2021)

Full Changelog: https://github.com/albarsil/geneticml/commits/1.0.1

Owner

Allan Barcelos
