This repository contains FEDOT - an open-source framework for automated modeling and machine learning (AutoML)

Overview

This repository contains FEDOT - an open-source framework for automated modeling and machine learning (AutoML). It can build custom modeling pipelines for different real-world processes in an automated way using an evolutionary approach. FEDOT supports classification (binary and multiclass), regression, clustering, and time series prediction tasks.

The structure of the modeling pipeline that can be optimised by FEDOT

The main feature of the framework is the complex management of interactions between various blocks of pipelines. First of all, this includes the stage of machine learning model design. FEDOT allows you to not just choose the best type of the model, but to create a complex (composite) model. It allows you to combine several models of different complexity, which helps you to achieve better modeling quality than when using any of these models separately. Within the framework, we describe composite models in the form of a graph defining the connections between data preprocessing blocks and model blocks.

The framework is not limited to specific AutoML tasks (such as pre-processing of input data, feature selection, or optimization of model hyperparameters), but allows you to solve a more general structural learning problem - for a given data set, a solution is built in the form of a graph (DAG), the nodes of which are represented by ML models, pre-processing procedures, and data transformation.
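
To make the graph view concrete, here is a minimal, framework-independent sketch of a composite pipeline as a DAG. It does not use FEDOT's API: the `Node` class, the `evaluate` function, and the toy operations are hypothetical names chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Node:
    """One block of a toy pipeline: an operation plus the nodes that feed it."""
    name: str
    operation: Callable[[List[List[float]]], List[float]]
    parents: List["Node"] = field(default_factory=list)


def evaluate(node: Node, data: List[float],
             cache: Optional[Dict[str, List[float]]] = None) -> List[float]:
    """Evaluate parents first; nodes without parents receive the raw data."""
    if cache is None:
        cache = {}
    if node.name in cache:
        return cache[node.name]
    if node.parents:
        inputs = [evaluate(parent, data, cache) for parent in node.parents]
    else:
        inputs = [data]
    cache[node.name] = node.operation(inputs)
    return cache[node.name]


# Toy operations: each takes a list of input columns and returns one column.
def scale(inputs):
    column = inputs[0]
    peak = max(abs(x) for x in column) or 1.0
    return [x / peak for x in column]


def double(inputs):
    return [2 * x for x in inputs[0]]


def shift(inputs):
    return [x + 1 for x in inputs[0]]


def average(inputs):
    return [sum(values) / len(values) for values in zip(*inputs)]


# A preprocessing block feeds two "models", whose outputs are combined.
scaling = Node("scaling", scale)
model_a = Node("model_a", double, parents=[scaling])
model_b = Node("model_b", shift, parents=[scaling])
ensemble = Node("ensemble", average, parents=[model_a, model_b])

result = evaluate(ensemble, [1.0, 2.0, 4.0])
```

The shared `cache` ensures that the `scaling` node is computed once even though two nodes depend on it, which is the property that distinguishes a DAG from a plain tree of models.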

The project is maintained by the research team of the Natural Systems Simulation Lab, which is a part of the National Center for Cognitive Research of ITMO University.

The intro video about Fedot is available here:

Introducing Fedot

FEDOT features

The main features of the framework are as follows:

  • The FEDOT architecture is highly flexible and therefore the framework can be used to automate the creation of mathematical models for various problems, types of data, and models;
  • FEDOT already supports popular ML libraries (scikit-learn, keras, statsmodels, etc.), but you can also integrate custom tools into the framework if necessary;
  • Pipeline optimization algorithms are not tied to specific data types or tasks, but you can use special templates for a specific task class or data type (time series forecasting, NLP, tabular data, etc.) to increase efficiency;
  • The framework is not limited to machine learning: it is possible to embed domain-specific models into pipelines (for example, ODE- or PDE-based models);
  • Additional methods for hyperparameter tuning can be seamlessly integrated into FEDOT (in addition to those already supported);
  • The resulting pipelines can be exported in a human-readable JSON format, which allows you to achieve reproducibility of the experiments.
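
As an illustration of why a JSON export helps reproducibility, the sketch below round-trips a pipeline description through JSON. The schema (`nodes`, `operation`, `params`, `nodes_from`) and the helper names are assumptions made for this example; they are not FEDOT's actual export format.

```python
import json

# A hypothetical pipeline description: each node lists its operation,
# its hyperparameters, and the ids of the nodes it takes input from.
pipeline = {
    "nodes": [
        {"id": 0, "operation": "scaling", "params": {}, "nodes_from": []},
        {"id": 1, "operation": "ridge", "params": {"alpha": 0.5}, "nodes_from": [0]},
        {"id": 2, "operation": "tree", "params": {"max_depth": 3}, "nodes_from": [0]},
        {"id": 3, "operation": "linear", "params": {}, "nodes_from": [1, 2]},
    ]
}


def dump_pipeline(description: dict) -> str:
    """Serialize to human-readable JSON with a stable key order."""
    return json.dumps(description, indent=2, sort_keys=True)


def load_pipeline(text: str) -> dict:
    """Restore the description from its JSON text."""
    return json.loads(text)


exported = dump_pipeline(pipeline)
restored = load_pipeline(exported)
```

Because the structure and hyperparameters survive the round trip unchanged, the same text file can be stored next to experiment results and diffed between runs.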

Thus, compared to other frameworks, FEDOT:

  • Is not limited to specific modeling tasks and claims versatility and expandability;
  • Allows managing the complexity of models and thereby achieving better results;
  • Allows building models that use input data of various kinds (texts, images, tables, etc.) and consist of different types of models.

Installation

Common installation:

$ pip install fedot

In order to work with FEDOT source code:

$ git clone https://github.com/nccr-itmo/FEDOT.git
$ cd FEDOT
$ pip install -r requirements.txt
$ pytest -s test

How to use

FEDOT provides a high-level API that makes its capabilities easy to use. At the moment, the API supports classification and regression tasks only; support for time series forecasting and clustering will be implemented soon (you can still solve these tasks via advanced initialization, see below). Input data can be passed as NumPy arrays or CSV files.

To use the API, follow these steps:

  1. Import the Fedot class:
from fedot.api.main import Fedot
  2. Initialize the Fedot object and define the type of modeling problem. The object provides a fit/predict interface:
  • fedot.fit runs the optimization and returns the resulting composite model;
  • fedot.predict returns the prediction for the given input data;
  • fedot.get_metrics estimates the quality of predictions using selected metrics.

NumPy arrays, pandas data frames, and file paths can be used as sources of input data.

model = Fedot(problem='classification')

model.fit(features=train_data.features, target=train_data.target)
prediction = model.predict(features=test_data.features)

metrics = model.get_metrics()

For more advanced approaches, please see the Examples & Tutorials section.

Examples & Tutorials

Jupyter notebooks with tutorials are located in the examples repository. There you can find the following guides:

Notebooks are issued with the corresponding release versions (the default version is 'latest').

Also, external examples are available:

Extended examples:

Also, several video tutorials are available (in Russian).

Publications about FEDOT

We also published several posts and news devoted to the different aspects of the framework:

In English:

In Russian:

  • General concepts of evolutionary design for composite pipelines - habr.com
  • Automated time series forecasting with FEDOT - habr.com
  • Details of FEDOT-based solution for Emergency DataHack - habr.com

Project structure

The latest stable release of FEDOT is on the master branch.

The repository includes the following directories:

  • The core package contains the main classes and scripts; it is the core of the FEDOT framework
  • The examples package includes several use cases that are a good starting point for discovering how FEDOT works
  • All unit and integration tests are located in the test directory
  • The sources of the documentation are in the docs directory

You can also check the benchmarking repository, which was developed to compare FEDOT against some well-known AutoML frameworks.

Current R&D and future plans

Currently, we are working on new features and trying to improve the performance and the user experience of FEDOT. The major ongoing tasks and plans:

  • Effective and ready-to-use pipeline templates for certain tasks and data types;
  • Integration with GPU via Rapids framework;
  • Alternative optimization methods of fixed-shaped pipelines;
  • Integration with MLFlow for import and export of the pipelines;
  • Improvement of high-level API.

Also, we are doing several research tasks related to AutoML time-series benchmarking and multi-modal modeling.

Any contribution is welcome. Our R&D team is open for cooperation with other scientific teams as well as with industrial partners.

Documentation

The general description is available in the FEDOT.Docs repository.

A detailed FEDOT API description is also available on Read the Docs.

Contribution Guide

  • The contribution guide is available in the repository.

Acknowledgments

We acknowledge the contributors for their important impact and the participants of the numerous scientific conferences and workshops for their valuable advice and suggestions.

Side projects

  • A prototype of the web GUI for FEDOT is available in the FEDOT.WEB repository.

Contacts

Supported by

Citation

@article{nikitin2021automated,
  title = {Automated evolutionary approach for the design of composite machine learning pipelines},
  author = {Nikolay O. Nikitin and Pavel Vychuzhanin and Mikhail Sarafanov and Iana S. Polonskaia and Ilia Revin and Irina V. Barabanova and Gleb Maximov and Anna V. Kalyuzhnaya and Alexander Boukhanovsky},
  journal = {Future Generation Computer Systems},
  year = {2021},
  issn = {0167-739X},
  doi = {https://doi.org/10.1016/j.future.2021.08.022}
}
@inproceedings{polonskaia2021multi,
  title = {Multi-Objective Evolutionary Design of Composite Data-Driven Models},
  author = {Polonskaia, Iana S. and Nikitin, Nikolay O. and Revin, Ilia and Vychuzhanin, Pavel and Kalyuzhnaya, Anna V.},
  booktitle = {2021 IEEE Congress on Evolutionary Computation (CEC)},
  year = {2021},
  pages = {926-933},
  doi = {10.1109/CEC45853.2021.9504773}
}

Other papers - in ResearchGate.

Comments
  • Visualization of the operations used in pipelines

    Featuring visualization of operations used in evolutionary process. Changes:

    • Added operation_kde plot to show operations by generations.
    • Added operation_animated_bar to show operations by generations with (or without) changing fitness.
    • Modified fitness_box plot. Now it visually fits the mentioned visuals and supports pct_best parameter.

    Examples of visuals (images not shown): the KDE plot, the animated bar plot with fitness, the same bar plot with fitness explicitly hidden, and the modified fitness box plot.

    opened by MorrisNein 14
  • Preprocessing refactor

    Changed the core architecture. Preprocessing operations now can be placed in separate nodes. Important changes:

    1. The Model abstraction is now replaced with an Operation that has two descendant classes: DataOperation class and Model class;
    2. Preprocessing operations, both simple (such as scaling) and advanced (such as the lagged transformation for time series), can now be used in nodes just as models could be before;
    3. New tuning (optimization of hyperparameters in nodes) is implemented. There are two classes, ChainTuner and SequentialTuner, both of which use the hyperopt library;
    4. A new mutation operator was implemented. It is now possible to change hyperparameters in the nodes during chain composing;
    5. It is now possible to feed different data sources to primary nodes. In particular, this functionality is used with exogenous time series; an example is available in examples;
    6. A new low-level abstraction, "Implementation", has appeared in the core. This abstraction includes custom models that are implemented in FEDOT;
    7. New operations have been implemented and added to the data_operations repository, such as feature selection, exclusion of anomalous values, smoothing of time series, and much more.
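
    The lagged transformation mentioned in item 2 can be sketched as follows. This is a generic sliding-window illustration, not FEDOT's implementation; the function name `lagged_table` and its parameters are assumptions made for this example.

```python
from typing import List, Tuple


def lagged_table(series: List[float], window_size: int, forecast_length: int
                 ) -> Tuple[List[List[float]], List[List[float]]]:
    """Slide a window over the series.

    Each feature row holds `window_size` consecutive past values, and the
    matching target row holds the next `forecast_length` values.
    """
    features, targets = [], []
    last_start = len(series) - window_size - forecast_length
    for start in range(last_start + 1):
        split = start + window_size
        features.append(series[start:split])
        targets.append(series[split:split + forecast_length])
    return features, targets


features, targets = lagged_table([1, 2, 3, 4, 5, 6],
                                 window_size=3, forecast_length=2)
# features: [[1, 2, 3], [2, 3, 4]]
# targets:  [[4, 5], [5, 6]]
```

    After this transformation the time series becomes an ordinary table, so any tabular regression model can be placed in the next node.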
    opened by Dreamlone 13
  • + pipeline node operations cache support

    Example showing the meaning of using an operations cache (haven't tested with multiprocessing yet): examples/advanced/pipelines_caching.py

    If you have no time to wait for the results (up to 8 minutes by default), the result for example_number=1, timeout=1., n_partitions=10 is attached as an image (not shown here).

    Usages of the cache were added to:
    • fedot/api/main.py
    • fedot/core/composer/gp_composer/gp_composer.py
    • fedot/core/validation/compose/metric_estimation.py
    • fedot/api/api_utils/api_composer.py

    Class responsible for the caching: fedot/core/composer/cache.py
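
    The general idea of an operations cache can be sketched as below. This is a toy illustration under stated assumptions, not the code from fedot/core/composer/cache.py: the `OperationsCache` class, its key scheme, and `fit_ridge` are hypothetical.

```python
import hashlib
import json
from typing import Any, Callable, Dict, Tuple


class OperationsCache:
    """Toy cache of fitted operations, keyed by a hash of the node description."""

    def __init__(self) -> None:
        self._storage: Dict[str, Any] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def key(operation: str, params: dict,
            parent_keys: Tuple[str, ...] = ()) -> str:
        # The key covers the node and everything upstream of it, so two nodes
        # share a fitted operation only if their whole subtrees match.
        description = json.dumps([operation, params, list(parent_keys)],
                                 sort_keys=True)
        return hashlib.sha256(description.encode("utf-8")).hexdigest()

    def get_or_fit(self, key: str, fit: Callable[[], Any]) -> Any:
        """Return the cached result, calling `fit` only on a cache miss."""
        if key in self._storage:
            self.hits += 1
        else:
            self.misses += 1
            self._storage[key] = fit()
        return self._storage[key]


cache = OperationsCache()
scaling_key = cache.key("scaling", {})
ridge_key = cache.key("ridge", {"alpha": 0.5}, parent_keys=(scaling_key,))

fit_calls = []


def fit_ridge():
    fit_calls.append("ridge")  # stands in for an expensive training step
    return {"coef": [0.1, 0.2]}


first = cache.get_or_fit(ridge_key, fit_ridge)
second = cache.get_or_fit(ridge_key, fit_ridge)  # served from the cache
```

    During evolutionary search many candidate pipelines share identical subtrees, so reusing fitted operations for matching keys avoids repeated training.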


    Other changes are mostly readability/style/performance fixes, plus full logger support.

    opened by IIaKyJIuH 12
  • Feature/pipeline explanations

    What's new:

    1. An inner repo fedot/explainability for the corresponding experiments.
    2. Explainer abstract class, implementing an interface.
    3. SurrogateExplainer class for building surrogate explanation models. The only supported surrogate at the moment is the decision tree (for both classification and regression tasks).
    4. explain method of the Fedot class for creating explanations -- instances of Explainer successors.
    enhancement cases test 
    opened by MorrisNein 12
  • NLP init

    • New method InputData.from_text(), to which you can pass meta_file.csv containing text, or a path to directories with text files
    • New TextData class, where all the NLP utils are located. It is not intended to be used directly.

    Current idea is: text files -> feature extraction (make table data, not text) -> pass to model/chain

    • [x] Finish BatchLoader for creation of meta_file.csv for collections of data (images, text)

    • [x] Finish the text files -> meta_file.csv

    • [x] Add tests

    • [x] add packed data && unpacking script

    opened by BarabanovaIrina 11
  • enabled `logging_level` option

    1. Fixed Log initialization uniqueness (in collaboration with @maypink)
    2. Got rid of the useless logging_level_opt option (@maypink)
    3. Enabled usage with multiprocessing: separate file writes, log level preserved
    4. Added show_progress option to tuner
    5. Made a pytest fixture that "resets" singletons before each test to preserve their pattern
    6. Minor fixes: caching, sorting, docstrings, logic fixes
    opened by IIaKyJIuH 10
  • 735-improving-fedot-documentation (structure)

    Partial improvement.

    1. Changed structure of the documentation according to this document
    2. Added docstrings to classes/functions/props and improved existing ones
    3. Created copy_doc decorator to copy docstrings for logically the same functions
    4. Improved code blocks, typings, variables names
    opened by IIaKyJIuH 10
  • Encoding bug

    • add import/export for "one_hot_encoding" operation
    • check whether categories in test data are contained in train data
    • fix issues with categorical expansion
    • fix testing time

    Right now the preprocessing pipeline is:

    • convert values to one type in columns
    • fill missing values
    • one hot encoding for categorical

    Closed issues: #400 #399 #412
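
    The train/test category check can be illustrated with a toy one-hot encoder. This is a sketch, not FEDOT's "one_hot_encoding" operation: the class name, the `unseen` helper, and the policy of encoding unseen categories as all-zero rows are assumptions made for this example.

```python
from typing import Dict, List


class OneHotEncoder:
    """Toy one-hot encoder for a single categorical column."""

    def fit(self, column: List[str]) -> "OneHotEncoder":
        # Remember the training categories in a stable (sorted) order.
        self.categories: List[str] = sorted(set(column))
        self._index: Dict[str, int] = {
            category: position
            for position, category in enumerate(self.categories)
        }
        return self

    def unseen(self, column: List[str]) -> List[str]:
        """Categories present in `column` but absent from the training data."""
        return sorted(set(column) - set(self.categories))

    def transform(self, column: List[str]) -> List[List[int]]:
        rows = []
        for value in column:
            row = [0] * len(self.categories)
            if value in self._index:  # unseen values stay all-zero
                row[self._index[value]] = 1
            rows.append(row)
        return rows


encoder = OneHotEncoder().fit(["red", "green", "red"])
encoded = encoder.transform(["green", "blue"])
missing = encoder.unseen(["green", "blue"])
```

    Keeping the output width fixed to the training categories means the downstream model always receives the same number of columns, whatever appears in the test data.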

    opened by MAGLeb 10
  • Sensitivity

    Approaches to the structural analysis of a composite model are implemented. At the moment, a Node can be:

    • Simply deleted while preserving its subtree,
    • Tuned,
    • Replaced with nodes:
      • custom ones (pass a list of models directly)
      • random ones (pass the number of nodes to generate)
      • otherwise, all models available for the task are applied.
    • Analyzed for the sensitivity of model hyperparameters using Sobol indices

    This can be done:

    • via class NodeAnalysis, which can analyze one node with several approaches.
    • via class ChainStructureAnalysis, which can analyze several nodes with several approaches

    fedot.utilities.define_metric_by_task contains MetricByTask. If no metric is specified, the default metric for the Task defined inside InputData is used.

    Class diagram: screenshot (not shown here)

    Mini-tutorial: https://fedot.readthedocs.io/en/latest/fedot/features/sensitivity_analysis.html

    UPD: as part of this PR, the collection of code coverage information was disabled in the manual_build action.

    opened by BarabanovaIrina 10
  • Remote evaluation feature implemented for pipelines in composer

    What's new:

    'Remote' module:

    1. run_pipeline.py script is added. It receives the pipeline and data description as input and returns the fitted pipeline. This script is aimed to be called at the remote node
    2. ComputationalSetup class added to implement the remote fit of pipelines
    3. Batch evaluation of the pipelines added to the composer if ComputationalSetup is initialized.
    4. Logic for REST requests to a remote server with computational resources.

    Other:

    1. Indices corrected for Data. Now, the indices are preserved during transformations rather than created from scratch.
    2. New notebook added for industrial case
    3. Minor fixes in forecasting

    Known disadvantages:

    1. Now, the ComputationalSetup is DataMall-specific. The refactoring to support different strategies is expected.
    2. The client should be moved to an external repository and installed from PyPI.
    enhancement in progress 
    opened by nicl-nno 9
  • Very strict dependency requirements

    I'm very excited to try out this package, but its strict dependency requirements are giving me headaches when setting up my conda environment.

    ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
    fedot 0.3.1 requires scikit-learn==0.24.1, but you have scikit-learn 0.24.2 which is incompatible.
    fedot 0.3.1 requires scikit-optimize==0.7.4, but you have scikit-optimize 0.9.dev0 which is incompatible.
    fedot 0.3.1 requires xgboost==1.0.1, but you have xgboost 1.4.2 which is incompatible.
    

    Is it possible to relax the requirements?

    dependencies 
    opened by bacalfa 9
  • Metric evaluation error: y_true and y_pred contain different number of classes

    The error "Metric evaluation error: y_true and y_pred contain different number of classes 45, 85" is raised for the click_prediction_small dataset.

    Looks like class stratification is failing.

    bug 
    opened by nicl-nno 2
  • Caching performance is worse than the one from the earlier versions

    As of FEDOT v0.6.0, the performance of the cache (both for operations and for preprocessors) has regressed significantly compared to previous versions. This can be seen from the quality metrics on classification datasets: they are better when the cache is turned off (see the attached image, not shown here).

    It is important to speed up the cacher so that using it is worthwhile (it is enabled by default).

    opened by IIaKyJIuH 1
  • Bug with filter operation

    https://colab.research.google.com/drive/1MRgwB9sCrFeXRme0eGSZ0KeBRAMlnfOk?usp=sharing data: https://drive.google.com/file/d/1Msx8sUVm6jh6L-QH20aJrw47DvvKk-HM/view?usp=sharing

    Exception in run_regression_example(visualise)
         22 **composer_params)
         23
    ---> 24 auto_model.fit(features=train, target='target')
         25 prediction = auto_model.predict(features=test)
         26 if visualise:

    /usr/local/lib/python3.8/dist-packages/fedot/api/main.py in fit(self, features, target, predefined_model)
        181 # Final fit for obtained pipeline on full dataset
        182 if self.history and not self.history.is_empty() or not self.current_pipeline.is_fitted:
    --> 183 self._train_pipeline_on_full_dataset(recommendations, full_train_not_preprocessed)
        184 self.params.api_params['logger'].message('Final pipeline was fitted')
        185 else:

    /usr/local/lib/python3.8/dist-packages/fedot/api/main.py in _train_pipeline_on_full_dataset(self, recommendations, full_train_not_preprocessed)
        456 {k: v for k, v in recommendations.items()
        457 if k != 'cut'})
    --> 458 self.current_pipeline.fit(
        459 full_train_not_preprocessed,
        460 n_jobs=self.params.api_params['n_jobs'],

    /usr/local/lib/python3.8/dist-packages/fedot/core/pipelines/pipeline.py in fit(self, input_data, time_constraint, n_jobs)
        139
        140 if time_constraint is None:
    --> 141 train_predicted = self._fit(input_data=copied_input_data)
        142 else:
        143 train_predicted = self._fit_with_time_limit(input_data=copied_input_data, time=time_constraint)

    /usr/local/lib/python3.8/dist-packages/fedot/core/pipelines/pipeline.py in _fit(self, input_data, process_state_dict, fitted_operations)
        102 with Timer() as t:
        103 computation_time_update = not self.root_node.fitted_operation or self.computation_time is None
    --> 104 train_predicted = self.root_node.fit(input_data=input_data)
        105 if computation_time_update:
        106 self.computation_time = round(t.minutes_from_start, 3)

    /usr/local/lib/python3.8/dist-packages/fedot/core/pipelines/node.py in fit(self, input_data, **kwargs)
        389 self.log.debug(f'Trying to fit secondary node with operation: {self.operation}')
        390
    --> 391 secondary_input = self._input_from_parents(input_data=input_data, parent_operation='fit')
        392
        393 return super().fit(input_data=secondary_input)

    /usr/local/lib/python3.8/dist-packages/fedot/core/pipelines/node.py in _input_from_parents(self, input_data, parent_operation)
        429 parent_nodes = self._nodes_from_with_fixed_order()
        430
    --> 431 parent_results, _ = _combine_parents(parent_nodes, input_data,
        432 parent_operation)
        433

    /usr/local/lib/python3.8/dist-packages/fedot/core/pipelines/node.py in _combine_parents(parent_nodes, input_data, parent_operation)
        471 parent_results.append(prediction)
        472 elif parent_operation == 'fit':
    --> 473 prediction = parent.fit(input_data=input_data)
        474 parent_results.append(prediction)
        475 else:

    /usr/local/lib/python3.8/dist-packages/fedot/core/pipelines/node.py in fit(self, input_data, **kwargs)
        301 else:
        302 self.node_data = input_data
    --> 303 return super().fit(input_data)
        304
        305 def unfit(self):

    /usr/local/lib/python3.8/dist-packages/fedot/core/pipelines/node.py in fit(self, input_data)
        179 self.fit_time_in_seconds = round(t.seconds_from_start, 3)
        180 else:
    --> 181 operation_predict = self.operation.predict_for_fit(fitted_operation=self.fitted_operation,
        182 data=input_data,
        183 params=self._parameters)

    /usr/local/lib/python3.8/dist-packages/fedot/core/operations/operation.py in predict_for_fit(self, fitted_operation, data, params, output_mode)
        112 for example, is the operation predict probabilities or class labels
        113 """
    --> 114 return self._predict(fitted_operation, data, params, output_mode, is_fit_stage=True)
        115
        116 def _predict(self, fitted_operation, data: InputData, params: Optional[OperationParameters] = None,

    /usr/local/lib/python3.8/dist-packages/fedot/core/operations/operation.py in _predict(self, fitted_operation, data, params, output_mode, is_fit_stage)
        122
        123 if is_fit_stage:
    --> 124 prediction = self._eval_strategy.predict_for_fit(
        125 trained_operation=fitted_operation,
        126 predict_data=data)

    /usr/local/lib/python3.8/dist-packages/fedot/core/operations/evaluation/regression.py in predict_for_fit(self, trained_operation, predict_data)
         84 :return:
         85 """
    ---> 86 prediction = trained_operation.transform_for_fit(predict_data)
         87 converted = self._convert_to_output(prediction, predict_data)
         88 return converted

    /usr/local/lib/python3.8/dist-packages/fedot/core/operations/evaluation/operation_implementations/data_operations/sklearn_filters.py in transform_for_fit(self, input_data)
         59 mask = self.operation.inlier_mask_
         60 if mask is not None:
    ---> 61 input_data = update_data(input_data, mask)
         62 else:
         63 self.log.info("Filtering Algorithm: didn't fit correctly. Return all objects")

    /usr/local/lib/python3.8/dist-packages/fedot/core/operations/evaluation/operation_implementations/data_operations/sklearn_filters.py in update_data(input_data, mask)
        231 old_idx = modified_input_data.idx
        232
    --> 233 modified_input_data.features = old_features[mask]
        234 modified_input_data.target = old_target[mask]
        235 modified_input_data.idx = np.array(old_idx)[mask]

    IndexError: boolean index did not match indexed array along dimension 0; dimension is 68 but corresponding boolean dimension is 55

    bug 
    opened by valer1435 0
  • Proposed method of installation on MAC M1 and GPU utilisation

    What is the proposed method of installation on Apple silicon (M1) Macs?

    After a lot of trials, my installation worked with the following set of commands in an x86 architecture conda env.

    In a new Python env:

    conda config --env --set subdir osx-64
    conda install python=3.8.13
    pip install fedot
    brew install libomp
    conda install -c conda-forge lightgbm

    Also, I can't seem to get any GPU activity. Is there any config script I can include in code for M1?

    opened by gdamianakos 1
Releases(v0.6.1)
  • v0.6.1(Dec 12, 2022)

    Hi, folks! We're making a new minor release with a number of improvements. This is an important release in the sense that it is the last release of self-contained FEDOT. The next major release will mark the separation of the optimizer core into a separate project.

    New features, better quality & changes in API

    • More intuitive predict interface for time series forecasting (#930)
    • Pipeline save/load now has more intuitive behavior (#971)
    • Early stopping criteria can now take a timeout into consideration, and not only the number of iterations (early_stopping_timeout api parameter)
    • Graph nodes now can be accessed by name or uid (#982)
    • Tuner speed is better due to better initial params in the search space (#985)
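
    A stopping criterion that combines an iteration limit with a timeout can be sketched as follows. This is an illustration only: the `EarlyStopping` class and its parameters are hypothetical, and interpreting the timeout as "time elapsed since the last improvement" is an assumption, not a description of how early_stopping_timeout behaves in FEDOT.

```python
import time
from typing import Callable, Optional


class EarlyStopping:
    """Stop after too many stale iterations or too long without improvement."""

    def __init__(self, max_stale_iterations: int,
                 timeout_seconds: Optional[float] = None,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_stale_iterations = max_stale_iterations
        self.timeout_seconds = timeout_seconds
        self._clock = clock
        self._best: Optional[float] = None
        self._stale = 0
        self._last_improvement = clock()

    def update(self, fitness: float) -> bool:
        """Record a fitness value (lower is better); return True to stop."""
        if self._best is None or fitness < self._best:
            self._best = fitness
            self._stale = 0
            self._last_improvement = self._clock()
            return False
        self._stale += 1
        if self._stale >= self.max_stale_iterations:
            return True
        if self.timeout_seconds is not None:
            elapsed = self._clock() - self._last_improvement
            return elapsed >= self.timeout_seconds
        return False


stopper = EarlyStopping(max_stale_iterations=3, timeout_seconds=60.0)
history = [0.9, 0.7, 0.7, 0.7, 0.7, 0.6]
stopped_at = None
for generation, fitness in enumerate(history):
    if stopper.update(fitness):
        stopped_at = generation
        break
```

    Injecting the clock as a parameter keeps the criterion easy to test without real waiting.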

    Enhancements and fixes:

    • Fix inplace modification of data during data definition (resolves #943)
    • Fix regression preprocessing (#955)
    • Fewer evaluation errors during population selection in corner cases (#956)
    • Fix getting suitable operations for multi ts (#981)
    • Integration tests are fixed & passing now
    • More minor fixes & minor class interface refactorings
    • Important fix for multi-objective optimization (#996)

    Documentation is extended

    Architectural refactorings are continued:

    • Better PipelineAdapter (#941)
    • Abstracting optimiser core (most tasks in issue #713 are done). Notably, the Serializer subsystem is now extendable (#969)
  • v0.6.0(Oct 18, 2022)

    Hi everyone! We released a new major version of FEDOT - 0.6.0

    It includes a lot of major changes:

    • Improvement of API for multi-modal datasets and models;
    • New PipelineBuilder (#597) – that simplifies manual construction of ML Pipelines;
    • Joblib was embedded as a multiprocessing backend (#843). Data exchange between processes minimized (#926);
    • Stratified k-fold strategy embedded for cases with imbalanced data;
    • New visualization of graphs, pipelines and optimisation history;

    Also, this release contains a lot of architectural refactorings of the framework:

    • New Graph Adapter subsystem (#876);
    • Merging two different implementations of the evolutionary optimizer (parameter-free & usual) into one EvoGraphOptimizer (#687)
    • Architectural refactorings of the Graph hierarchy (#750)
    • Introduce notions of Objective & Fitness (#654) – classes that replace simple float metric values and abstract over single- vs. multi-objective metrics
    • Refactored parameter classes – for more intuitive segregation of different parameters controlling optimization process (#852)
    • Refactored DataMerger facility
    • Refactoring of selection operator implementation (#918)

    Also, there are various bug-fixes related to ML operations, evolutionary operators & internal Graph operations.

  • v0.5.1(Feb 22, 2022)

    The most important changes:

    • Cache support for the cross-validation implemented;
    • AutoML can be run without a time limit;
    • Graph operators improved;
    • Multi-task pipelines processing improved;
    • Custom parameters support for external optimizer;
    • Time series processing improved;
    • Multimodal table processing improved;
    • Lightweight docker prepared;
    • Isolation Forest added as new operation
    • Major and minor bugs are fixed.
  • v0.5.0(Dec 31, 2021)

    Hi everyone!

    We released a new major version of FEDOT - 0.5.0. It includes several major changes.

    The new version is available and can be installed via pip: pip install fedot==0.5.0

    The most important changes:

    • Preprocessing for tabular features and target variables improved dramatically.
    • API is refactored and improved (presets, parameters, etc). The postfix "_tun" has been removed from the presets, so now you have to specify composer_params={'with_tuning': True} to set tuning. Important changes in preset names: light - best_quality, ultra_light - fast_train.
    • Support for external optimisers is implemented.
    • Zero-code console interface is implemented.
    • Surrogate decision trees for pipeline interpretation are added.
    • Custom model support is implemented.
    • Better presets and models for time series forecasting (derivatives, polynomial models, cuts, etc)
    • Better integration with FEDOT.Web
    • Prototype for remote infrastructure support
    • Evolutionary optimiser improved (stopping criterion, progress bar, better mutations, etc)
    • Major and minor bugs are fixed.
  • v0.4.1(Oct 8, 2021)

    We released a new major version of FEDOT - 0.4.1. It includes several large changes, features, and fixes.

    The new version is available and can be installed via pip: pip install fedot==0.4.1

    The most important changes:

    • Major bugs fixed for evolutionary composing: we got rid of many annoying problems related to fitness evaluation and mutations
    • Multi-variate time series forecasting improved
    • Torch-based LSTM model added
    • Encoding stage for categorical features implemented
    • Docker containers updated, GPU example improved
    • Export of pipelines improved
    • Processing of hyperparameters improved
    • API refactored
  • v0.4.0(Aug 18, 2021)

    We released a new major version of FEDOT - 0.4.0. It includes several large changes, features, and fixes.

    The new version is available and can be installed via pip: pip install fedot==0.4.0

    The most important changes:

    Infrastructure:

    • Docker version added;
    • GPU support added;
    • Requirements became more flexible

    Optimizer:

    • Evolutionary optimizer generalized to allow the application to the custom non-ML tasks
    • Mutation schemes improved for a more explainable evolution process;
    • History saving extended

    Time series:

    • Cross-validation for the time series implemented;
    • Sparse lagged transformation for time series implemented to improve performance;

    Common:

    • API updated and simplified;
    • Processing of categorical features improved;
    • Fixes and improvements for the hyperparameters tuning;
    • The ‘chain’ term is replaced with ‘pipeline’ for better understandability.

    Utilities:

    • Sensitivity analysis improved
  • v0.3.1(May 30, 2021)

    During the last month, we have merged several major features and fixed a bunch of bugs. Some of them are experimental and should be tested extensively in real-world cases. But we have tried our best and covered it with unit tests.

    The new version (fedot == 0.3.1) is available and can be installed via pip.

    The most important features:

    • ML pipelines for multi-modal datasets
    • Decompose operation in ML pipelines
    • Cross-validation in Composer
    • Add Memory and Time profilers
    • Memory consumption improving

    For details, see the post in repository: https://github.com/nccr-itmo/FEDOT/discussions/317

  • 0.3.0(May 10, 2021)

    Hello everyone!

    Our team finally has finished preparing a new major release of fedot == 0.3.0. Thanks to all dev team who was working on it! It is available and can be imported via pip: https://pypi.org/project/fedot/0.3.0/. The most important changes:

    • Extended data operations and their automatic optimization

    Previously, Fedot (Chain objects) allow one to automatically build ML pipelines including models, but data operations (like scaling or gap-filling) were embedded in the nodes and could be changed manually only. In the latest release we significantly refactored the core logic of the framework, thus data operations are fully supported as separate nodes. It can extend the overall search space of a suitable ML pipeline.

    • New AutoML for time-series forecasting

    Now Fedot supports not the only manual building of ML pipelines for time-series forecasting but also in an automated mode via Composer! Fedot allow one to build pipelines and forecast time-series for a given window size and forecasting length. Also, it is possible to use exogen variables for forecasting. To check all features, see examplesin the repository.

    Our early studies showed that this is a promising approach that can advance the AutoML field for time series. We are actively working on benchmarking well-known SOTA frameworks for time-series forecasting, and the new results will be published in the near future. You can also check our recent preprint about gap filling in time series using the Fedot framework.
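To make the roles of window size and forecast length concrete, here is a minimal sketch of how a series can be cut into lagged samples for a supervised model. The function name `make_lagged_samples` and its signature are assumptions made for illustration; this is not how Fedot necessarily implements the transformation.

```python
from typing import List, Sequence, Tuple


def make_lagged_samples(
    series: Sequence[float], window_size: int, forecast_length: int
) -> Tuple[List[List[float]], List[List[float]]]:
    """Split a series into (window, horizon) pairs for supervised learning."""
    if window_size < 1 or forecast_length < 1:
        raise ValueError("window_size and forecast_length must be positive")

    features: List[List[float]] = []
    targets: List[List[float]] = []
    # Last start index that still leaves room for a full window and horizon.
    last_start = len(series) - window_size - forecast_length
    for start in range(last_start + 1):
        split = start + window_size
        features.append(list(series[start:split]))
        targets.append(list(series[split:split + forecast_length]))
    return features, targets


series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
features, targets = make_lagged_samples(series, window_size=3, forecast_length=2)
```

Each row of `features` holds `window_size` past values, and the matching row of `targets` holds the next `forecast_length` values. A series that is too short for a single pair simply yields two empty lists.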

    • Black-box optimization of ML pipeline hyperparameters

    During the experiments, we found that our previous approach to hyperparameter tuning was ineffective (it also did not work for preprocessing nodes). Therefore, we significantly refactored the tuning module, which now provides several schemes for black-box optimization of ML pipeline hyperparameters. For details, check the tuning module sources and the examples.
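As a generic illustration of black-box tuning, the sketch below uses random search, one of the simplest such methods. It is not the Fedot tuner: `random_search`, `toy_loss`, and the parameter names are hypothetical, and `toy_loss` stands in for "fit the pipeline and return a validation error".

```python
import random
from typing import Callable, Dict, Tuple

SearchSpace = Dict[str, Tuple[float, float]]


def random_search(
    objective: Callable[[Dict[str, float]], float],
    space: SearchSpace,
    iterations: int,
    seed: int = 0,
) -> Tuple[Dict[str, float], float]:
    """Sample parameters uniformly and keep the set with the lowest loss."""
    rng = random.Random(seed)
    best_params: Dict[str, float] = {}
    best_loss = float("inf")
    for _ in range(iterations):
        candidate = {
            name: rng.uniform(low, high) for name, (low, high) in space.items()
        }
        # The objective is treated as a black box: only its output is used.
        loss = objective(candidate)
        if loss < best_loss:
            best_params, best_loss = candidate, loss
    return best_params, best_loss


def toy_loss(params: Dict[str, float]) -> float:
    """Hypothetical stand-in for a pipeline's validation error."""
    return (params["scaling_factor"] - 1.5) ** 2 + (params["alpha"] - 0.1) ** 2


# One parameter belongs to a preprocessing step and one to a model,
# so both kinds of nodes are tuned together.
space = {"scaling_factor": (0.0, 3.0), "alpha": (0.0, 1.0)}
best_params, best_loss = random_search(toy_loss, space, iterations=200)
```

Since the optimizer only sees parameter values and the resulting loss, it does not matter whether a parameter belongs to a model or to a data operation, which is the property that lets preprocessing nodes be tuned as well.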

    • Multi-Objective AutoML for pipelines

    Several months ago, during a team discussion, we formulated a hypothesis: "Most of the AutoML frameworks are trying to maximize only one metric - prediction quality. But can we optimize several metrics (like pipeline complexity, for instance) simultaneously?" So we conducted a study in which evolutionary multi-objective optimization algorithms (like NSGA-II and SPEA-2) were adapted to the AutoML task. We concluded that it is a promising feature and integrated it into Fedot. The preprint is available, and you can also check the example of how to use multi-objective optimization via the Fedot API.
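Multi-objective algorithms such as NSGA-II and SPEA-2 are built on the notion of Pareto dominance. The sketch below shows only that dominance filter, applied to two objectives (error and complexity); it is not an implementation of either algorithm, and the `Candidate` type and the numbers are made up for illustration.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """Hypothetical pipeline summary; both objectives are minimized."""
    name: str
    error: float
    complexity: int  # e.g. number of nodes in the pipeline


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both objectives and better on one."""
    no_worse = a.error <= b.error and a.complexity <= b.complexity
    strictly_better = a.error < b.error or a.complexity < b.complexity
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


candidates = [
    Candidate("single_model", 0.30, 1),
    Candidate("small_ensemble", 0.22, 3),
    Candidate("large_ensemble", 0.20, 7),
    Candidate("bloated", 0.25, 6),
]
front = pareto_front(candidates)
```

Here `bloated` is dropped because `small_ensemble` is both more accurate and simpler, while the other three remain as different trade-offs between quality and complexity.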

    • New input data support for image classification

    Earlier, we announced that images would be supported in Fedot. We have made several changes in InputData, and pipelines for image classification can now be built manually. We also added several CNN architectures and an example of their usage. Composer should also work for image classification, but we have not yet tested this functionality extensively.

    Also, we have fixed a bunch of bugs and improved Fedot API.

    Thanks to everyone who is following our progress! Any issues and user reports are welcome. Cya!

  • v0.2.1(Mar 12, 2021)

    Greetings to everyone who follows our team and FEDOT development progress!

    Today, we released a new version of fedot == 0.2.1.

    Here is the list of the main changes:

    • The main API is updated. A basic how-to is available at https://github.com/nccr-itmo/FEDOT/blob/master/notebooks/intro_to_automl.ipynb
    • Support for pandas DataFrames is added
    • Logging is improved
    • The sensitivity analysis of the composite model (chain) structure is added. The description is available in https://fedot.readthedocs.io/en/latest/fedot/features/sensitivity_analysis.html

    The new version can be installed using pip install fedot==0.2.1

    Our team is very interested in any user feedback, and new issues are very welcome! Thank you!

  • v0.2.0(Jan 21, 2021)

    Greetings to everyone who follows our team and FEDOT development progress!

    Last week, we released a new version of fedot == 0.2.0. A number of bugs in the framework were fixed, and the fixes were merged into the master (main) and release branches. Here is the list of the main changes:

    • NLP tasks are now supported; a simple example of text classification was added (see here)
    • The first version of the fedot high-level API was implemented; see the readme for instructions
    • Fixed several bugs with chain import/export
    • Composer should now work correctly for time-series tasks
    • Embedded visualization of composing and of the resulting chains was improved; see the example here
Owner
National Center for Cognitive Research of ITMO University