Model Factory is an ML training platform that helps engineers build ML models at scale.

Last update: Sep 23, 2022

Overview

Model Factory

Machine learning powers many businesses today, e.g., search engines, e-commerce, and news or feed recommendation. Training high-quality ML models is critical to all of these systems.

However, training a model is not trivial. Traditionally, engineers train models on a single devvm. That is workable if you only need to build a few models. If you want to explore hundreds or even thousands of ideas, repeating the workflow manually becomes painful.

There are many issues with the above workflow:

Hard to scale
No tracking
No monitoring
No end-to-end automation
Not easy to share with others
No centralized model management

These pain points slow engineers down when they develop ML models. Model Factory is a project that aims to address them.

Background

Existing work in the industry also tries to address these issues, e.g., Facebook's FBLearner and Google's Kubeflow.

The key difference between Model Factory and other projects is that Model Factory promotes a pure Python authoring experience, while most others use a DAG (Directed Acyclic Graph). This philosophy gives Model Factory the following advantages:

Easy to learn: there is almost no learning curve. If you know how to write Python, you know how to use Model Factory.
More flexible: control flow logic (loops, branches, early exits) can be written directly in Python.
Communication between nodes: operators can exchange data in free form, which opens up the possibility of building distributed training on top of Model Factory.
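
To make the flexibility point concrete, here is a minimal sketch of data-dependent control flow written as ordinary Python. It does not use Model Factory's API: `TrainResult`, `train`, and `pipeline` are hypothetical names made up for this illustration, and the "loss" is a toy function rather than real training.

```python
# Hypothetical sketch: none of these names come from Model Factory's API.
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class TrainResult:
    learning_rate: float
    loss: float


def train(learning_rate: float) -> TrainResult:
    """Stand-in for a training step; the loss is a toy function of the learning rate."""
    loss = (learning_rate - 0.1) ** 2
    return TrainResult(learning_rate=learning_rate, loss=loss)


def pipeline(candidates: Sequence[float], good_enough: float) -> TrainResult:
    """Try candidates in order and stop as soon as the best loss is low enough.

    The loop and the early exit are plain Python control flow, so how many
    training steps actually run is decided at run time.
    """
    best: Optional[TrainResult] = None
    for lr in candidates:
        result = train(lr)
        if best is None or result.loss < best.loss:
            best = result
        if best.loss <= good_enough:
            break  # data-dependent early exit
    if best is None:
        raise ValueError("candidates must not be empty")
    return best


if __name__ == "__main__":
    print(pipeline([1.0, 0.5, 0.1, 0.01], good_enough=1e-6))
```

In a graph-based tool, the same logic would typically have to be expressed through the framework's own conditional or loop constructs, whereas here it is just a `for` loop and a `break`.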

Installation

Please follow the Installation page to deploy Model Factory in your production or testing environment.

Development Guide

Please follow the Development Guide page to try out your first Model Factory pipeline.
