Traingenerator 🧙 A web app to generate template code for machine learning ✨

Overview

Traingenerator

🧙   A web app to generate template code for machine learning

Gitter Heroku Code style: black



🎉 Traingenerator is now live! 🎉

Try it out:
https://traingenerator.jrieke.com


Generate custom template code for PyTorch & sklearn, using a simple web UI built with streamlit. Traingenerator offers multiple options for preprocessing, model setup, training, and visualization (using Tensorboard or comet.ml). It exports to .py, Jupyter Notebook, or Google Colab. The perfect tool to jumpstart your next machine learning project!


For updates, follow me on Twitter, and if you like this project, please consider sponsoring




Adding new templates

You can add your own template in 4 easy steps (see below), without changing any code in the app itself. Your new template will be automatically discovered by Traingenerator and shown in the sidebar. That's it! 🎈

Want to share your magic? 🧙 PRs are welcome! Please have a look at CONTRIBUTING.md and write on Gitter.

Some ideas for new templates: Keras/Tensorflow, Pytorch Lightning, object detection, segmentation, text classification, ...

  1. Create a folder under ./templates. The folder name should be the task that your template solves (e.g. Image classification). Optionally, you can add a framework name (e.g. Image classification_PyTorch). Both names are automatically shown in the first two dropdowns in the sidebar (see image). Tip: Copy the example template to get started more quickly.
  2. Add a file sidebar.py to the folder (see example). It needs to contain a method show(), which displays all template-specific streamlit components in the sidebar (i.e. everything below Task) and returns a dictionary of user inputs.
  3. Add a file code-template.py.jinja to the folder (see example). This Jinja2 template is used to generate the code. You can write normal Python code in it and modify it (through Jinja) based on the user inputs in the sidebar (e.g. insert a parameter value from the sidebar or show different code parts based on the user's selection).
  4. Optional: Add a file test-inputs.yml to the folder (see example). This simple YAML file should define a few possible user inputs that can be used for testing. If you run pytest (see below), it will automatically pick up this file, render the code template with its values, and check that the generated code runs without errors. This file is optional – but it's required if you want to contribute your template to this repo.

Installation

Note: You only need to install Traingenerator if you want to contribute or run it locally. If you just want to use it, go here.

git clone https://github.com/jrieke/traingenerator.git
cd traingenerator
pip install -r requirements.txt

Optional: For the "Open in Colab" button to work you need to set up a Github repo where the notebook files can be stored (Colab can only open public files if they are on Github). After setting up the repo, create a file .env with content:

GITHUB_TOKEN=<your-github-access-token>
REPO_NAME=<user/notebooks-repo>

If you don't set this up, the app will still work but the "Open in Colab" button will only show an error message.

Running locally

streamlit run app/main.py

Make sure to run always from the traingenerator dir (not from the app dir), otherwise the app will not be able to find the templates.

Deploying to Heroku

First, install heroku and login. To create a new deployment, run inside traingenerator:

heroku create
git push heroku main
heroku open

To update the deployed app, commit your changes and run:

git push heroku main

Optional: If you set up a Github repo to enable the "Open in Colab" button (see above), you also need to run:

heroku config:set GITHUB_TOKEN=
   
    
heroku config:set REPO_NAME=
    

    
   

Testing

First, install pytest and required plugins via:

pip install -r requirements-dev.txt

To run all tests:

pytest ./tests

Note that this only tests the code templates (i.e. it renders them with different input values and makes sure that the code executes without error). The streamlit app itself is not tested at the moment.

You can also test an individual template by passing the name of the template dir to --template, e.g.:

pytest ./tests --template "Image classification_scikit-learn"

The mage image used in Traingenerator is from Twitter's Twemoji library and released under Creative Commons Attribution 4.0 International Public License.

Owner
Johannes Rieke
Product manager dev experience @streamlit
Johannes Rieke
Module is created to build a spam filter using Python and the multinomial Naive Bayes algorithm.

Naive-Bayes Spam Classificator Module is created to build a spam filter using Python and the multinomial Naive Bayes algorithm. Main goal is to code a

Viktoria Maksymiuk 1 Jun 27, 2022
ArviZ is a Python package for exploratory analysis of Bayesian models

ArviZ (pronounced "AR-vees") is a Python package for exploratory analysis of Bayesian models. Includes functions for posterior analysis, data storage, model checking, comparison and diagnostics

ArviZ 1.3k Jan 05, 2023
Open MLOps - A Production-focused Open-Source Machine Learning Framework

Open MLOps - A Production-focused Open-Source Machine Learning Framework Open MLOps is a set of open-source tools carefully chosen to ease user experi

Data Revenue 590 Dec 28, 2022
Bayesian optimization in JAX

Bayesian optimization in JAX

Predictive Intelligence Lab 26 May 11, 2022
Cohort Intelligence used to solve various mathematical functions

Cohort-Intelligence-for-Mathematical-Functions About Cohort Intelligence : Cohort Intelligence ( CI ) is an optimization technique. It attempts to mod

Aayush Khandekar 2 Oct 25, 2021
An MLOps framework to package, deploy, monitor and manage thousands of production machine learning models

Seldon Core: Blazing Fast, Industry-Ready ML An open source platform to deploy your machine learning models on Kubernetes at massive scale. Overview S

Seldon 3.5k Jan 01, 2023
Getting Profit and Loss Make Easy From Binance

Getting Profit and Loss Make Easy From Binance I have been in Binance Automated Trading for some time and have generated a lot of transaction records,

17 Dec 21, 2022
CobraML: Completely Customizable A python ML library designed to give the end user full control

CobraML: Completely Customizable What is it? CobraML is a python library built on both numpy and numba. Unlike other ML libraries CobraML gives the us

Sriram Govindan 14 Dec 19, 2021
Simple, fast, and parallelized symbolic regression in Python/Julia via regularized evolution and simulated annealing

Parallelized symbolic regression built on Julia, and interfaced by Python. Uses regularized evolution, simulated annealing, and gradient-free optimization.

Miles Cranmer 924 Jan 03, 2023
Microsoft 5.6k Jan 07, 2023
Self Organising Map (SOM) for clustering of atomistic samples through unsupervised learning.

Self Organising Map for Clustering of Atomistic Samples - V2 Description Self Organising Map (also known as Kohonen Network) implemented in Python for

Franco Aquistapace 0 Nov 16, 2021
Data Version Control or DVC is an open-source tool for data science and machine learning projects

Continuous Machine Learning project integration with DVC Data Version Control or DVC is an open-source tool for data science and machine learning proj

Azaria Gebremichael 2 Jul 29, 2021
Kaggle Competition using 15 numerical predictors to predict a continuous outcome.

Kaggle-Comp.-Data-Mining Kaggle Competition using 15 numerical predictors to predict a continuous outcome as part of a final project for a stats data

moisey alaev 1 Dec 28, 2021
A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.

Light Gradient Boosting Machine LightGBM is a gradient boosting framework that uses tree based learning algorithms. It is designed to be distributed a

Microsoft 14.5k Jan 07, 2023
Markov bot - A Writing bot based on Markov Chain for Data Structure Lab

基于马尔可夫链的写作机器人 前端 用html/css完成 Demo展示(已给出文本的相应展示) 用户提供相关的语料库后训练的成果 后端 要完成的几个接口 解析文

DysprosiumDy 9 May 05, 2022
Bayesian Modeling and Computation in Python

Bayesian Modeling and Computation in Python Open access and Code This repository contains the open access version of the text and the code examples in

Bayesian Modeling and Computation in Python 339 Jan 02, 2023
SageMaker Python SDK is an open source library for training and deploying machine learning models on Amazon SageMaker.

SageMaker Python SDK SageMaker Python SDK is an open source library for training and deploying machine learning models on Amazon SageMaker. With the S

Amazon Web Services 1.8k Jan 01, 2023
A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.

Website | Documentation | Tutorials | Installation | Release Notes CatBoost is a machine learning method based on gradient boosting over decision tree

CatBoost 6.9k Jan 05, 2023
using Machine Learning Algorithm to classification AppleStore application

AppleStore-classification-with-Machine-learning-Algo- using Machine Learning Algorithm to classification AppleStore application. the first step : 1: p

Mohammed Hussien 2 May 02, 2022
distfit - Probability density fitting

Python package for probability density function fitting of univariate distributions of non-censored data

Erdogan Taskesen 187 Dec 30, 2022