PyPI package for scaffolding out code for decision tree models that can learn to find relationships between the attributes of an object.

Overview

Decision Tree Writer

This package allows you to train a binary classification decision tree on a list of labeled dictionaries or class instances, and then writes a new .py file with the code for the new decision tree model.

Installation

Simply run py -m pip install decision-tree-writer from the command line (Windows) or python3 -m pip install decision-tree-writer (Unix/macOS) and you're ready to go!

Usage

1) Train the model

Use the DecisionTreeWriter class to train a model on a data set and write the code to a new file in a specified fie folder (default folder is the same as your code):

from decision_tree_writer import DecisionTreeWriter

# Here we're using some of the famous iris data set for an example.
# You could alternatively make an Iris class with the same 
# attributes as the keys of each of these dictionaries.
iris_data = [
    { "species": "setosa", "sepal_length": 5.2, "sepal_width": 3.5, 
                            "petal_length": 1.5, "petal_width": 0.2},
    { "species": "setosa", "sepal_length": 5.2, "sepal_width": 4.1, 
                            "petal_length": 1.5, "petal_width": 0.1},
    { "species": "setosa", "sepal_length": 5.4, "sepal_width": 3.7, 
                            "petal_length": 1.5, "petal_width": 0.2},
    { "species": "versicolor", "sepal_length": 6.2, "sepal_width": 2.2, 
                            "petal_length": 4.5, "petal_width": 1.5},
    { "species": "versicolor", "sepal_length": 5.7, "sepal_width": 2.9, 
                            "petal_length": 4.2, "petal_width": 1.3},
    { "species": "versicolor", "sepal_length": 5.6, "sepal_width": 2.9, 
                            "petal_length": 3.6, "petal_width": 1.3},
    { "species": "virginica", "sepal_length": 7.2, "sepal_width": 3.2, 
                            "petal_length": 6.0, "petal_width": 1.8},
    { "species": "virginica", "sepal_length": 6.1, "sepal_width": 2.6, 
                            "petal_length": 5.6, "petal_width": 1.4},
    { "species": "virginica", "sepal_length": 6.8, "sepal_width": 3.0, 
                            "petal_length": 5.5, "petal_width": 2.1}
    ]

# Create the writer. 
# You must specify which attribute or key is the label of the data items.
# You can also specify the max branching depth of the tree (default [and max] is 998)
# or how many data items there must be to make a new branch (default is 1).
writer = DecisionTreeWriter(label_name="species")

# Trains a new model and saves it to a new .py file
writer.create_tree(iris_data, True, "Iris Classifier")

2) Using the new decision tree

In the specified file folder a new python file with one function will appear. It will have the name you gave your decision tree model plus a uuid to ensure it has a unique name. The generated code will look something like this:

from decision_tree_writer.BaseDecisionTree import *

# class-like syntax because it acts like it's instantiating a class.
def IrisClassifier__0c609d3a_741e_4770_8bce_df246bad054d() -> 'BaseDecisionTree':
    """
    IrisClassifier__0c609d3a_741e_4770_8bce_df246bad054d 
    has been trained to identify the species of a given object.
    """
    tree = BaseDecisionTree(None, dict,
            'IrisClassifier__0c609d3a_741e_4770_8bce_df246bad054d')
    tree.root = Branch(lambda x: x['sepal_length'] <= 5.5)
    tree.root.l = Leaf('setosa')
    tree.root.r = Branch(lambda x: x['petal_length'] <= 5.0)
    tree.root.r.l = Leaf('versicolor')
    tree.root.r.r = Leaf('virginica')
    
    return tree

Important note: if you train your model with class instance data you will have to import that class in the new file. That might look like:

from decision_tree_writer.BaseDecisionTree import *

from wherever import Iris

def IrisClassifier__0c609d3a_741e_4770_8bce_df246bad054d() -> 'BaseDecisionTree':
    tree = BaseDecisionTree(None, Iris, 
                'IrisClassifier__0c609d3a_741e_4770_8bce_df246bad054d')

Now just use the factory function to create an instance of the model. The model has two important methods, classify_one, which takes a data item of the same type as you trained the model with and returns what it thinks is the correct label for it, and classify_many, which does the same as the first but with a list of data and returns a list of labels.

Example:

tree = IrisClassifier__0c609d3a_741e_4770_8bce_df246bad054d()
print(tree.classify_one(
            { "sepal_length": 5.4, "sepal_width": 3.2, 
                "petal_length": 1.6, "petal_width": 0.3})) # output: 'setosa'

Bugs or questions

If you find any problems with this package of have any questions, please create an issue on this package's GitHub repo

You might also like...
Automatically give thanks to Pypi packages you use in your project!

Automatically give thanks to Pypi packages you use in your project!

Implements a polyglot REPL which supports multiple languages and shared meta-object protocol scope between REPLs.
Implements a polyglot REPL which supports multiple languages and shared meta-object protocol scope between REPLs.

MetaCall Polyglot REPL Description This repository implements a Polyglot REPL which shares the state of the meta-object protocol between the REPLs. Us

This program can calculate the Aerial Distance between two cities.
This program can calculate the Aerial Distance between two cities.

Aerial_Distance_Calculator This program can calculate the Aerial Distance between two cities. This repository include both Jupyter notebook and Python

The blancmange curve can be visually built up out of triangle wave functions if the infinite sum is approximated by finite sums of the first few terms.

Blancmange-curve The blancmange curve can be visually built up out of triangle wave functions if the infinite sum is approximated by finite sums of th

Hexa is an advanced browser.It can carry out all the functions present in a browser.

Hexa is an advanced browser.It can carry out all the functions present in a browser.It is coded in the language Python using the modules PyQt5 and sys mainly.It is gonna get developed more in the future.It is made specially for the students.Only 1 tab can be used while using it so that the students cant missuse the pandemic situation :)

A simple python project that can find Tangkeke in a given image.

A simple python project that can find Tangkeke in a given image. Make the real Tangkeke image as a kernel to convolute the target image. The area wher

Return-Parity-MDP - Towards Return Parity in Markov Decision Processes

Towards Return Parity in Markov Decision Processes Code for the AISTATS 2022 pap

This is an online course where you can learn and master the skill of low-level performance analysis and tuning.
This is an online course where you can learn and master the skill of low-level performance analysis and tuning.

Performance Ninja Class This is an online course where you can learn to find and fix low-level performance issues, for example CPU cache misses and br

Wisdom Tree is a concentration app i am working on.
Wisdom Tree is a concentration app i am working on.

Wisdom Tree Wisdom Tree is a tui concentration app I am working on. Inspired by the wisdom tree in Plants vs. Zombies which gives in-game tips when it

Releases(v0.5.1)
  • v0.5.1(May 23, 2022)

  • v0.4.1(Mar 27, 2022)

  • v0.3.1(Feb 24, 2022)

  • v0.2.4(Feb 1, 2022)

    The DecisionTreeWriter package as deployed on PyPI. Edit after v0.3.1: a better study of version naming revealed that this release should have been called v0.3.0, since it added a backwards-compatible API change.

    Full Changelog: https://github.com/AndreBacic/DecisionTreeWriter/compare/v0.2.3...v0.2.4

    Source code(tar.gz)
    Source code(zip)
  • v0.2.3(Dec 15, 2021)

  • v0.2.1(Dec 1, 2021)

A script to generate NFT art living on the Solana blockchain.

NFT Generator This script generates NFT art based on its desired traits with their specific rarities. It has been used to generate the full collection

Rude Golems 24 Oct 08, 2022
A multi-platform fuzzer for poking at userland binaries and servers

litefuzz A multi-platform fuzzer for poking at userland binaries and servers litefuzz intro why how it works what it does what it doesn't do support p

52 Nov 18, 2022
A inspector to be able to view and edit Qt style sheet while an application is running

Qt Style Sheet Inspector An inspector widget to view and modify the style sheet of a Qt app at runtime. Usage In order to use the inspector widget on

ESSS 46 Dec 10, 2022
An alternative app for core Armoury Crate functions.

NoROG DISCLAIMER: Use at your own risk. This is alpha-quality software. It has not been extensively tested, though I personally run it daily on my lap

12 Nov 29, 2022
Nextstrain build targeted to Omicron

About This repository analyzes viral genomes using Nextstrain to understand how SARS-CoV-2, the virus that is responsible for the COVID-19 pandemic, e

Bedford Lab 9 May 25, 2022
Wordle is fun, so let's ruin it with computers.

ruin-wordle Wordle is fun, so let's ruin it with computers. Metrics This repository assesses two metrics about each algorithm: Success: how many of th

Charles Tapley Hoyt 11 Feb 11, 2022
Age of Empires II recorded game parsing and summarization in Python 3.

mgz Age of Empires II recorded game parsing and summarization in Python 3. Supported Versions Age of Kings (.mgl) The Conquerors (.mgx) Userpatch 1.4

148 Dec 11, 2022
Research using python - Guide for development of research code (using Anaconda Python)

Guide for development of research code (using Anaconda Python) TL;DR: One time s

Ziv Yaniv 1 Feb 01, 2022
SuperMario - Python programming class ending assignment SuperMario, using pygame

SuperMario - Python programming class ending assignment SuperMario, using pygame

mars 2 Jan 04, 2022
CaskDB is a disk-based, embedded, persistent, key-value store based on the Riak's bitcask paper, written in Python.

CaskDB - Disk based Log Structured Hash Table Store CaskDB is a disk-based, embedded, persistent, key-value store based on the Riak's bitcask paper, w

886 Dec 27, 2022
Python interface to ISLEX, an English IPA pronunciation dictionary with syllable and stress marking.

pysle Questions? Comments? Feedback? Pronounced like 'p' + 'isle'. An interface to a pronunciation dictionary with stress markings (ISLEX - the intern

Tim 38 Dec 14, 2022
Simple calculator with random number button and dark gray theme created with PyQt6

Calculator Application Simple calculator with random number button and dark gray theme created with : PyQt6 Python 3.9.7 you can download the dark gra

Flamingo 2 Mar 07, 2022
A continuation Of Project Glow By @glowstik-yt

Project Glow Greetings, I see you have stumbled upon project glow. Project glow is an open source bot worked on by many people to create a good and sa

1 Nov 17, 2021
A place where the most basic, basic of python coding exists

python-basics A place where the most basic, basic of python coding exists As you can see, there are four folders and the best order to read is: appeti

Chuqin 2 Oct 05, 2022
KeyBrowser: A program launches a browser and a keylogger at the same time, is used to retrieve a person's personal information

KeyBrowser: A program launches a browser and a keylogger at the same time, is used to retrieve a person's personal information

3 Oct 16, 2022
pyToledo is a Python library to interact with the common virtual learning environment for the Association KU Leuven (Toledo).

pyToledo pyToledo is a Python library to interact with the common virtual learning environment for the Association KU Leuven a.k.a Toledo. Motivation

Daan Vervacke 5 Jan 03, 2022
Ked interpreter built with Lex, Yacc and Python

Ked Ked is the first programming language known to hail from The People's Republic of Cork. It was first discovered and partially described by Adam Ly

Eoin O'Brien 1 Feb 08, 2022
An integrated library for checking email if it is registered on social media

An integrated library for checking email if it is registered on social media

Sidra ELEzz 13 Dec 08, 2022
Run CodeServer on Google Colab using Inlets in less than 60 secs using your own domain.

Inlets Colab Run CodeServer on Colab using Inlets in less than 60 secs using your own domain. Features Optimized for Inlets/InletsPro Use your own Cus

2 Dec 30, 2021
Run unpatched binaries on Nix/NixOS

Run unpatched binaries on Nix/NixOS

Thiago Kenji Okada 160 Jan 08, 2023