Feature engineering library that helps you keep track of feature dependencies, documentation and schema

Overview

featureclass

This library helps you define a featureclass.
featureclass is inspired by dataclass, and is meant to provide an alternative way to define feature engineering classes.

I have noticed that the below code is pretty common when doing feature engineering:

from statistics import variance
from math import sqrt

class MyFeatures:
    def calc_all(self, datapoint):
        out = {}
        out['var'] = self.calc_var(datapoint)
        # 'stdev' silently depends on 'var' having been computed first
        out['stdev'] = self.calc_stdev(out['var'])
        return out

    def calc_var(self, data) -> float:
        return variance(data)

    def calc_stdev(self, var) -> float:
        return sqrt(var)

This type of implementation has a few problems for me:

  1. Dependencies between features are implicit
  2. There is no simple schema
  3. There is no documentation for the features
  4. Each feature is declared twice: once as a function and once as a dict key

This is why I created this library.
I turned the above code into this:

# As of release 0.3.0, asDict and asDataclass are named as_dict and as_dataclass
from featureclass import feature, featureclass, feature_names, feature_annotations, as_dict, as_dataclass
from statistics import variance
from math import sqrt

@featureclass
class MyFeatures:
    def __init__(self, datapoint):
        self.datapoint = datapoint

    @feature()
    def var(self) -> float:
        """Calc variance"""
        return variance(self.datapoint)

    @feature()
    def stdev(self) -> float:
        """Calc stdev"""
        return sqrt(self.var)

print(feature_names(MyFeatures)) # ('var', 'stdev')
print(feature_annotations(MyFeatures)) # {'var': float, 'stdev': float}
print(as_dict(MyFeatures([1,2,3,4,5]))) # {'var': 2.5, 'stdev': 1.5811388300841898}
print(as_dataclass(MyFeatures([1,2,3,4,5]))) # MyFeatures(stdev=1.5811388300841898, var=2.5)
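
To show where a schema like this can come from, here is a minimal sketch that uses only the standard library to collect names, return annotations and docstrings from decorated methods. It is not featureclass's actual implementation: `sketch_feature`, `sketch_feature_names`, `sketch_feature_annotations` and `sketch_feature_docs` are hypothetical names made up for this illustration, and the sketch only looks at features defined directly on the class (no inheritance).

```python
from functools import cached_property
from math import sqrt
from statistics import variance
from typing import get_type_hints


class sketch_feature(cached_property):
    """Hypothetical marker: a cached_property that flags a method as a feature."""


def _features(cls):
    # A class __dict__ keeps definition order, so features come out as written.
    return {
        name: attr
        for name, attr in vars(cls).items()
        if isinstance(attr, sketch_feature)
    }


def sketch_feature_names(cls):
    return tuple(_features(cls))


def sketch_feature_annotations(cls):
    # The return annotation of each feature method acts as the schema.
    return {
        name: get_type_hints(attr.func)["return"]
        for name, attr in _features(cls).items()
    }


def sketch_feature_docs(cls):
    # The docstring of each feature method acts as its documentation.
    return {name: attr.func.__doc__ for name, attr in _features(cls).items()}


class SketchFeatures:
    def __init__(self, datapoint):
        self.datapoint = datapoint

    @sketch_feature
    def var(self) -> float:
        """Calc variance"""
        return variance(self.datapoint)

    @sketch_feature
    def stdev(self) -> float:
        """Calc stdev"""
        return sqrt(self.var)


print(sketch_feature_names(SketchFeatures))
print(sketch_feature_annotations(SketchFeatures))
print(sketch_feature_docs(SketchFeatures))
```

Because each feature is declared exactly once, as a method, its name, type and documentation all live in one place, and the dependency of `stdev` on `var` is visible in the method body.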

The feature decorator uses cached_property to cache each feature's value,
so every feature is calculated at most once per datapoint (that is, once per MyFeatures instance).
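
The caching behaviour can be seen with `functools.cached_property` on its own. The sketch below does not use featureclass at all; `CachedSketch` and its `var_calls` counter are hypothetical names added only to make the number of computations observable.

```python
from functools import cached_property
from math import sqrt
from statistics import variance


class CachedSketch:
    def __init__(self, datapoint):
        self.datapoint = datapoint
        self.var_calls = 0  # how many times the variance was actually computed

    @cached_property
    def var(self) -> float:
        self.var_calls += 1
        return variance(self.datapoint)

    @cached_property
    def stdev(self) -> float:
        # Accessing self.var here reuses the cached value if it already exists.
        return sqrt(self.var)


point = CachedSketch([1, 2, 3, 4, 5])
point.stdev  # computes var once, then stdev
point.var    # served from the cache
point.var    # served from the cache
print(point.var_calls)  # 1

other = CachedSketch([2, 4, 6])
other.var
print(other.var_calls)  # 1, the cache is per instance, not shared
```

The cached value is stored in the instance's `__dict__`, so a new instance built from a different datapoint starts with an empty cache.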

Releases(0.3.0)
  • 0.3.0(Jan 19, 2022)

    What's Changed

      • rename asDict to as_dict and asDataclass to as_dataclass by @Itayazolay in https://github.com/Itayazolay/featureclass/pull/4

    Full Changelog: https://github.com/Itayazolay/featureclass/compare/0.2.1...0.3.0

  • 0.2.1(Jan 11, 2022)

    What's Changed

    • docs, formatting and tests by @Itayazolay in https://github.com/Itayazolay/featureclass/pull/1
    • fine-tune mypy type ignore by @Itayazolay in https://github.com/Itayazolay/featureclass/pull/3

    New Contributors

    • @Itayazolay made their first contribution in https://github.com/Itayazolay/featureclass/pull/1

    Full Changelog: https://github.com/Itayazolay/featureclass/compare/0.2.0...0.2.1

  • 0.2.0(Jan 9, 2022)
