Feature engineering library that helps you keep track of feature dependencies, documentation and schema

Overview

featureclass

Feature engineering library that helps you keep track of feature dependencies, documentation and schema

This library helps define a featureclass.
featureclass is inspired by dataclass, and is meant to provide alternative way to define features engineering classes.

I have noticed that the below code is pretty common when doing feature engineering:

from statistics import variance
from math import sqrt
class MyFeatures:
    def calc_all(self, datapoint):
        out = {}
        out['var'] = self.calc_var(datapoint),
        out['stdev'] = self.calc_std(out['var'])
        return out
        
    def calc_var(self, data) -> float:
        return variance(data)

    def calc_stdev(self, var) -> float:
        return sqrt(var)

Some things were missing for me from this type of implementation:

  1. Implicit dependencies between features
  2. No simple schema
  3. No documentation for features
  4. Duplicate declaration of the same feature - once as a function and one as a dict key

This is why I created this library.
I turned the above code into this:

float: """Calc stdev""" return sqrt(self.var) print(feature_names(MyFeatures)) # ('var', 'stdev') print(feature_annotations(MyFeatures)) # {'var': float, 'stdev': float} print(asDict(MyFeatures([1,2,3,4,5]))) # {'var': 2.5, 'stdev': 1.5811388300841898} print(asDataclass(MyFeatures([1,2,3,4,5]))) # MyFeatures(stdev=1.5811388300841898, var=2.5) ">
from featureclass import feature, featureclass, feature_names, feature_annotations, asDict, asDataclass
from statistics import variance
from math import sqrt

@featureclass
class MyFeatures:
    def __init__(self, datapoint):
        self.datapoint = datapoint
    
    @feature()
    def var(self) -> float:
        """Calc variance"""
        return variance(self.datapoint)

    @feature()
    def stdev(self) -> float:
        """Calc stdev"""
        return sqrt(self.var)

print(feature_names(MyFeatures)) # ('var', 'stdev')
print(feature_annotations(MyFeatures)) # {'var': float, 'stdev': float}
print(asDict(MyFeatures([1,2,3,4,5]))) # {'var': 2.5, 'stdev': 1.5811388300841898}
print(asDataclass(MyFeatures([1,2,3,4,5]))) # MyFeatures(stdev=1.5811388300841898, var=2.5)

The feature decorator is using cached_property to cache the feature calculation,
making sure that each feature is calculated once per datapoint

You might also like...
ChainJacking is a tool to find which of your Go lang direct GitHub dependencies is susceptible to ChainJacking attack.
ChainJacking is a tool to find which of your Go lang direct GitHub dependencies is susceptible to ChainJacking attack.

ChainJacking is a tool to find which of your Go lang direct GitHub dependencies is susceptible to ChainJacking attack.

Bazel rules to install Python dependencies with Poetry

rules_python_poetry Bazel rules to install Python dependencies from a Poetry project. Works with native Python rules for Bazel. Getting started Add th

An assistant to guess your pip dependencies from your code, without using a requirements file.

Pip Sala Bim is an assistant to guess your pip dependencies from your code, without using a requirements file. Pip Sala Bim will tell you which packag

Repls goes to sleep due to inactivity, but to keep it awake, simply host a webserver and ping it.
Repls goes to sleep due to inactivity, but to keep it awake, simply host a webserver and ping it.

Repls goes to sleep due to inactivity, but to keep it awake, simply host a webserver and ping it. This repo will help you make a webserver with a bit of console controls.

A Puzzle A Day Keep the Work Away

A Puzzle A Day Keep the Work Away No moyu again!

Keep your company's passwords behind the firewall

TeamVault TeamVault is an open-source web-based shared password manager for behind-the-firewall installation. It requires Python 3.3+ and Postgres (wi

A 100% python file organizer. Keep your computer always organized!

PythonOrganizer A 100% python file organizer. Keep your computer always organized! To run the project, just clone the folder and run the installation

School helper, helps you at your pyllabus's.
School helper, helps you at your pyllabus's.

pyllabus, helps you at your syllabus's... WARNING: It won't run without config.py! You should add config.py yourself, it will include your APIKEY. e.g

Ssma is a tool that helps you collect your badges in a satr platform
Ssma is a tool that helps you collect your badges in a satr platform

satr-statistics-maker ssma is a tool that helps you collect your badges in a satr platform 🎖️ Requirements python = 3.7 Installation first clone the

Releases(0.3.0)
  • 0.3.0(Jan 19, 2022)

    What's Changed

      • rename asDict to as_dict and asDataclass to as_dataclass by @Itayazolay in https://github.com/Itayazolay/featureclass/pull/4

    Full Changelog: https://github.com/Itayazolay/featureclass/compare/0.2.1...0.3.0

    Source code(tar.gz)
    Source code(zip)
  • 0.2.1(Jan 11, 2022)

    What's Changed

    • docs, formatting and tests by @Itayazolay in https://github.com/Itayazolay/featureclass/pull/1
    • fine-tune mypy type ignore by @Itayazolay in https://github.com/Itayazolay/featureclass/pull/3

    New Contributors

    • @Itayazolay made their first contribution in https://github.com/Itayazolay/featureclass/pull/1

    Full Changelog: https://github.com/Itayazolay/featureclass/compare/0.2.0...0.2.1

    Source code(tar.gz)
    Source code(zip)
  • 0.2.0(Jan 9, 2022)

A very small (15 lines of code) and beautiful fetch script (exclusively for Arch Linux).

minifetch A very small (15 lines of code) and beautiful fetch script (exclusively for Arch Linux). There are many fetch scripts out there but I wanted

16 Jul 11, 2022
Secret santa is a fun and easy way to get together with your friends and/or family with a gift for them.

Vaccine Validator Tool to validate domestic New Zealand vaccine passes Create a new virtual environment: python3 -m venv ./venv Activate virtual envi

2 Dec 06, 2021
Powerful Assistant

Delta-Assistant Hi I'm Phoenix This project is a smart assistant This is the 1.0 version of this project I am currently working on the third version o

1 Nov 17, 2021
One line Brainfuck interpreter in Python

One line Brainfuck interpreter in Python

16 Dec 21, 2022
Minimal, super readable string pattern matching for python.

simplematch Minimal, super readable string pattern matching for python. import simplematch simplematch.match("He* {planet}!", "Hello World!") {"p

Thomas Feldmann 147 Dec 01, 2022
"Cambio de monedas" Change-making problem with Python, dynamic programming best solutions,

Change-making-problem / Cambio de monedas Entendiendo el problema Dada una cantidad de dinero y una lista de denominaciones de monedas, encontrar el n

Juan Antonio Ayola Cortes 1 Dec 08, 2021
Release for Improved Denoising Diffusion Probabilistic Models

improved-diffusion This is the codebase for Improved Denoising Diffusion Probabilistic Models. Usage This section of the README walks through how to t

OpenAI 1.2k Dec 30, 2022
An Agora Python Flask token generation server

A Flask Starter Application with Login and Registration About A token generation Server using the factory pattern and Blueprints. A forked stripped do

Nii Ayi 1 Jan 21, 2022
token vesting escrow with cliff and clawback

Yearn Vesting Escrow A modified version of Curve Vesting Escrow contracts with added functionality: An escrow can have a start_date in the past.

62 Dec 08, 2022
Providing a working, flexible, easier and faster installer than the one officially provided by Arch Linux

Purpose The purpose is to bring more people to Arch Linux by providing a working, flexible, easier and faster installer than the one officially provid

André Luís 0 Nov 09, 2022
Check is a integer is even

Is Even Check if interger is even using isevenapi. https://isevenapi.xyz/ Main features: cache memoization api retry handler hide ads Install pip inst

Rosiney Gomes Pereira 45 Dec 19, 2022
Python script for converting obsidian md-file to html (recursively adds all link/images)

ObsidianToHtmlConverter I made a small python script for converting obsidian md-file to static (local) html (recursively adds all link/images) I made

47 Jan 03, 2023
Team10 backend - A service which accepts a VRM (Vehicle Registration Mark)

GreenShip - API A service which accepts a VRM (Vehicle Registration Mark) and re

3D Hack 1 Jan 21, 2022
Usando Multi Player Perceptron e Regressão Logistica para classificação de SPAM

Relatório dos procedimentos executados e resultados obtidos. Objetivos Treinar um modelo para classificação de SPAM usando o dataset train_data. Class

André Mediote 1 Feb 02, 2022
Personal Assistant Tessa

Personal Assistant Tessa Introducing our all new personal assistant Tessa..... An intelligent virtual assistant (IVA) or intelligent personal assistan

Anusha Joseph 4 Mar 08, 2022
This is the code of Python enthusiasts collection and written.

I am Python's enthusiast, like to collect Python's programs and code.

cnzb 35 Apr 18, 2022
Astroquery is an astropy affiliated package that contains a collection of tools to access online Astronomical data.

Astroquery is an astropy affiliated package that contains a collection of tools to access online Astronomical data.

The Astropy Project 631 Jan 05, 2023
This library attempts to abstract the handling of Sigma rules in Python

This library attempts to abstract the handling of Sigma rules in Python. The rules are parsed using a schema defined with pydantic, and can be easily loaded from YAML files into a structured Python o

Caleb Stewart 44 Oct 29, 2022
Python library for the Unmand APIs.

Unmand Python SDK This is a simple package to aid in consuming the Unmand APIs. For more help, see our docs. Getting Started Create virtual environmen

Unmand 4 Jul 22, 2022
A collection of tips for using MISP.

MISP Tip of the Week A collection of tips for using MISP. Published via BelgoMISP (todo) and this repository. Available in MD and JSON. Do you want to

Koen Van Impe 52 Jan 07, 2023