Stochastic Gradient Trees implementation in Python

Last update: Nov 18, 2022

Overview

Stochastic Gradient Trees - Python

A Python implementation of Stochastic Gradient Trees¹ by Henry Gouk, Bernhard Pfahringer, and Eibe Frank, based on the code in the paper's accompanying repository.

Python Version 3.7 or later

Required Python libraries:

numpy>=1.20.2
scipy>=1.6.2
pandas>=1.3.3
scikit-learn>=0.24.2

Usage:

    from StochasticGradientTree import StochasticGradientTreeClassifier

    from sklearn.datasets import load_breast_cancer
    from sklearn.metrics import accuracy_score, confusion_matrix, log_loss
    from sklearn.model_selection import train_test_split


    def train(X, y):
        # Hold out 34% of the samples for evaluation.
        X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.34)

        tree = StochasticGradientTreeClassifier()
        tree.fit(X_train, y_train)

        # Hard class predictions and class probabilities for the held-out set.
        y_pred = tree.predict(X_test)
        proba = tree.predict_proba(X_test)

        acc_test = accuracy_score(y_test, y_pred)
        print(confusion_matrix(y_test, y_pred))
        print('Acc test: ', acc_test)
        print('Cross entropy loss: ', log_loss(y_test, proba))

        return tree, acc_test


    if __name__ == "__main__":
        breast = load_breast_cancer(as_frame=True)

        # Separate the feature columns from the target column.
        X = breast.frame.drop(columns=['target'])
        y = breast.frame.target

        tree, _ = train(X, y)
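
The usage example reports a cross entropy loss alongside accuracy. As a rough illustration of what that number measures in the binary case, the sketch below computes it by hand from positive-class probabilities. The helper name `binary_log_loss` and the clipping constant `eps` are assumptions made for this example; they are not part of this package or of scikit-learn.

```python
import math


def binary_log_loss(y_true, p_pos, eps=1e-15):
    """Mean negative log-likelihood of 0/1 labels under predicted probabilities.

    y_true: sequence of 0/1 labels.
    p_pos:  predicted probability of class 1 for each sample.
    eps:    clipping bound (an assumption of this sketch) that keeps log() finite.
    """
    total = 0.0
    for y, p in zip(y_true, p_pos):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)


labels = [1, 0, 1, 0]
uninformative = binary_log_loss(labels, [0.5, 0.5, 0.5, 0.5])
confident = binary_log_loss(labels, [0.9, 0.1, 0.8, 0.2])
print(uninformative, confident)
```

A model that always predicts 0.5 scores ln 2 per sample; confident, correct probabilities push the loss toward zero, while confident mistakes are penalised heavily.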

Binary classification example:

    python classification_breast.py

Multiclass classification (using the One-vs-the-rest multiclass strategy):

    python classification_iris.py
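
For readers unfamiliar with the One-vs-the-rest strategy mentioned above, here is a minimal, dependency-free sketch of the idea: train one binary model per class ("this class" against "everything else") and predict the class whose model is most confident. `TinyLogisticRegression` and `OneVsRest` are hypothetical names invented for this illustration, and the logistic model is only a stand-in for a binary learner. It is not a Stochastic Gradient Tree and does not reflect how this package implements the strategy internally.

```python
import math


class TinyLogisticRegression:
    """Stand-in binary classifier trained with plain SGD (illustration only)."""

    def __init__(self, lr=0.1, epochs=200):
        self.lr = lr
        self.epochs = epochs
        self.w = []
        self.b = 0.0

    def predict_proba_one(self, x):
        z = self.b + sum(w * v for w, v in zip(self.w, x))
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
        return 1.0 / (1.0 + math.exp(-z))

    def fit(self, X, y):
        self.w = [0.0] * len(X[0])
        self.b = 0.0
        for _ in range(self.epochs):
            for xi, yi in zip(X, y):
                err = self.predict_proba_one(xi) - yi  # gradient of the log loss
                self.w = [w - self.lr * err * v for w, v in zip(self.w, xi)]
                self.b -= self.lr * err
        return self


class OneVsRest:
    """Fits one binary model per class and predicts the most confident one."""

    def __init__(self, make_binary):
        self.make_binary = make_binary  # factory returning a fresh binary model
        self.models = {}

    def fit(self, X, y):
        for label in sorted(set(y)):
            targets = [1 if yi == label else 0 for yi in y]  # label vs. the rest
            self.models[label] = self.make_binary().fit(X, targets)
        return self

    def predict(self, X):
        return [
            max(self.models, key=lambda c: self.models[c].predict_proba_one(x))
            for x in X
        ]


# Three well-separated toy clusters, one per class.
X = [[0, 0], [0.5, 0.2], [0.2, 0.5],
     [4, 0], [4.2, 0.3], [3.8, 0.4],
     [0, 4], [0.3, 4.1], [0.4, 3.9]]
y = ["a", "a", "a", "b", "b", "b", "c", "c", "c"]

ovr = OneVsRest(TinyLogisticRegression).fit(X, y)
predictions = ovr.predict([[0.1, 0.1], [4.0, 0.1], [0.1, 4.0]])
print(predictions)
```

The wrapper only needs a binary learner that exposes a fit method and a positive-class score, which is why the strategy can be layered on top of any binary classifier.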

Regression example:

    python regression_diabetes.py

¹ Gouk, H., Pfahringer, B., and Frank, E. Stochastic gradient trees. In Proceedings of The Eleventh Asian Conference on Machine Learning, volume 101 of Proceedings of Machine Learning Research, pp. 1094–1109. PMLR, 2019.

Owner

John Koumentis
