Learning with Subset Stacking

Last update: Oct 04, 2022

Overview

Learning with Subset Stacking (LESS)

LESS is a supervised learning algorithm that trains many local estimators on subsets of a given dataset and then passes their predictions to a global estimator.
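
The two-stage idea described above can be illustrated with a minimal NumPy sketch. It is not the code of the less-learn package: the function names (fit_subset_stacking, predict_subset_stacking), the nearest-neighbor subset selection, the linear local and global models, and the Gaussian distance weighting with a fixed bandwidth are all assumptions made for illustration.

```python
import numpy as np

def _stacked_features(X, centers, coefs, bandwidth):
    """Distance-weighted local predictions plus an intercept column."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    preds = Xb @ coefs.T  # local predictions, shape (n_samples, n_subsets)
    # Squared distance of every sample to every subset center
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    w = np.exp(-d2 / (2 * bandwidth ** 2))
    w /= w.sum(axis=1, keepdims=True) + 1e-12  # normalize, guard against 0/0
    return np.hstack([w * preds, np.ones((X.shape[0], 1))])

def fit_subset_stacking(X, y, n_subsets=10, subset_size=30, bandwidth=2.0, seed=0):
    """Fit linear local models on subsets, then a linear global model on top."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    Xb = np.hstack([X, np.ones((n, 1))])
    centers, coefs = [], []
    for _ in range(n_subsets):
        # Assumed subset rule: the samples closest to a random anchor sample
        anchor = X[rng.integers(n)]
        idx = np.argsort(np.linalg.norm(X - anchor, axis=1))[:subset_size]
        coef, *_ = np.linalg.lstsq(Xb[idx], y[idx], rcond=None)
        centers.append(X[idx].mean(axis=0))
        coefs.append(coef)
    centers, coefs = np.array(centers), np.array(coefs)
    # Global estimator: least squares on the stacked local predictions
    Z = _stacked_features(X, centers, coefs, bandwidth)
    global_coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return centers, coefs, bandwidth, global_coef

def predict_subset_stacking(model, X):
    centers, coefs, bandwidth, global_coef = model
    return _stacked_features(X, centers, coefs, bandwidth) @ global_coef

# Small demonstration on noisy sine data
rng = np.random.default_rng(0)
X = rng.uniform(-10, 10, size=(200, 1))
y = 10 * np.sin(X[:, 0]) + 2.5 * rng.standard_normal(200)
model = fit_subset_stacking(X, y)
y_hat = predict_subset_stacking(model, X)
train_mse = np.mean((y - y_hat) ** 2)
print('Training MSE of the sketch: {0:.2f}'.format(train_mse))
```

The distance weighting matters in this sketch: without it, a linear combination of linear local models would itself be linear in the input, and the stacking step would add nothing.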

Installation

pip install less-learn

Testing

Here is how you can use LESS for regression (we are working on classification):

import numpy as np

from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
from less import LESSRegressor

# Synthetic dataset (X, y)
xvals = np.arange(-10, 10, 0.1) # domain
num_of_samples = 200
X = np.zeros((num_of_samples, 1))
y = np.zeros(num_of_samples)
for i in range(num_of_samples):
    xran = -10 + 20*np.random.rand()
    X[i] = xran
    y[i] = 10*np.sin(xran) + 2.5*np.random.randn()

# Train and test split
X_train, X_test, y_train, y_test = \
    train_test_split(X, y, test_size=0.3)

# LESS fit() & predict()
LESS_model = LESSRegressor()
LESS_model.fit(X_train, y_train)
y_pred = LESS_model.predict(X_test)
print('Test error of LESS: {0:.2f}'.format(mean_squared_error(y_test, y_pred)))
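
To put the printed test error in context, it may help to compute two reference values for the same kind of synthetic data. The sketch below regenerates data of the same form with a fixed seed (an assumption, the example above is unseeded) and reports the error of the noise-free function 10*sin(x), which should be close to the noise variance 2.5**2 = 6.25, and the error of always predicting the mean of y. It does not run LESS.

```python
import numpy as np

# Same data-generating process as the example, with a fixed seed (assumption)
rng = np.random.default_rng(0)
n = 200
x = rng.uniform(-10, 10, size=n)
y = 10 * np.sin(x) + 2.5 * rng.standard_normal(n)

# Error of the true underlying function: only the noise remains
oracle_mse = np.mean((y - 10 * np.sin(x)) ** 2)
# Error of a constant prediction equal to the sample mean
baseline_mse = np.mean((y - y.mean()) ** 2)

print('Noise-only (oracle) MSE: {0:.2f}'.format(oracle_mse))
print('Predict-the-mean MSE: {0:.2f}'.format(baseline_mse))
```

A useful regressor on this data should land well below the predict-the-mean value, and it cannot be expected to beat the noise-only value on unseen samples.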

Tutorials

Our two-part tutorial aims to get you familiar with LESS. To run the tutorials on your own computer, you also need to install the following additional packages: pandas, matplotlib, and seaborn.

Citation

Our software can be cited as:

  @misc{LESS,
    author = "Ilker Birbil",
    title = "LESS: LEarning with Subset Stacking",
    year = 2021,
    url = "https://github.com/sibirbil/LESS/"
  }

Acknowledgments

We thank Oguz Albayrak for his help with structuring our Python scripts.

Owner

S. Ilker Birbil
