InfiniteBoost: building infinite ensembles with gradient descent

Last update: Jan 03, 2023

Overview

InfiniteBoost

Code for the paper
InfiniteBoost: building infinite ensembles with gradient descent (arXiv:1706.01109).
A. Rogozhnikov, T. Likhomanenko

Description

InfiniteBoost is an approach to building ensembles that combines the best sides of random forest and gradient boosting.

Trees in the ensemble account for the mistakes made by previous trees (as in gradient boosting), but due to a modified scheme of accumulating contributions, the ensemble converges to a limit, thus avoiding overfitting (just as random forest does).
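
The accumulation scheme described above can be sketched roughly as follows. This is an illustrative sketch only, not the code from this repository: the names `fit_stump`, `predict_stump`, and `infinite_boost_sketch` are hypothetical, decision stumps on a single feature stand in for full trees, the loss is squared error, and the weights `alpha = m` and the fixed `capacity` value are assumptions chosen for illustration.

```python
import numpy as np


def fit_stump(x, target):
    """Fit a depth-1 regression tree on a 1-D feature.

    Returns (threshold, left_value, right_value).
    """
    mean = target.mean()
    best = (np.inf, x.min(), mean, mean)  # fallback: constant prediction
    for t in np.unique(x)[:-1]:
        left = x <= t
        lv, rv = target[left].mean(), target[~left].mean()
        err = ((target[left] - lv) ** 2).sum() + ((target[~left] - rv) ** 2).sum()
        if err < best[0]:
            best = (err, t, lv, rv)
    return best[1:]


def predict_stump(stump, x):
    t, lv, rv = stump
    return np.where(x <= t, lv, rv)


def infinite_boost_sketch(x, y, n_trees=200, capacity=2.0, seed=0):
    """Hypothetical sketch: the ensemble is capacity * weighted average of trees."""
    rng = np.random.default_rng(seed)
    F = np.zeros_like(y, dtype=float)  # current ensemble prediction
    weight_sum = 0.0
    history = []
    for m in range(1, n_trees + 1):
        # Bootstrap sample, as in random forest.
        idx = rng.choice(len(x), size=len(x), replace=True)
        # Each tree fits the negative gradient of squared loss at the
        # current prediction, as in gradient boosting.
        residual = y - F
        stump = fit_stump(x[idx], residual[idx])
        # Assumed weights alpha_m = m; the prediction is kept as a running
        # weighted average of capacity * tree, so it stays bounded.
        alpha = float(m)
        weight_sum += alpha
        step = alpha / weight_sum
        F = (1.0 - step) * F + step * capacity * predict_stump(stump, x)
        history.append(float(np.mean((y - F) ** 2)))
    return F, history
```

Because every new tree is averaged into the ensemble rather than added to it, the influence of a single tree shrinks as the ensemble grows, which is the intuition behind the convergence claim. In plain gradient boosting the update would instead be an additive step scaled by a learning rate.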

Left: InfiniteBoost with automated capacity search vs. gradient boosting with different learning rates (shrinkages). Right: random forest vs. InfiniteBoost with small capacities.

More comparison plots are available in the research notebooks and in the research/plots directory.

Reproducing research

The research is performed in Jupyter notebooks (if you are not familiar with them, read why Jupyter notebooks are awesome).

You can use the docker image arogozhnikov/pmle:0.01 from Docker Hub. The Dockerfile is stored in this repository (Ubuntu 16 + basic sklearn stuff).

To run the environment (sudo is needed on Linux):

sudo docker run -it --rm -v /YourMountedDirectory:/notebooks -p 8890:8890 arogozhnikov/pmle:0.01

(and open localhost:8890 in your browser).

InfiniteBoost package

A self-written, minimalistic implementation of trees, as used for the experiments against boosting.

A separate implementation, based on the trees from the scikit-learn package, was used for the comparison with random forest.

The code is written in Python 2 (it is expected to work with Python 3, but this is not tested). Some critical functions are written in Fortran, so you need gfortran and OpenMP installed before installing the package (or simply use the docker image).

pip install numpy
pip install .
# testing (optional)
cd tests && nosetests .

You can use the implementation of trees from the package in your own experiments; in this case, please cite the InfiniteBoost paper.

Owner

Alex Rogozhnikov
