IMBENS: class-imbalanced ensemble learning in Python.

Overview

IMBENS: class-imbalanced ensemble learning in Python.

Documentation Status

Links: [Documentation] [Gallery] [PyPI] [Changelog] [Source] [Download] [็ŸฅไนŽ/Zhihu] [ไธญๆ–‡README] [arXiv]

Paper: IMBENS: Ensemble Class-imbalanced Learning in Python

imbalanced-ensemble (IMBENS, imported as imbalanced_ensemble) is a Python toolbox for quick implementation, modification, evaluation, and visualization of ensemble learning algorithms for class-imbalanced data. The problem of learning from imbalanced data is known as imbalanced learning or long-tail learning (under multi-class scenario). See related papers/libraries/resources here.

Currently (v0.1), IMBENS includes more than 15 ensemble imbalanced learning algorithms, from the classical SMOTEBoost (2003), RUSBoost (2010) to recent SPE (2020), from resampling to cost-sensitive learning. More algorithms will be included in the future. We also provide detailed documentation and examples across various algorithms. See full list of implemented methods here.

IMBENS is featured for:

  • ๐ŸŽ Unified, easy-to-use APIs, detailed documentation and examples.
  • ๐ŸŽ Capable for out-of-the-box multi-class imbalanced (long-tailed) learning.
  • ๐ŸŽ Optimized performance with parallelization when possible using joblib.
  • ๐ŸŽ Powerful, customizable, interactive training logging and visualizer.
  • ๐ŸŽ Full compatibility with other popular packages like scikit-learn and imbalanced-learn.

API Demo:

# Train an SPE classifier
from imbalanced_ensemble.ensemble import SelfPacedEnsembleClassifier
clf = SelfPacedEnsembleClassifier(random_state=42)
clf.fit(X_train, y_train)

# Predict with an SPE classifier
y_pred = clf.predict(X_test)

Table of Contents

Citing us

If you find IMBENS helpful in your work or research, please consider citing our work. We would greatly appreciate citations to the following paper [PDF]:

@article{liu2021imbens,
  title={IMBENS: Ensemble Class-imbalanced Learning in Python},
  author={Liu, Zhining and Wei, Zhepei and Yu, Erxin and Huang, Qiang and Guo, Kai and Yu, Boyang and Cai, Zhaonian and Ye, Hangting and Cao, Wei and Bian, Jiang and Wei, Pengfei and Jiang, Jing and Chang, Yi},
  journal={arXiv preprint arXiv:2111.12776},
  year={2021}
}

Installation

It is recommended to use pip for installation.
Please make sure the latest version is installed to avoid potential problems:

$ pip install imbalanced-ensemble            # normal install
$ pip install --upgrade imbalanced-ensemble  # update if needed

Or you can install imbalanced-ensemble by clone this repository:

$ git clone https://github.com/ZhiningLiu1998/imbalanced-ensemble.git
$ cd imbalanced-ensemble
$ pip install .

imbalanced-ensemble requires following dependencies:

Highlights

  • ๐ŸŽ Unified, easy-to-use API design.
    All ensemble learning methods implemented in IMBENS share a unified API design. Similar to sklearn, all methods have functions (e.g., fit(), predict(), predict_proba()) that allow users to deploy them with only a few lines of code.
  • ๐ŸŽ Extended functionalities, wider application scenarios.
    All methods in IMBENS are ready for multi-class imbalanced classification. We extend binary ensemble imbalanced learning methods to get them to work under the multi-class scenario. Additionally, for supported methods, we provide more training options like class-wise resampling control, balancing scheduler during the ensemble training process, etc.
  • ๐ŸŽ Detailed training log, quick intuitive visualization.
    We provide additional parameters (e.g., eval_datasets, eval_metrics, training_verbose) in fit() for users to control the information they want to monitor during the ensemble training. We also implement an EnsembleVisualizer to quickly visualize the ensemble estimator(s) for providing further information/conducting comparison. See an example here.
  • ๐ŸŽ Wide compatiblilty.
    IMBENS is designed to be compatible with scikit-learn (sklearn) and also other compatible projects like imbalanced-learn. Therefore, users can take advantage of various utilities from the sklearn community for data processing/cross-validation/hyper-parameter tuning, etc.

List of implemented methods

Currently (v0.1.3, 2021/06), 16 ensemble imbalanced learning methods were implemented:
(Click to jump to the document page)

Note: imbalanced-ensemble is still under development, please see API reference for the latest list.

5-min Quick Start with IMBENS

Here, we provide some quick guides to help you get started with IMBENS.
We strongly encourage users to check out the example gallery for more comprehensive usage examples, which demonstrate many advanced features of IMBENS.

A minimal working example

Taking self-paced ensemble [1] as an example, it only requires less than 10 lines of code to deploy it:

>>> from imbalanced_ensemble.ensemble import SelfPacedEnsembleClassifier
>>> from sklearn.datasets import make_classification
>>> from sklearn.model_selection import train_test_split
>>> 
>>> X, y = make_classification(n_samples=1000, n_classes=3,
...                            n_informative=4, weights=[0.2, 0.3, 0.5],
...                            random_state=0)
>>> X_train, X_test, y_train, y_test = train_test_split(
...                            X, y, test_size=0.2, random_state=42)
>>> clf = SelfPacedEnsembleClassifier(random_state=0)
>>> clf.fit(X_train, y_train)
SelfPacedEnsembleClassifier(...)
>>> clf.predict(X_test)  
array([...])

Visualize ensemble classifiers

The imbalanced_ensemble.visualizer sub-module provide an ImbalancedEnsembleVisualizer. It can be used to visualize the ensemble estimator(s) for further information or comparison. Please refer to visualizer documentation and examples for more details.

Fit an ImbalancedEnsembleVisualizer

from imbalanced_ensemble.ensemble import SelfPacedEnsembleClassifier
from imbalanced_ensemble.ensemble import RUSBoostClassifier
from imbalanced_ensemble.ensemble import EasyEnsembleClassifier
from sklearn.tree import DecisionTreeClassifier

# Fit ensemble classifiers
init_kwargs = {'base_estimator': DecisionTreeClassifier()}
ensembles = {
    'spe': SelfPacedEnsembleClassifier(**init_kwargs).fit(X_train, y_train),
    'rusboost': RUSBoostClassifier(**init_kwargs).fit(X_train, y_train),
    'easyens': EasyEnsembleClassifier(**init_kwargs).fit(X_train, y_train),
}

# Fit visualizer
from imbalanced_ensemble.visualizer import ImbalancedEnsembleVisualizer
visualizer = ImbalancedEnsembleVisualizer().fit(ensembles=ensembles)

Plot performance curves

fig, axes = visualizer.performance_lineplot()

Plot confusion matrices

fig, axes = visualizer.confusion_matrix_heatmap()

Customizing training log

All ensemble classifiers in IMBENS support customizable training logging. The training log is controlled by 3 parameters eval_datasets, eval_metrics, and training_verbose of the fit() method. Read more details in the fit documentation.

Enable auto training log

clf.fit(..., train_verbose=True)
โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”ณโ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”ณโ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”“
โ”ƒ             โ”ƒ                          โ”ƒ            Data: train             โ”ƒ
โ”ƒ #Estimators โ”ƒ    Class Distribution    โ”ƒ               Metric               โ”ƒ
โ”ƒ             โ”ƒ                          โ”ƒ  acc    balanced_acc   weighted_f1 โ”ƒ
โ”ฃโ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ•‹โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ•‹โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”ซ
โ”ƒ      1      โ”ƒ {0: 150, 1: 150, 2: 150} โ”ƒ 0.838      0.877          0.839    โ”ƒ
โ”ƒ      5      โ”ƒ {0: 150, 1: 150, 2: 150} โ”ƒ 0.924      0.949          0.924    โ”ƒ
โ”ƒ     10      โ”ƒ {0: 150, 1: 150, 2: 150} โ”ƒ 0.954      0.970          0.954    โ”ƒ
โ”ƒ     15      โ”ƒ {0: 150, 1: 150, 2: 150} โ”ƒ 0.979      0.986          0.979    โ”ƒ
โ”ƒ     20      โ”ƒ {0: 150, 1: 150, 2: 150} โ”ƒ 0.990      0.993          0.990    โ”ƒ
โ”ƒ     25      โ”ƒ {0: 150, 1: 150, 2: 150} โ”ƒ 0.994      0.996          0.994    โ”ƒ
โ”ƒ     30      โ”ƒ {0: 150, 1: 150, 2: 150} โ”ƒ 0.988      0.992          0.988    โ”ƒ
โ”ƒ     35      โ”ƒ {0: 150, 1: 150, 2: 150} โ”ƒ 0.999      0.999          0.999    โ”ƒ
โ”ƒ     40      โ”ƒ {0: 150, 1: 150, 2: 150} โ”ƒ 0.995      0.997          0.995    โ”ƒ
โ”ƒ     45      โ”ƒ {0: 150, 1: 150, 2: 150} โ”ƒ 0.995      0.997          0.995    โ”ƒ
โ”ƒ     50      โ”ƒ {0: 150, 1: 150, 2: 150} โ”ƒ 0.993      0.995          0.993    โ”ƒ
โ”ฃโ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ•‹โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ•‹โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”ซ
โ”ƒ    final    โ”ƒ {0: 150, 1: 150, 2: 150} โ”ƒ 0.993      0.995          0.993    โ”ƒ
โ”—โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”ปโ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”ปโ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”›

Customize granularity and content of the training log

clf.fit(..., 
        train_verbose={
            'granularity': 10,
            'print_distribution': False,
            'print_metrics': True,
        })
Click to view example output
โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”ณโ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”“
โ”ƒ             โ”ƒ            Data: train             โ”ƒ
โ”ƒ #Estimators โ”ƒ               Metric               โ”ƒ
โ”ƒ             โ”ƒ  acc    balanced_acc   weighted_f1 โ”ƒ
โ”ฃโ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ•‹โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”ซ
โ”ƒ      1      โ”ƒ 0.964      0.970          0.964    โ”ƒ
โ”ƒ     10      โ”ƒ 1.000      1.000          1.000    โ”ƒ
โ”ƒ     20      โ”ƒ 1.000      1.000          1.000    โ”ƒ
โ”ƒ     30      โ”ƒ 1.000      1.000          1.000    โ”ƒ
โ”ƒ     40      โ”ƒ 1.000      1.000          1.000    โ”ƒ
โ”ƒ     50      โ”ƒ 1.000      1.000          1.000    โ”ƒ
โ”ฃโ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ•‹โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”ซ
โ”ƒ    final    โ”ƒ 1.000      1.000          1.000    โ”ƒ
โ”—โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”ปโ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”›

Add evaluation dataset(s)

  clf.fit(..., 
          eval_datasets={
              'valid': (X_valid, y_valid)
          })
Click to view example output
โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”ณโ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”ณโ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”“
โ”ƒ             โ”ƒ            Data: train             โ”ƒ            Data: valid             โ”ƒ
โ”ƒ #Estimators โ”ƒ               Metric               โ”ƒ               Metric               โ”ƒ
โ”ƒ             โ”ƒ  acc    balanced_acc   weighted_f1 โ”ƒ  acc    balanced_acc   weighted_f1 โ”ƒ
โ”ฃโ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ•‹โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ•‹โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”ซ
โ”ƒ      1      โ”ƒ 0.939      0.961          0.940    โ”ƒ 0.935      0.933          0.936    โ”ƒ
โ”ƒ     10      โ”ƒ 1.000      1.000          1.000    โ”ƒ 0.971      0.974          0.971    โ”ƒ
โ”ƒ     20      โ”ƒ 1.000      1.000          1.000    โ”ƒ 0.982      0.981          0.982    โ”ƒ
โ”ƒ     30      โ”ƒ 1.000      1.000          1.000    โ”ƒ 0.983      0.983          0.983    โ”ƒ
โ”ƒ     40      โ”ƒ 1.000      1.000          1.000    โ”ƒ 0.983      0.982          0.983    โ”ƒ
โ”ƒ     50      โ”ƒ 1.000      1.000          1.000    โ”ƒ 0.983      0.982          0.983    โ”ƒ
โ”ฃโ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ•‹โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ•‹โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”ซ
โ”ƒ    final    โ”ƒ 1.000      1.000          1.000    โ”ƒ 0.983      0.982          0.983    โ”ƒ
โ”—โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”ปโ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”ปโ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”›

Customize evaluation metric(s)

from sklearn.metrics import accuracy_score, f1_score
clf.fit(..., 
        eval_metrics={
            'acc': (accuracy_score, {}),
            'weighted_f1': (f1_score, {'average':'weighted'}),
        })
Click to view example output
โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”ณโ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”ณโ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”“
โ”ƒ             โ”ƒ     Data: train      โ”ƒ     Data: valid      โ”ƒ
โ”ƒ #Estimators โ”ƒ        Metric        โ”ƒ        Metric        โ”ƒ
โ”ƒ             โ”ƒ  acc    weighted_f1  โ”ƒ  acc    weighted_f1  โ”ƒ
โ”ฃโ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ•‹โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ•‹โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”ซ
โ”ƒ      1      โ”ƒ 0.942      0.961     โ”ƒ 0.919      0.936     โ”ƒ
โ”ƒ     10      โ”ƒ 1.000      1.000     โ”ƒ 0.976      0.976     โ”ƒ
โ”ƒ     20      โ”ƒ 1.000      1.000     โ”ƒ 0.977      0.977     โ”ƒ
โ”ƒ     30      โ”ƒ 1.000      1.000     โ”ƒ 0.981      0.980     โ”ƒ
โ”ƒ     40      โ”ƒ 1.000      1.000     โ”ƒ 0.980      0.979     โ”ƒ
โ”ƒ     50      โ”ƒ 1.000      1.000     โ”ƒ 0.981      0.980     โ”ƒ
โ”ฃโ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ•‹โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ•‹โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”ซ
โ”ƒ    final    โ”ƒ 1.000      1.000     โ”ƒ 0.981      0.980     โ”ƒ
โ”—โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”ปโ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”ปโ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”›

About imbalanced learning

Class-imbalance (also known as the long-tail problem) is the fact that the classes are not represented equally in a classification problem, which is quite common in practice. For instance, fraud detection, prediction of rare adverse drug reactions and prediction gene families. Failure to account for the class imbalance often causes inaccurate and decreased predictive performance of many classification algorithms. Imbalanced learning aims to tackle the class imbalance problem to learn an unbiased model from imbalanced data.

For more resources on imbalanced learning, please refer to awesome-imbalanced-learning.

Acknowledgements

Many samplers and utilities are adapted from imbalanced-learn, which is an amazing project!

References

# Reference
[1] Zhining Liu, Wei Cao, Zhifeng Gao, Jiang Bian, Hechang Chen, Yi Chang, and Tie-Yan Liu. 2019. Self-paced Ensemble for Highly Imbalanced Massive Data Classification. 2020 IEEE 36th International Conference on Data Engineering (ICDE). IEEE, 2020, pp. 841-852.
[2] X.-Y. Liu, J. Wu, and Z.-H. Zhou, Exploratory undersampling for class-imbalance learning. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), vol. 39, no. 2, pp. 539โ€“550, 2009.
[3] Chen, Chao, Andy Liaw, and Leo Breiman. โ€œUsing random forest to learn imbalanced data.โ€ University of California, Berkeley 110 (2004): 1-12.
[4] C. Seiffert, T. M. Khoshgoftaar, J. Van Hulse, and A. Napolitano, Rusboost: A hybrid approach to alleviating class imbalance. IEEE Transactions on Systems, Man, and Cybernetics-Part A: Systems and Humans, vol. 40, no. 1, pp. 185โ€“197, 2010.
[5] Maclin, R., & Opitz, D. (1997). An empirical evaluation of bagging and boosting. AAAI/IAAI, 1997, 546-551.
[6] N. V. Chawla, A. Lazarevic, L. O. Hall, and K. W. Bowyer, Smoteboost: Improving prediction of the minority class in boosting. in European conference on principles of data mining and knowledge discovery. Springer, 2003, pp. 107โ€“119
[7] S. Wang and X. Yao, Diversity analysis on imbalanced data sets by using ensemble models. in 2009 IEEE Symposium on Computational Intelligence and Data Mining. IEEE, 2009, pp. 324โ€“331.
[8] Fan, W., Stolfo, S. J., Zhang, J., & Chan, P. K. (1999, June). AdaCost: misclassification cost-sensitive boosting. In Icml (Vol. 99, pp. 97-105).
[9] Shawe-Taylor, G. K. J., & Karakoulas, G. (1999). Optimizing classifiers for imbalanced training sets. Advances in neural information processing systems, 11(11), 253.
[10] Viola, P., & Jones, M. (2001). Fast and robust classification using asymmetric adaboost and a detector cascade. Advances in Neural Information Processing System, 14.
[11] Freund, Y., & Schapire, R. E. (1997). A decision-theoretic generalization of on-line learning and an application to boosting. Journal of computer and system sciences, 55(1), 119-139.
[12] Breiman, L. (1996). Bagging predictors. Machine learning, 24(2), 123-140.
[13] Guillaume Lemaรฎtre, Fernando Nogueira, and Christos K. Aridas. Imbalanced-learn: A python toolbox to tackle the curse of imbalanced datasets in machine learning. Journal of Machine Learning Research, 18(17):1โ€“5, 2017.
You might also like...
Imbalanced Gradients: A Subtle Cause of Overestimated Adversarial Robustness

Imbalanced Gradients: A Subtle Cause of Overestimated Adversarial Robustness Code for Paper "Imbalanced Gradients: A Subtle Cause of Overestimated Adv

BESS: Balanced Evolutionary Semi-Stacking for Disease Detection via Partially Labeled Imbalanced Tongue Data

Balanced-Evolutionary-Semi-Stacking Code for the paper ''BESS: Balanced Evolutionary Semi-Stacking for Disease Detection via Partially Labeled Imbalan

The Python ensemble sampling toolkit for affine-invariant MCMC

emcee The Python ensemble sampling toolkit for affine-invariant MCMC emcee is a stable, well tested Python implementation of the affine-invariant ense

 An Ensemble of CNN (Python 3.5.1 Tensorflow 1.3 numpy 1.13)
An Ensemble of CNN (Python 3.5.1 Tensorflow 1.3 numpy 1.13)

An Ensemble of CNN (Python 3.5.1 Tensorflow 1.3 numpy 1.13)

zeus is a Python implementation of the Ensemble Slice Sampling method.
zeus is a Python implementation of the Ensemble Slice Sampling method.

zeus is a Python implementation of the Ensemble Slice Sampling method. Fast & Robust Bayesian Inference, Efficient Markov Chain Monte Carlo (MCMC), Bl

Neural Ensemble Search for Performant and Calibrated Predictions
Neural Ensemble Search for Performant and Calibrated Predictions

Neural Ensemble Search Introduction This repo contains the code accompanying the paper: Neural Ensemble Search for Performant and Calibrated Predictio

Pytorch implementation of SenFormer: Efficient Self-Ensemble Framework for Semantic Segmentation
Pytorch implementation of SenFormer: Efficient Self-Ensemble Framework for Semantic Segmentation

SenFormer: Efficient Self-Ensemble Framework for Semantic Segmentation Efficient Self-Ensemble Framework for Semantic Segmentation by Walid Bousselham

Ensemble Knowledge Guided Sub-network Search and Fine-tuning for Filter Pruning
Ensemble Knowledge Guided Sub-network Search and Fine-tuning for Filter Pruning

Ensemble Knowledge Guided Sub-network Search and Fine-tuning for Filter Pruning This repository is official Tensorflow implementation of paper: Ensemb

The first machine learning framework that encourages learning ML concepts instead of memorizing class functions.
The first machine learning framework that encourages learning ML concepts instead of memorizing class functions.

SeaLion is designed to teach today's aspiring ml-engineers the popular machine learning concepts of today in a way that gives both intuition and ways of application. We do this through concise algorithms that do the job in the least jargon possible and examples to guide you through every step of the way.

Comments
  • Bug :AttributeError: can't set attribute

    Bug :AttributeError: can't set attribute

    hello ,when i use the code as follow,the will be some errors, EasyEnsembleClassifier was used

    from sklearn.datasets import make_classification from sklearn.model_selection import train_test_split from sklearn.metrics import balanced_accuracy_score from sklearn.ensemble import BaggingClassifier from sklearn.tree import DecisionTreeClassifier from imbalanced_ensemble.ensemble import EasyEnsembleClassifier from collections import Counter

    X, y = make_classification(n_classes=2, class_sep=2, weights=[0.1, 0.9], n_informative=3, n_redundant=1, flip_y=0, n_features=20, n_clusters_per_class=1, n_samples=1000, random_state=10) print('Original dataset shape %s' % Counter(y))

    Original dataset shape Counter({{1: 900, 0: 100}})

    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0) bbc = EasyEnsembleClassifier(random_state=42) bbc.fit(X_train, y_train) EasyEnsembleClassifier(...) y_pred = bbc.predict(X_test) print(y_pred)

    Traceback (most recent call last): File "C:/Users/Administrator/PycharmProjects/pythonProject5/test-easy.py", line 16, in bbc.fit(X_train, y_train) File "C:\Users\Administrator\PycharmProjects\pythonProject5\venv\lib\site-packages\imbalanced_ensemble\utils_validation.py", line 602, in inner_f return f(**kwargs) File "C:\Users\Administrator\PycharmProjects\pythonProject5\venv\lib\site-packages\imbalanced_ensemble\ensemble\under_sampling\easy_ensemble.py", line 275, in fit return self._fit(X, y, File "C:\Users\Administrator\PycharmProjects\pythonProject5\venv\lib\site-packages\imbalanced_ensemble\utils_validation.py", line 602, in inner_f return f(**kwargs) File "C:\Users\Administrator\PycharmProjects\pythonProject5\venv\lib\site-packages\imbalanced_ensemble\ensemble_bagging.py", line 359, in fit n_samples, self.n_features = X.shape AttributeError: can't set attribute

    bug 
    opened by leaphan 8
  • EasyEnsembleClassifier็”จไธไบ†ไบ†

    EasyEnsembleClassifier็”จไธไบ†ไบ†

    ๆ นๆฎไฝ ็š„ๅœจ่ฟ™ๅ„ฟhttps://imbalanced-ensemble.readthedocs.io/en/latest/auto_examples/classification/plot_digits.html ็š„ไปฃ็ ๏ผŒๅฐ†ๅˆ†็ฑปๅ™จๆ”นๆˆEasyEnsembleClassifierๅฏไปฅๅค็Žฐ่ฟ™ไธช้—ฎ้ข˜๏ผŒไผšๅ‡บ็Žฐ๏ผš image AttributeError: can't set attribute่ฟ™ไธช้—ฎ้ข˜ใ€‚

    bug 
    opened by hannanhtang 7
  • ENH add early_termination control for boosting-based methods

    ENH add early_termination control for boosting-based methods

    The early termination in sklearn.ensemble.AdaBoostClassifier may be too strict under certain scenarios (only 1 base classifier is trained), which greatly hinders the performance of boosting-based ensemble imbalanced learning methods.

    It should make more sense to add a parameter that allows the user to decide whether to enable strict early termination.

    enhancement 
    opened by ZhiningLiu1998 2
  • [BUG] Bagging-based methods do not work with base clf that do not support sample_weight

    [BUG] Bagging-based methods do not work with base clf that do not support sample_weight

    Resampling + Bagging clf (e.g., OverBagging) raises error when used with base estimators that do not support sample_weight (e.g., sklearn.KNeighborsClassifier).

    opened by ZhiningLiu1998 2
Owner
Zhining Liu
M.Sc. student at Jilin University.
Zhining Liu
Grow Function: Generate 3D Stacked Bifurcating Double Deep Cellular Automata based organisms which differentiate using a Genetic Algorithm...

Grow Function: A 3D Stacked Bifurcating Double Deep Cellular Automata which differentiates using a Genetic Algorithm... TLDR;High Def Trees that you can mint as NFTs on Solana

Nathaniel Gibson 4 Oct 08, 2022
Reading list for research topics in Masked Image Modeling

awesome-MIM Reading list for research topics in Masked Image Modeling(MIM). We list the most popular methods for MIM, if I missed something, please su

ligang 231 Dec 07, 2022
fcn by tensorflow

Update An example on how to integrate this code into your own semantic segmentation pipeline can be found in my KittiSeg project repository. tensorflo

9 May 22, 2022
PyTorch implementation for 3D human pose estimation

Towards 3D Human Pose Estimation in the Wild: a Weakly-supervised Approach This repository is the PyTorch implementation for the network presented in:

Xingyi Zhou 579 Dec 22, 2022
IPATool-py: download ipa easily

IPATool-py Python version of IPATool! Installation pip3 install -r requirements.txt Usage Quickstart: download app with specific bundleId into DIR: p

159 Dec 30, 2022
โœ‚๏ธ EyeLipCropper is a Python tool to crop eyes and mouth ROIs of the given video.

EyeLipCropper EyeLipCropper is a Python tool to crop eyes and mouth ROIs of the given video. The whole process consists of three parts: frame extracti

Zi-Han Liu 9 Oct 25, 2022
This repository contains codes of ICCV2021 paper: SO-Pose: Exploiting Self-Occlusion for Direct 6D Pose Estimation

SO-Pose This repository contains codes of ICCV2021 paper: SO-Pose: Exploiting Self-Occlusion for Direct 6D Pose Estimation This paper is basically an

shangbuhuan 52 Nov 25, 2022
Implementation of ReSeg using PyTorch

Implementation of ReSeg using PyTorch ReSeg: A Recurrent Neural Network-based Model for Semantic Segmentation Pascal-Part Annotations Pascal VOC 2010

Onur Kaplan 46 Nov 23, 2022
[NeurIPS 2021] A weak-shot object detection approach by transferring semantic similarity and mask prior.

TransMaS This repository is the official pytorch implementation of the following paper: NIPS2021 Mixed Supervised Object Detection by TransferringMask

BCMI 49 Jul 27, 2022
Mall-Customers-Segmentation - Customer Segmentation Using K-Means Clustering

Overview Customer Segmentation is one the most important applications of unsupervised learning. Using clustering techniques, companies can identify th

NelakurthiSudheer 2 Jan 03, 2022
Deep Learning for Human Part Discovery in Images - Chainer implementation

Deep Learning for Human Part Discovery in Images - Chainer implementation NOTE: This is not official implementation. Original paper is Deep Learning f

Shintaro Shiba 63 Sep 25, 2022
The World of an Octopus: How Reporting Bias Influences a Language Model's Perception of Color

The World of an Octopus: How Reporting Bias Influences a Language Model's Perception of Color Overview Code and dataset for The World of an Octopus: H

1 Nov 13, 2021
iBOT: Image BERT Pre-Training with Online Tokenizer

Image BERT Pre-Training with iBOT Official PyTorch implementation and pretrained models for paper iBOT: Image BERT Pre-Training with Online Tokenizer.

Bytedance Inc. 435 Jan 06, 2023
Code for the paper Hybrid Spectrogram and Waveform Source Separation

Demucs Music Source Separation This is the 3rd release of Demucs (v3), featuring hybrid source separation. For the waveform only Demucs (v2): Go this

Meta Research 4.8k Jan 04, 2023
Efficient Deep Learning Systems course

Efficient Deep Learning Systems This repository contains materials for the Efficient Deep Learning Systems course taught at the Faculty of Computer Sc

Max Ryabinin 173 Dec 29, 2022
This is an unofficial implementation of the paper โ€œStudent-Teacher Feature Pyramid Matching for Unsupervised Anomaly Detectionโ€.

This is an unofficial implementation of the paper โ€œStudent-Teacher Feature Pyramid Matching for Unsupervised Anomaly Detectionโ€.

haifeng xia 32 Oct 26, 2022
[CVPR 2022] Structured Sparse R-CNN for Direct Scene Graph Generation

Structured Sparse R-CNN for Direct Scene Graph Generation Our paper Structured Sparse R-CNN for Direct Scene Graph Generation has been accepted by CVP

Multimedia Computing Group, Nanjing University 44 Dec 23, 2022
PyTorch code for the paper "Curriculum Graph Co-Teaching for Multi-target Domain Adaptation" (CVPR2021)

PyTorch code for the paper "Curriculum Graph Co-Teaching for Multi-target Domain Adaptation" (CVPR2021) This repo presents PyTorch implementation of M

Evgeny 79 Dec 19, 2022
So-ViT: Mind Visual Tokens for Vision Transformer

So-ViT: Mind Visual Tokens for Vision Transformer โ€ƒโ€ƒโ€ƒโ€ƒโ€ƒโ€ƒ Introduction This repository contains the source code under PyTorch framework and models trai

Jiangtao Xie 44 Nov 24, 2022
No-Reference Image Quality Assessment via Transformers, Relative Ranking, and Self-Consistency

This repository contains the implementation for the paper: No-Reference Image Quality Assessment via Transformers, Relative Ranking, and Self-Consiste

Alireza Golestaneh 75 Dec 30, 2022