Hummingbird compiles trained ML models into tensor computations for faster inference.

Overview

Hummingbird


Introduction

Hummingbird is a library for compiling trained traditional ML models into tensor computations. It allows users to seamlessly leverage neural network frameworks (such as PyTorch) to accelerate traditional ML models. Thanks to Hummingbird, users can benefit from: (1) all current and future optimizations implemented in neural network frameworks; (2) native hardware acceleration; (3) a single platform that supports both traditional and neural network models; and (4) all of this without having to re-engineer their models.

Currently, you can use Hummingbird to convert your trained traditional ML models into PyTorch, TorchScript, ONNX, and TVM. Hummingbird supports a variety of ML models and featurizers, including scikit-learn Decision Trees and Random Forests, as well as LightGBM and XGBoost classifiers/regressors. Support for other neural network backends and models is on our roadmap.

Hummingbird also provides a convenient uniform "inference" API that follows the scikit-learn API. This allows swapping scikit-learn models with Hummingbird-generated ones without having to change the inference code. Converting models to PyTorch or TorchScript also makes it possible to serve them with TorchServe.
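As a minimal sketch of this drop-in property (assuming a fitted scikit-learn model skl_model and a NumPy array X, both hypothetical; note that the torchscript backend also takes sample input, since the model is traced):

from hummingbird.ml import convert

hb_model = convert(skl_model, 'pytorch')

# Same call on either object; no inference-code changes needed
skl_pred = skl_model.predict(X)
hb_pred = hb_model.predict(X)

# TorchScript conversion traces the model over a sample input
ts_model = convert(skl_model, 'torchscript', X[:100])
ts_pred = ts_model.predict(X)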

How Hummingbird Works

Hummingbird works by reconfiguring algorithmic operators so that we can perform more regular computations that are amenable to vectorized and GPU execution. Each operator is slightly different, and we incorporate multiple strategies. The example below explains one of Hummingbird's strategies for translating a decision tree into tensor operations based on GEMM (GEneral Matrix Multiplication), where tree traversal is implemented as a sequence of matrix multiplications. (GEMM is one of the three tree-conversion strategies we currently support.)


Simple decision tree

In this example, the decision tree has four decision nodes (orange) and five leaf nodes (blue). The tree takes a feature vector with five elements as input. For example, suppose we want to compute the tree's output for a given input observation:

Step 1: Multiply the input tensor by tensor A (computed from the decision tree model above), which captures the relationship between input features and internal nodes. Then compare the result with tensor B, which holds the threshold value of each internal node (orange), to produce the input path tensor, representing which internal-node conditions the input satisfies. In this case, the tree has 4 internal nodes and the input vector has 5 features, so tensor A has shape 5x4 and tensor B has shape 1x4.

Step 2: Multiply the input path tensor by tensor C, which captures whether each internal node is an ancestor of each leaf node and, if so, whether the leaf lies in its left or right sub-tree (left = 1, right = -1, otherwise = 0). Then compare the result with tensor D, which holds, for each leaf, the number of left-child edges on the path from that leaf to the root, to produce the output path tensor, which selects the leaf reached by the input. In this case, the tree has 5 leaves and 4 internal nodes, so tensor C has shape 4x5 and tensor D has shape 1x5.

Step 3: Multiply the output path tensor by tensor E, which maps each leaf node to its output value, to obtain the final prediction. In this case, the tree has 5 leaves, so tensor E has shape 5x1.

And now Hummingbird has compiled a tree-based model using the GEMM strategy! For more details, please see Figure 3 of our paper.
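To make the three steps concrete, here is a minimal NumPy sketch of the GEMM strategy for a hypothetical tree of this shape (4 internal nodes, 5 leaves, 5 input features). The tree structure, thresholds, and leaf values below are illustrative assumptions, not the exact tensors Hummingbird generates:

import numpy as np

# Hypothetical tree: N0 tests x2 < 0.5, N1 tests x0 < 0.3,
# N2 tests x1 < 0.6, N3 tests x4 < 0.7; leaves are L0..L4.

# A (5 features x 4 nodes): A[i, j] = 1 if internal node j tests feature i
A = np.zeros((5, 4))
A[2, 0] = A[0, 1] = A[1, 2] = A[4, 3] = 1
B = np.array([0.5, 0.3, 0.6, 0.7])    # thresholds of N0..N3 (1x4)

# C (4 nodes x 5 leaves): 1 if the leaf is in the node's left sub-tree,
# -1 if in its right sub-tree, 0 if the node is not an ancestor of the leaf
C = np.array([[ 1,  1,  1, -1, -1],
              [ 1, -1, -1,  0,  0],
              [ 0,  0,  0,  1, -1],
              [ 0,  1, -1,  0,  0]])
D = np.array([2, 2, 1, 1, 0])         # left-child edges on each leaf's root path (1x5)
E = np.array([[0.1], [0.2], [0.3], [0.4], [0.5]])  # leaf output values (5x1)

x = np.array([[0.1, 0.9, 0.2, 0.0, 0.8]])  # one input observation (1x5)

input_path = (x @ A) < B              # Step 1: which node conditions hold
output_path = (input_path @ C) == D   # Step 2: one-hot vector over the leaves
prediction = output_path @ E          # Step 3: read out the leaf value
print(prediction)                     # [[0.1]] -- the input reaches leaf L0

Because every step is a dense matrix operation, many inputs can be evaluated in one batched GEMM, which is what makes this formulation amenable to vectorized and GPU execution.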

Thank you to Chien Vu for contributing the graphics and descriptions in his blog for this example!

Installation

Hummingbird was tested on Python >= 3.6 on Linux, Windows, and macOS machines. (Python 3.5 is supported up to hummingbird-ml==0.2.1.) It is recommended to use a virtual environment (see the python3 venv doc or Using Python environments in VS Code).

Hummingbird requires PyTorch >= 1.4.0. Please go here for instructions on how to install PyTorch based on your platform and hardware.

Once PyTorch is installed, you can get Hummingbird from pip with:

pip install hummingbird-ml

If you require the optional dependencies lightgbm and xgboost, you can use:

pip install hummingbird-ml[extra]
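To check that everything is wired up, a quick smoke test (a minimal sketch) is to import both packages:

# Both imports should succeed once PyTorch and Hummingbird are installed
import torch
import hummingbird.ml

print("torch", torch.__version__)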

See also Troubleshooting for common problems.

Examples

See the notebooks section for examples that demonstrate use and speedups.

In general, Hummingbird syntax is very intuitive and minimal. To run your traditional ML model on DNN frameworks, you only need to import hummingbird.ml and add convert(model, 'dnn_framework') to your code. Below is an example using a scikit-learn random forest model with PyTorch as the target framework.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from hummingbird.ml import convert, load

# Create some random data for binary classification
num_classes = 2
X = np.random.rand(100000, 28)
y = np.random.randint(num_classes, size=100000)

# Create and train a model (scikit-learn RandomForestClassifier in this case)
skl_model = RandomForestClassifier(n_estimators=10, max_depth=10)
skl_model.fit(X, y)

# Use Hummingbird to convert the model to PyTorch
model = convert(skl_model, 'pytorch')

# Run predictions on CPU
model.predict(X)

# Run predictions on GPU
model.to('cuda')
model.predict(X)

# Save the model
model.save('hb_model')

# Load the model back
model = load('hb_model')

Documentation

The API documentation is here.

You can also read about Hummingbird in our blog post here.

For more details on the vision and on the technical details related to Hummingbird, please check our papers.

Contributing

We welcome contributions! Please see the guide on Contributing.

Also, see our roadmap of planned features.

Community

Join our community on Gitter!

For more formal enquiries, you can contact us.

Authors

  • Supun Nakandala
  • Matteo Interlandi
  • Karla Saur

License

MIT License

Comments
  • [WIP] Add sklearn's HistGradientBoosting

    Closes #64:

    • This PR adds sklearn's HistGradientBoostingClassifier to hummingbird.

    • It modifies the two functions convert_sklearn_gbdt_classifier() and get_parameters_for_sklearn_common() to handle the HistGradientBoostingClassifier model attributes.

    • It also updates the documentation to include the new HistGradientBoostingClassifier.

    opened by ahmedkrmn 27
  • Support lightgbm >= 3

    setup.py requires lightgbm < 3; I'm using the current version of lightgbm, so hummingbird reports that it's not installed. Is there any blocking incompatibility, or is it just that the requirement version is out of date?

    opened by memeplex 18
  • AttributeError: 'XGBClassifier' object has no attribute 'raw_operator'

    Code:

    hummingmodel = hummingbird.ml.operator_converters.xgb.convert_sklearn_xgb_classifier(model, 'pytorch',extra_config={"n_features":18})

    Error:

    ---------------------------------------------------------------------------
    AttributeError                            Traceback (most recent call last)
    /var/folders/f2/9tbmpg411hndwc482xn850br0000gn/T/ipykernel_2889/1708670718.py in <module>
          1 # Use Hummingbird to convert the model to PyTorch
    ----> 2 hummingmodel = hummingbird.ml.operator_converters.xgb.convert_sklearn_xgb_classifier(model, 'pytorch',extra_config={"n_features":18})
    
    ~/opt/anaconda3/lib/python3.9/site-packages/hummingbird/ml/operator_converters/xgb.py in convert_sklearn_xgb_classifier(operator, device, extra_config)
        102              Please pass "n_features:N" as extra configuration to the converter or fill a bug report.'
        103         )
    --> 104     tree_infos = operator.raw_operator.get_booster().get_dump()
        105     n_classes = operator.raw_operator.n_classes_
        106 
    
    AttributeError: 'XGBClassifier' object has no attribute 'raw_operator'
    

    XGB version: 1.6.1, Hummingbird version: 0.4.7

    Any idea about this issue? What other configurations are required to make this work?

    opened by dintellect 17
  • Add float64 support in hummingbird

    Addresses issue #51

    This PR:

    • Add support for float64 input data based on discussion in issue #51

    It appears PyTorch expects both inputs and model params (weights) to be of the same data type. The weights are in float32. In this failure case, the input is of float64 type. In order to have the same type, it is easier (and computationally more efficient) to downcast the inputs.

    • Add a few tests around the same change.

    Pending:

    • I did validate other supported operators as well (Sklearn GBDT, HGBDT, XGB, etc.). Should I add tests for them as well?
    • I feel we can update the notebooks (and other examples) as well, removing the casting from float64 to float32. This makes the examples simpler and less mysterious. Should I do that?

    Validated that we can now use both float32 and float64 data.

    opened by KranthiGV 16
  • Fixing test flakiness

    Hi,

    The test test_tree_regressors_multioutput_regression is flaky: it failed 41 out of the 3000 times I ran it. It seems the absolute difference can be much higher than the current threshold (1e-5).

    Empirically, I observed a maximum absolute difference of up to 3.7. Based on my experiments, the 99th percentile seems to be close to 4.5, so I set the bound accordingly.

    Please let me know if this makes sense. I am assuming there are no bugs in this case. I would be happy to incorporate any other suggestions that you may have.

    Thanks!

    opened by sleepy-owl 14
  • pip install of Version 0.2.0 doesn't work

    I tried to install the new hummingbird 0.2.0 release in a new venv, and the installation failed because the hummingbird wheel specifies a pytorch version that seems to have a broken wheel, or the hummingbird wheel is missing the full pytorch installation command (which is: pip install torch===1.7.0 torchvision===0.8.1 torchaudio===0.7.0 -f https://download.pytorch.org/whl/torch_stable.html). It's also not possible to get this exact pytorch install command from the official pytorch website, which only offers the 1.7.1 and 1.6.0 versions of pytorch, so it's not possible to install the required pytorch version by just looking at the pytorch website.

    Right now you have to find the pip install torch===1.7.0 torchvision===0.8.1 torchaudio===0.7.0 -f https://download.pytorch.org/whl/torch_stable.html command somewhere on the internet and install this specific version before you can install hummingbird-ml.

    Collecting hummingbird-ml
      Downloading hummingbird_ml-0.2.0-py2.py3-none-any.whl (151 kB)
    Collecting numpy<=1.19.4,>=1.15
      Using cached numpy-1.19.4-cp37-cp37m-win_amd64.whl (12.9 MB)
    Collecting onnxconverter-common<=1.7.0,>=1.6.0
      Using cached onnxconverter_common-1.7.0-py2.py3-none-any.whl (64 kB)
    Collecting scikit-learn<=0.23.2,>=0.21.3
      Using cached scikit_learn-0.23.2-cp37-cp37m-win_amd64.whl (6.8 MB)
    Collecting joblib>=0.11
      Downloading joblib-1.0.0-py3-none-any.whl (302 kB)
    Collecting scipy>=0.19.1
      Downloading scipy-1.6.0-cp37-cp37m-win_amd64.whl (32.5 MB)
    Collecting threadpoolctl>=2.0.0
      Using cached threadpoolctl-2.1.0-py3-none-any.whl (12 kB)
    Collecting torch<=1.7.0,>=1.4.*
      Downloading torch-1.7.0-cp37-cp37m-win_amd64.whl (184.0 MB)
    
    ERROR: torch has an invalid wheel, .dist-info directory not found
    
    opened by speedfreakw 13
  • float64 issue

    At the moment, HB only works with float32; you must cast float64 to float32 for correct results (otherwise you will get an error with the gemm algorithm). We need to fix this.

    Ex: in scikit-learn-random-forest-example.ipynb we must cast X as follows: X = X[0:nrows].astype('|f4')

    opened by ksaur 13
  • Add delete location bool param to PyTorchContainer load() call

    What? Adds a delete_unzip_location_folder param to the load() method of PyTorchSklearnContainer to avoid implicit deletion of the model artifact supplied to load(). The changes in this PR are backward compatible.

    Why? Related issue: https://github.com/microsoft/hummingbird/issues/557

    opened by akshjain83 12
  • Performance Issues Using the TVM Backend

    I am running inference on an XGBoost model using Hummingbird on a desktop CPU target. I've installed PyTorch, Hummingbird, and TVM in a conda environment (TVM was built from source and linked to llvm-10). I have trained models serialized as XGBoost JSON files and am creating XGBoost sklearn objects from them. I am able to compile these models using Hummingbird and run them. My Python code to run on PyTorch is as follows:

    xgb_model = xgb.XGBRegressor()
    xgb_model.load_model(model_json)
    hb_model = convert(xgb_model, 'pytorch')
    # ...
    pred = hb_model.predict(batch)

    My Python code to compile the model using the TVM backend is the following:

    xgb_model = xgb.XGBRegressor()
    xgb_model.load_model(model_json)
    hb_model = convert(xgb_model, 'tvm', test_input=inputs[0:batch_size])
    # ...
    pred = hb_model.predict(batch)

    Is this the right way to compile models (especially using TVM)? Do I need to enable any other TVM features (e.g. auto-tuning) through the Hummingbird API? For my models, inference performance with the PyTorch and TVM backends is roughly the same, which makes me think I may be using the TVM backend wrong. I was able to verify that the predict methods compute the correct predictions.

    opened by asprasad 11
  • KernelPCA + PyTorch

    Hello, I'm trying to utilize the GPU with the PyTorch backend to speed up a KernelPCA operation. However, when I convert to PyTorch, it ends up taking about 9x longer to run the .transform() function. Additionally, I'm not seeing any GPU utilization at all.

    • Sklearn: 0.8 seconds
    • PyTorch + CPU: 7.8 seconds
    • PyTorch + GPU (supposedly, but again, not seeing any GPU utilization): 7.9 seconds

    Would it be possible for you to look into this? I have already checked that CUDA is configured correctly with torch.cuda.is_available(). Thanks!

    bug 
    opened by carolinemckee 11
  • CUDA out of memory

    Hi there,

    When testing Hummingbird on GPU with more and deeper trees (it worked fine with fewer trees), I got an OOM error:

    "CUDA out of memory. Tried to allocate 5.96 GiB (GPU 0; 11.91 GiB total capacity; 6.47 GiB already allocated; 4.74 GiB free; 6.48 GiB reserved in total by PyTorch)"

    I'm using:

    • CUDA 10.1, Pytorch 1.5.1+cu101
    • RandomForestRegressor from sklearn
    • total number of samples: 800k
    • number of features: 100
    • number of trees: 1000
    • average tree depth: 15
    • average number of nodes per tree: 3700

    Is it expected? Any idea how to make it work for more and deeper trees?

    Thanks!

    opened by zhanjiezhu 11
  • MissingConverter: Unable to find converter <class 'sklearn.preprocessing._encoders.OrdinalEncoder'>

    ---------------------------------------------------------------------------
    MissingConverter                          Traceback (most recent call last)
    /var/folders/f2/9tbmpg411hndwc482xn850br0000gn/T/ipykernel_27005/3005074338.py in <module>
    ----> 1 hb_model = convert(clf, 'torch',X_train[0:1])
    
    ~/opt/anaconda3/lib/python3.9/site-packages/hummingbird/ml/convert.py in convert(model, backend, test_input, device, extra_config)
        442     """
        443     assert constants.REMAINDER_SIZE not in extra_config
    --> 444     return _convert_common(model, backend, test_input, device, extra_config)
        445 
        446 
    
    ~/opt/anaconda3/lib/python3.9/site-packages/hummingbird/ml/convert.py in _convert_common(model, backend, test_input, device, extra_config)
        403         return _convert_sparkml(model, backend_formatted, test_input, device, extra_config)
        404 
    --> 405     return _convert_sklearn(model, backend_formatted, test_input, device, extra_config)
        406 
        407 
    
    ~/opt/anaconda3/lib/python3.9/site-packages/hummingbird/ml/convert.py in _convert_sklearn(model, backend, test_input, device, extra_config)
        106     # We modify the scikit learn model during translation.
        107     model = deepcopy(model)
    --> 108     topology = parse_sklearn_api_model(model, extra_config)
        109 
        110     # Convert the Topology object into a PyTorch model.
    
    ~/opt/anaconda3/lib/python3.9/site-packages/hummingbird/ml/_parse.py in parse_sklearn_api_model(model, extra_config)
         63     # Parse the input scikit-learn model into a topology object.
         64     # Get the outputs of the model.
    ---> 65     outputs = _parse_sklearn_api(topology, model, inputs)
         66 
         67     # Declare output variables.
    
    ~/opt/anaconda3/lib/python3.9/site-packages/hummingbird/ml/_parse.py in _parse_sklearn_api(topology, model, inputs)
        228     tmodel = type(model)
        229     if tmodel in sklearn_api_parsers_map:
    --> 230         outputs = sklearn_api_parsers_map[tmodel](topology, model, inputs)
        231     else:
        232         outputs = _parse_sklearn_single_model(topology, model, inputs)
    
    ~/opt/anaconda3/lib/python3.9/site-packages/hummingbird/ml/_parse.py in _parse_sklearn_pipeline(topology, model, inputs)
        274     """
        275     for step in model.steps:
    --> 276         inputs = _parse_sklearn_api(topology, step[1], inputs)
        277     return inputs
        278 
    
    ~/opt/anaconda3/lib/python3.9/site-packages/hummingbird/ml/_parse.py in _parse_sklearn_api(topology, model, inputs)
        228     tmodel = type(model)
        229     if tmodel in sklearn_api_parsers_map:
    --> 230         outputs = sklearn_api_parsers_map[tmodel](topology, model, inputs)
        231     else:
        232         outputs = _parse_sklearn_single_model(topology, model, inputs)
    
    ~/opt/anaconda3/lib/python3.9/site-packages/hummingbird/ml/_parse.py in _parse_sklearn_column_transformer(topology, model, inputs)
        451                 )
        452         else:
    --> 453             var_out = _parse_sklearn_api(topology, model_obj, transform_inputs)[0]
        454             if model.transformer_weights is not None and name in model.transformer_weights:
        455                 # Create a Multiply node
    
    ~/opt/anaconda3/lib/python3.9/site-packages/hummingbird/ml/_parse.py in _parse_sklearn_api(topology, model, inputs)
        230         outputs = sklearn_api_parsers_map[tmodel](topology, model, inputs)
        231     else:
    --> 232         outputs = _parse_sklearn_single_model(topology, model, inputs)
        233 
        234     return outputs
    
    ~/opt/anaconda3/lib/python3.9/site-packages/hummingbird/ml/_parse.py in _parse_sklearn_single_model(topology, model, inputs)
        250         raise RuntimeError("Parameter model must be an object not a " "string '{0}'.".format(model))
        251 
    --> 252     alias = get_sklearn_api_operator_name(type(model))
        253     this_operator = topology.declare_logical_operator(alias, model)
        254     this_operator.inputs = inputs
    
    ~/opt/anaconda3/lib/python3.9/site-packages/hummingbird/ml/supported.py in get_sklearn_api_operator_name(model_type)
        463     """
        464     if model_type not in sklearn_api_operator_name_map:
    --> 465         raise MissingConverter("Unable to find converter for model type {}.".format(model_type))
        466     return sklearn_api_operator_name_map[model_type]
        467 
    
    MissingConverter: Unable to find a converter for model type <class 'sklearn.preprocessing._encoders.OrdinalEncoder'>.
    It usually means the pipeline being converted contains a
    transformer or a predictor with no corresponding converter implemented.
    Please fill an issue at https://github.com/microsoft/hummingbird.
    

    Which scikit-learn pipeline operators does Hummingbird support?

    enhancement 
    opened by dintellect 3
  • FLOPs counting for the converted model

    Could you please share some suggestions on FLOPs counting for the converted model?

    I have tried:

    • thop: https://github.com/Lyken17/pytorch-OpCounter
    • flop-counter: https://github.com/sovrasov/flops-counter.pytorch
    • pthflops: https://github.com/1adrianb/pytorch-estimate-flops
    • torchprofile: https://github.com/zhijian-liu/torchprofile
    • deepspeed: https://www.deepspeed.ai/tutorials/flops-profiler/

    It seems none of them supports FLOPs calculation for the converted models; any advice would be highly appreciated, thanks!

    opened by ChuniHiro 1
  • Kernel crashing while converting

    RF was previously defined.

    # Can crash sometimes
    # Convert model to pytorch with Hummingbird-ml
    RFconv = convert(RF, 'pytorch')
    
    # Save Model Converted
    RFconv.save(os.path.join(ResultsFolder, "RFmodel_V3"))
    

    When I run this on macOS it crashes the Jupyter kernel; this doesn't happen on Windows.

    opened by EmanuelCastanho 4
  • Need new test dataset for xgb

    Build workflow run

    ==================================== ERRORS ====================================
    _______________ ERROR collecting tests/test_xgboost_converter.py _______________
    ImportError while importing test module '/home/runner/work/hummingbird/hummingbird/tests/test_xgboost_converter.py'.
    Hint: make sure your test modules/packages have valid Python names.
    Traceback:
    /opt/hostedtoolcache/Python/3.9.15/x64/lib/python3.9/importlib/__init__.py:127: in import_module
        return _bootstrap._gcd_import(name[level:], package, level)
    tests/test_xgboost_converter.py:8: in <module>
        from sklearn.datasets import load_boston
    /opt/hostedtoolcache/Python/3.9.15/x64/lib/python3.9/site-packages/sklearn/datasets/__init__.py:156: in __getattr__
        raise ImportError(msg)
    E   ImportError: 
    E   `load_boston` has been removed from scikit-learn since version 1.2.
    
    opened by ksaur 0
  • Should SKLearn operators be assumed to produce a single output?

    See https://github.com/microsoft/hummingbird/blob/main/hummingbird/ml/_parse.py#L256

    Consider models which implement both predict and predict_proba functions; these return both labels and probabilities as outputs. The current logic means that we cannot name the outputs in the Hummingbird conversion step (i.e., with the output_names argument to extra_config) and instead have to perform some ONNX graph surgery afterwards.

    opened by stillmatic 0
Releases (v0.4.7)
  • v0.4.7(Nov 29, 2022)

    What's Changed

    • This patch release fixes a bug in ONNX conversion and adds support for varying batch sizes by @stillmatic in https://github.com/microsoft/hummingbird/pull/654
    • Fixes deprecations in https://github.com/microsoft/hummingbird/pull/655

    New Contributors

    • Thank you to @stillmatic for catching and fixing this bug (https://github.com/microsoft/hummingbird/issues/653) and for the additional maintenance work!

    Full Changelog: https://github.com/microsoft/hummingbird/compare/v0.4.6...v0.4.7

    Source code(tar.gz)
    Source code(zip)
    hummingbird-ml-0.4.7.tar.gz(533.74 KB)
    hummingbird_ml-0.4.7-py2.py3-none-any.whl(158.37 KB)
  • v0.4.6(Nov 10, 2022)

    What's Changed

    Features

    • Adds Lasso, Ridge and ElasticNet by @fd0r in https://github.com/microsoft/hummingbird/pull/625
    • Added support for more decision conditions in trees and ONNX conversion by @grafail in https://github.com/microsoft/hummingbird/pull/631
    • Add support for Tweedie, Poisson and Gamma regressors by @interesaaat in https://github.com/microsoft/hummingbird/pull/650

    Maintenance and fixes

    • Remove pinned version for onnxconverter-common by @interesaaat in https://github.com/microsoft/hummingbird/pull/618
    • Bump action/cache to v3 by @mshr-h in https://github.com/microsoft/hummingbird/pull/619
    • Fix deprecation warnings for sklearn/scipy by @mshr-h in https://github.com/microsoft/hummingbird/pull/610
    • deprecating email due to spammers.... by @ksaur in https://github.com/microsoft/hummingbird/pull/621
    • updating ubuntu version by @ksaur in https://github.com/microsoft/hummingbird/pull/623
    • Fix linear models conversion when fit_intercept set to False by @RomanBredehoft in https://github.com/microsoft/hummingbird/pull/630
    • Corrects some typos by @RomanBredehoft in https://github.com/microsoft/hummingbird/pull/628
    • allow derived types of DataFrame by @liangfu in https://github.com/microsoft/hummingbird/pull/637
    • [TVM] Unify load params interface by @liangfu in https://github.com/microsoft/hummingbird/pull/640
    • Update TVM to 0.10 by @mshr-h in https://github.com/microsoft/hummingbird/pull/642
    • Update to pytorch 1.13 by @interesaaat in https://github.com/microsoft/hummingbird/pull/646
    • Update actions/checkout and actions/setup-python by @mshr-h in https://github.com/microsoft/hummingbird/pull/647
    • update codecov/codecov-action to v3 by @mshr-h in https://github.com/microsoft/hummingbird/pull/648

    New Contributors

    • @fd0r made their first contribution in https://github.com/microsoft/hummingbird/pull/625
    • @RomanBredehoft made their first contribution in https://github.com/microsoft/hummingbird/pull/630
    • @grafail made their first contribution in https://github.com/microsoft/hummingbird/pull/631
    • @liangfu made their first contribution in https://github.com/microsoft/hummingbird/pull/637

    Special Thanks

    • Thank you to @mshr-h for the continued support!!

    Full Changelog: https://github.com/microsoft/hummingbird/compare/v0.4.5...v0.4.6

    Source code(tar.gz)
    Source code(zip)
    hummingbird-ml-0.4.6.tar.gz(532.37 KB)
    hummingbird_ml-0.4.6-py2.py3-none-any.whl(158.58 KB)
  • v0.4.5(Aug 5, 2022)

    What's Changed

    • Update _decomposition_implementations.py by @interesaaat in https://github.com/microsoft/hummingbird/pull/578
    • revise example in readme by @vumichien in https://github.com/microsoft/hummingbird/pull/579
    • Fix problem with new onnx and protobuf by @interesaaat in https://github.com/microsoft/hummingbird/pull/582
    • Bump TVM to v0.8 by @mshr-h in https://github.com/microsoft/hummingbird/pull/581
    • Fix the things broken by SKL==1.1.1 by @ksaur in https://github.com/microsoft/hummingbird/pull/588
    • Use onnxmltools>=1.6.0,<=1.11.0 instead of onnxmltools>=1.6.0 by @mshr-h in https://github.com/microsoft/hummingbird/pull/592
    • Deprecating Python3.7; updating sklearn-onnx version by @ksaur in https://github.com/microsoft/hummingbird/pull/593
    • deprecate torch1.7, push to macOS11 by @ksaur in https://github.com/microsoft/hummingbird/pull/594
    • Fix doc gen by @ksaur in https://github.com/microsoft/hummingbird/pull/596
    • Fix broken link by @mshr-h in https://github.com/microsoft/hummingbird/pull/599
    • Fixing documentation generation bug by @ksaur in https://github.com/microsoft/hummingbird/pull/598
    • Update Dockerfile by @mshr-h in https://github.com/microsoft/hummingbird/pull/601
    • Use a Microsoft compliant image for docker by @interesaaat in https://github.com/microsoft/hummingbird/pull/602
    • testing torch==1.12 by @ksaur in https://github.com/microsoft/hummingbird/pull/603
    • Use n_features_in_ instead of n_features_ by @mshr-h in https://github.com/microsoft/hummingbird/pull/604
    • Use python -m pip instead of the pip executable by @mshr-h in https://github.com/microsoft/hummingbird/pull/605
    • check for Sklearn model NotFitted before conversion by @SangamSwadiK in https://github.com/microsoft/hummingbird/pull/607
    • Upgrade prophet to v1.1 by @mshr-h in https://github.com/microsoft/hummingbird/pull/608
    • Add Python 3.10 by @mshr-h in https://github.com/microsoft/hummingbird/pull/586
    • Bump TVM to v0.9 by @mshr-h in https://github.com/microsoft/hummingbird/pull/609
    • Pinning onnxconverter-common to avoid dep by @ksaur in https://github.com/microsoft/hummingbird/pull/614
    • Use pre-installed LLVM for building TVM on macOS-11 by @mshr-h in https://github.com/microsoft/hummingbird/pull/615

    New Contributors

    • @vumichien made their first contribution in https://github.com/microsoft/hummingbird/pull/579
    • @mshr-h made their first contribution in https://github.com/microsoft/hummingbird/pull/581
    • @SangamSwadiK made their first contribution in https://github.com/microsoft/hummingbird/pull/607

    Special thanks

    Extra special thanks to @mshr-h for the work maintaining our version dependencies and pipeline!

    Full Changelog: https://github.com/microsoft/hummingbird/compare/v0.4.4...v0.4.5

    Source code(tar.gz)
    Source code(zip)
    hummingbird-ml-0.4.5.tar.gz(531.49 KB)
    hummingbird_ml-0.4.5-py2.py3-none-any.whl(158.65 KB)
  • v0.4.4(Apr 25, 2022)

    This minor release includes bug fixes for performance and external dependency updates.

    What's Changed

    • Verified compatibility with Torch 1.11 (released March 10) by @ksaur in https://github.com/microsoft/hummingbird/pull/572
    • Fix xgboost tests for xgb > 1.6.0 by @interesaaat in https://github.com/microsoft/hummingbird/pull/576
    • Fix for KernelPCA using GPU by @interesaaat in https://github.com/microsoft/hummingbird/pull/575

    Full Changelog: https://github.com/microsoft/hummingbird/compare/v0.4.3...v0.4.4

    New Contributors

    • Thanks to @carolinemckee for the bug report on KPCA #574
    Source code(tar.gz)
    Source code(zip)
    hummingbird-ml-0.4.4.tar.gz(93.60 KB)
    hummingbird_ml-0.4.4-py2.py3-none-any.whl(177.22 KB)
  • v0.4.3(Mar 11, 2022)

    This minor release includes bug fixes and external dependency updates.

    What's Changed

    • Fixed a problem with pandas with the latest xgb models in https://github.com/microsoft/hummingbird/pull/562
    • Minor changes to tests related to verifying that new versions of torch, scikit-learn, and Python3.9 work in #554, #560
    • allow tree_implementation="gemm" with onnx backend by @jfrery in https://github.com/microsoft/hummingbird/pull/566

    New Contributors

    • @jfrery made their first contribution in https://github.com/microsoft/hummingbird/pull/566
    • @shubh0508 contributed a bug fix in https://github.com/microsoft/hummingbird/pull/568

    Full Changelog: https://github.com/microsoft/hummingbird/compare/v0.4.2...v0.4.3

    Source code(tar.gz)
    Source code(zip)
    hummingbird-ml-0.4.3.tar.gz(93.58 KB)
    hummingbird_ml-0.4.3-py2.py3-none-any.whl(177.22 KB)
  • v0.4.2(Dec 14, 2021)

    This minor release includes a new operator, some improvements, and some fixes due to external dependency updates.

    New Operator:

    • Added support for PLSRegressor (#549)

    Improvements:

    • Better delete (with location) for saved models (#558)
    • Doc updates: Installation instructions for Fedora-based distros (#543) and readme update (#550)

    External dependency management:

    • Hummingbird now works with scikit-learn==1.0.x (#545) and the current version of scipy (scipy==1.7.x) (#552)
    • Note that there is currently an open issue (#556) with Onnxruntime and torch==1.10.x on Linux/Mac that causes the tests to hang (for static length dimensions) and fail (for dynamic dimensions). This will be fixed in a subsequent release.

    Credits:

    Thanks to @akshjain83 and @bibhabasumohapatra for their contributions!

    Source code(tar.gz)
    Source code(zip)
    hummingbird-ml-0.4.2.tar.gz(93.24 KB)
    hummingbird_ml-0.4.2-py2.py3-none-any.whl(176.85 KB)
  • v0.4.1(Aug 31, 2021)

  • v0.4.0(Jun 22, 2021)

    This release includes integration with Prophet, bug/usability fixes, and versioning fixes.

    We are excited to announce trend prediction support for Prophet! See our notebook for examples.

    New features:

    • Prophet integration (trend prediction) (#519)

    Bug fixes/Usability fixes:

    • Fix several numerical precision issues in tree-based models (#511)
    • Better assertions and error messages (#514/#521)

    Versioning fixes:

    • Remove constraint on dependencies version (#523)

    Special thanks:

    • Extra special thanks to @xadupre for helping us resolve library dependency issues
    • Thanks to @pnhathuy07 for helping us clean up our asserts
    • Thanks to @scnakandala for the ongoing contributions
    Source code(tar.gz)
    Source code(zip)
    hummingbird-ml-0.4.0.tar.gz(92.41 KB)
    hummingbird_ml-0.4.0-py2.py3-none-any.whl(176.28 KB)
  • v0.3.1(Apr 24, 2021)

    This patch release includes several improvements related to load/store of a model:

    Improvements:

    • Better error messages on load/save problems (#499)
    • Auto-clean temp directory on load/save (#502)
    • Using pickle instead of dill for model load/save to enable the use of Spark broadcast to share the model (#498)

    Other Notes:

    It seems the most recent release of libomp broke the pipeline for macOS Python 3.6 and 3.7. We fixed this by pinning to an older version of libomp (#500). While not directly related to Hummingbird, this information may be useful to macOS users with older versions of Python who want to use the lightgbm Hummingbird converters.

    Credits:

    Thanks to @dbanda for the feedback and testing on our load/store.

    Source code(tar.gz)
    Source code(zip)
    hummingbird-ml-0.3.1.tar.gz(90.53 KB)
    hummingbird_ml-0.3.1-py2.py3-none-any.whl(173.68 KB)
  • v0.3.0(Apr 13, 2021)

    This release includes many new operators, version upgrades, and minor bug fixes.

    New Features:

    • ONNXML imputer (#459)
    • SKL Bagging Classifier/Regressor (#490)
    • SKL GridSearchCV and RandomizedGridSearchCV (#476)
    • SKL KMeans (#472)
    • SKL MeanShift (#473)
    • SKL Stack Classifier and Regressor (#471)
    • SKL RidgeCV and LinearSVR (#470)
    • New example notebooks (#462) (#461)

    Versioning changes:

    • Bumped numpy from 1.19.4 to <=1.20.*
    • Bumped torch to 1.8.1

    Notable bug fixes:

    • Fixed error when pandas is passed as input to predict but no test_input provided (#487)

    Credits:

    Thanks to our first-time contributor @rathijit for fixes to the README.

    Source code(tar.gz)
    Source code(zip)
    hummingbird-ml-0.3.0.tar.gz(89.54 KB)
    hummingbird_ml-0.3.0-py2.py3-none-any.whl(172.17 KB)
  • v0.2.2(Mar 9, 2021)

    This release includes several bug fixes, version upgrades, and minor feature upgrades.

    Versioning changes:

    • Upgrading to Torch 1.8.0
    • Deprecating Python 3.5

    Notable bug fixes:

    • Fix save to pytorch for onnx models with trees (#425)
    • Fix bug with degenerate trees (#426)
    • Fix bloated models when saving (#429)

    Also included are fixes to test flakiness and error messages.

    Feature upgrades:

    • Add support for modified huber loss in SGD (#415)
    • Simplify the discretizer logic (#448)

    Credits:

    Thanks to our first-time contributors @qin-xiong and @sleepy-owl

    Source code(tar.gz)
    Source code(zip)
    hummingbird-ml-0.2.2.tar.gz(86.85 KB)
    hummingbird_ml-0.2.2-py2.py3-none-any.whl(167.38 KB)
  • v0.2.1(Jan 4, 2021)

  • v0.2.0(Dec 29, 2020)

    Announcing: TVM Support

    • This release adds support for TVM (#236), giving us our fastest speeds yet!

    New Features

    • Adding save/load features for models (#399)
    • BatchContainer for batch by batch prediction use case (#377)
    • Native support for string features (#396)
    • Add batch_benchmark option to do benchmark on a single batch (#369)

    New Operators

    • Binarizer for ONNX-ML (#353)
    • Feature Vectorizer for ONNX-ML (#385)
    • Label Encoder for ONNX-ML and SKL (#374)

    Credits

    Thank you to @scnakandala and @masahi for their ongoing contributions!

    Source code(tar.gz)
    Source code(zip)
    hummingbird-ml-0.2.0.tar.gz(82.18 KB)
    hummingbird_ml-0.2.0-py2.py3-none-any.whl(147.81 KB)
  • v0.1.0(Oct 30, 2020)

    Announcing: Integration with PySpark.ML (pyspark)

    In this release, we added PySpark.ML support, which will open new doors for collaboration in the Spark space! (#310)

    So far, we support Bucketizer, VectorAssembler, and LogisticRegressionModel. We look forward to adding more operators soon!

    Announcing: Pandas Dataframes Support

    This release also adds support for Pandas Dataframes both at conversion time and inference time (#300).

    New Features

    • We added benchmarks (#328, #330, #331) from our OSDI paper
    • We also have a variety of features and improvements to the user experience:
      • Added the capability of setting number of threads to the model container (#319)
      • Removed the need for requirements on providing input schemas for ONNX models (#334)
      • Added support for ONNX models with multiple inputs (#339)
      • Added batching to output containers (#323)

    New Operators

    • scikit-learn KNeighbors Classifier (#296) and Regressor (#303)

    Credits

    Thank you to @scnakandala for SparkML and the additional scikit-learn operators! Thank you to @vumichien for contributing the README diagrams.

    Source code(tar.gz)
    Source code(zip)
    hummingbird-ml-0.1.0.tar.gz(73.08 KB)
    hummingbird_ml-0.1.0-py2.py3-none-any.whl(113.84 KB)
  • v0.0.6(Sep 10, 2020)

    This release adds a huge batch of new operators to scikit-learn, including pipeline support! It also includes bug fixes.

    New Features

    • Basic support for scikit-learn Pipelines, including FeatureUnion and ColumnTransformer (#251)

    New Operators - scikit-learn

    • Binarizer (#258)
    • KBinsDiscretizer (#285)
    • Matrix Decomposition Operators (#277)
      • FastICA
      • KernelPCA
      • PCA
      • TruncatedSVD
    • MissingIndicator (#268)
    • MLPClassifier (#260)
    • MLPRegressor (#288)
    • Other Classifiers: (#260)
      • BernoulliNB
      • GaussianNB
      • MultinomialNB
    • SelectKBest chi2 support (#262)
    • SelectPercentile (#263)
    • SimpleImputer (#267)
    • PolynomialFeatures (#269)

    Other updates:

    • We now support xgboost>=0.90 (#253)
    • Added optimized_execution to torchscript backend (#276)
    • We now allow older version of scikit-learn for compatibility reasons (#274)

    Bug fixes:

    • Logistic Regression with the lbfgs option (#261)
    • Empty trees and multiclass (#265)
    • Fix isolation forest for sklearn <= 0.21 (#290)

    Credits

    A big thank you to our contributors: @scnakandala, @zhanjiezhu. Extra big thanks to @scnakandala for all of the scikit-learn converters!

    Source code(tar.gz)
    Source code(zip)
    hummingbird-ml-0.0.6.tar.gz(48.55 KB)
    hummingbird_ml-0.0.6-py2.py3-none-any.whl(76.07 KB)
  • v0.0.5(Aug 21, 2020)

    This release adds TorchScript as a backend, removes the problematic auto-installation of pytorch, improves syntax for the ONNX converter, adds notebook enhancements, and includes third-party library version upgrades. Hummingbird now provides the same Sklearn API across backends for executing inference.

    Announcing: Integration with TorchScript

    This release adds TorchScript as a backend (#225).

    Users can convert models with:

    hummingbird.ml.convert(model, "torchscript", X)
    

    Installation changes:

    After several reports from users across multiple platforms (in terms of both OS and underlying hardware), we changed the Hummingbird installer to require users to first install pytorch before installing Hummingbird (#246). This allows users to select the right pytorch version for a specific platform and should simplify the installation process and remove issues caused by having the wrong pytorch version installed.

    ONNX API

    For ONNX, we changed the API to have a more seamless experience, allowing users to interact with ONNX models in a consistent way with other models in Hummingbird (#231).

    Instead of the user having to instantiate the ONNX runtime session:

    -        session = ort.InferenceSession(onnx_model.SerializeToString())
    -        onnx_pred = session.run(output_names, inputs)
    

    The user can now just call predict, predict_proba, transform, etc. as with other Hummingbird conversions.

    +        onnx_pred = onnx_model.predict(X)
    

    New Operators

    • OneHotEncoder - integers (#197)
    • Support for Tweedie distribution in LGBM (#242)

    Miscellaneous

    • The target opset for ONNX is now 11 (#214)
    • The target pytorch version is now 1.6.0, except for Python 3.5, where it remains at 1.5.1 for compatibility reasons (#213)
    • Docs are now auto-generated (#223)

    Credits

    Thanks to @KranthiGV for the updated LGBM ONNX notebook example

    Source code(tar.gz)
    Source code(zip)
    hummingbird_ml-0.0.5-py2.py3-none-any.whl(58.75 KB)
  • v0.0.4(Jul 21, 2020)

    This release adds several new operators to both scikit-learn and Onnx.

    New Features

    • float64 support [#186]
    • Better windows installation support [#179]

    New Operators - scikit-learn

    • IsolationForest [#191]
    • LGBMRanker [#173]
    • OneHotEncoder [#193]
    • Scaler [#171]
    • XGBRanker [#189]

    New Operators - Onnx

    • ArrayFeatureExtractor [#198]
    • Linear Classifier/Regressor [#190, #194]
    • Normalizer [#188]
    • Scaler [#196]

    Credits

    This release would not have been possible without the following contributors: @ahmedkrmn, @KranthiGV, @TuanNguyen27, @zhanjiezhu

    Source code(tar.gz)
    Source code(zip)
    hummingbird-ml-0.0.4.tar.gz(35.51 KB)
    hummingbird_ml-0.0.4-py2.py3-none-any.whl(53.82 KB)
    hummingbird_ml-0.0.4-py2.py3-none-win32.whl(99.51 KB)
    hummingbird_ml-0.0.4-py2.py3-none-win_amd64.whl(99.51 KB)
  • v0.0.3(Jun 19, 2020)

    This release adds several cool new features and bug fixes to Hummingbird!

    API Changes

    When selecting the backend to use for conversion, we renamed pytorch to torch (to match the module name). [#142]

    New Operators

    • HistGradientBoostingRegressor [#135]
    • LinearRegression [#140]
    • LinearSVC [#140]
    • LogisticRegression [#140]
    • LogisticRegressionCV [#140]
    • Normalizer [#126]

    New Features

    • A transform method is added to the PyTorch container to match the transformer API of Sklearn. [#148]
    • Support for ONNX models as input (at the moment this feature only works in combination with the lightgbm_converter in ONNXMLTOOLS) [#142]
    • Generation of ONNX models as output (at the moment this feature only works when an ONNX model is passed as input) [#142]

    Credits

    This release would not have been possible without the following contributors: @ahmedkrmn, @jspisak, and @TuanNguyen27.

    Source code(tar.gz)
    Source code(zip)
    hummingbird_ml-0.0.3-py2.py3-none-any.whl(41.47 KB)
  • v0.0.2(Jun 10, 2020)

    This release adds several new operators, an updated API, and contains several documentation fixes.

    New Operators

    • DecisionTreeRegressor [#102]
    • ExtraTreesRegressor [#91]
    • GradientBoostingRegressor [#88]
    • HistGradientBoostingClassifier [#87]

    Credits

    Special thanks to following contributors: @KranthiGV (DecisionTreeRegressor), @mmbhatk (ExtraTreesRegressor), @bfgray3 (GradientBoostingRegressor), and @ahmedkrmn (HistGradientBoostingClassifier)

    Source code(tar.gz)
    Source code(zip)
    hummingbird_ml-0.0.2-py2.py3-none-any.whl(27.06 KB)
  • v0.0.1(May 7, 2020)
