Hummingbird compiles trained ML models into tensor computation for faster inference.

Overview

Hummingbird


Introduction

Hummingbird is a library for compiling trained traditional ML models into tensor computations. Hummingbird allows users to seamlessly leverage neural network frameworks (such as PyTorch) to accelerate traditional ML models. Thanks to Hummingbird, users can benefit from: (1) all the current and future optimizations implemented in neural network frameworks; (2) native hardware acceleration; (3) a single platform that supports both traditional and neural network models; and (4) all of this without having to re-engineer their models.

Currently, you can use Hummingbird to convert your trained traditional ML models into PyTorch, TorchScript, ONNX, and TVM. Hummingbird supports a variety of ML models and featurizers. These models include scikit-learn Decision Trees and Random Forest, and also LightGBM and XGBoost Classifiers/Regressors. Support for other neural network backends and models is on our roadmap.

Hummingbird also provides a convenient uniform "inference" API following the Sklearn API. This allows swapping Sklearn models with Hummingbird-generated ones without having to change the inference code. By converting the models to PyTorch and TorchScript, it also becomes possible to serve them using TorchServe.

How Hummingbird Works

Hummingbird works by reconfiguring algorithmic operators so that they perform more regular computations, which are amenable to vectorized and GPU execution. Each operator is slightly different, and we incorporate multiple strategies. This example explains one of Hummingbird's strategies for translating a decision tree into tensors, GEMM (GEneral Matrix Multiplication), in which the traversal of the tree is implemented using matrix multiplications. (GEMM is one of the three tree conversion strategies we currently support.)


Simple decision tree

In this example, the decision tree has four decision nodes (orange) and five leaf nodes (blue). The tree takes a feature vector with five elements as input. The steps below trace how the output for a single observation is calculated.

Step 1: Multiply the input tensor with tensor A (computed from the decision tree model above), which captures the relationship between input features and internal nodes. Then compare the result with tensor B, which holds the value of each internal node (orange), to create the input path tensor that represents the path from input to node. In this case, the tree model has 4 conditions and the input vector has 5 elements; therefore, the shape of tensor A is 5x4 and the shape of tensor B is 1x4.

Step 2: Multiply the input path tensor with tensor C, which captures whether an internal node is an ancestor of a given leaf node and, if so, whether that leaf is in its left or right sub-tree (left = 1, right = -1, otherwise = 0). Then check the result for equality with tensor D, which captures, for each leaf node, the count of left children on the path from that leaf to the tree root. This creates the output path tensor that represents the path from node to output. In this case, the tree model has 5 outputs and 4 conditions; therefore, the shape of tensor C is 4x5 and the shape of tensor D is 1x5.

Step 3: Multiply the output path tensor with tensor E, which captures the mapping from leaf nodes to output values, to infer the final prediction. In this case, the tree model has 5 outputs; therefore, the shape of tensor E is 5x1.

And now Hummingbird has compiled a tree-based model using the GEMM strategy! For more details, please see Figure 3 of our paper.
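
The three steps can be sketched in plain NumPy. This is only an illustration of the GEMM idea, not Hummingbird's actual implementation: the tree structure, thresholds, and leaf values below are made up (the original figure is not reproduced here), and the names gemm_predict and traverse are hypothetical helpers. The shapes of A, B, C, D, and E match those given in the steps above.

```python
import numpy as np

# Hypothetical tree with 4 internal nodes and 5 leaves over 5 features
# (structure, thresholds, and leaf values are invented for illustration):
#   node 0: x[2] < 0.5 ? node 1 : node 2
#   node 1: x[1] < 2.0 ? leaf 0 : leaf 1
#   node 2: x[4] < 5.5 ? node 3 : leaf 4
#   node 3: x[0] < 2.4 ? leaf 2 : leaf 3

# A (5x4): A[f, n] = 1 if internal node n tests feature f.
A = np.zeros((5, 4))
A[2, 0] = A[1, 1] = A[4, 2] = A[0, 3] = 1.0

# B (1x4): value (threshold) of each internal node.
B = np.array([0.5, 2.0, 5.5, 2.4])

# C (4x5): +1 if the leaf lies in the node's left sub-tree,
#          -1 if it lies in the right sub-tree, 0 otherwise.
C = np.array([
    [1,  1, -1, -1, -1],
    [1, -1,  0,  0,  0],
    [0,  0,  1,  1, -1],
    [0,  0,  1, -1,  0],
], dtype=np.float64)

# D (1x5): number of left turns on the path from the root to each leaf.
D = (C == 1).sum(axis=0)

# E (5x1): value stored in each leaf.
E = np.array([[10.0], [20.0], [30.0], [40.0], [50.0]])


def gemm_predict(X):
    """Evaluate the tree for a batch of rows using only matrix operations."""
    X = np.atleast_2d(np.asarray(X, dtype=np.float64))
    input_path = ((X @ A) < B).astype(np.float64)            # Step 1
    output_path = ((input_path @ C) == D).astype(np.float64)  # Step 2
    return output_path @ E                                    # Step 3


def traverse(x):
    """Reference: ordinary if/else traversal of the same tree."""
    if x[2] < 0.5:
        return 10.0 if x[1] < 2.0 else 20.0
    if x[4] < 5.5:
        return 30.0 if x[0] < 2.4 else 40.0
    return 50.0


x = np.array([1.0, 3.0, 0.2, 0.0, 9.0])
print(gemm_predict(x))  # [[20.]]
```

For each leaf, the product in Step 2 equals D only when every left-branch ancestor evaluated to true and every right-branch ancestor evaluated to false, so exactly one entry of the output path is set per row. Because every step is a dense matrix operation, the same code processes a whole batch of rows at once, which is what makes the formulation a good fit for tensor runtimes.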

Thank you to Chien Vu for contributing the graphics and descriptions in his blog for this example!

Installation

Hummingbird was tested on Python >= 3.6 on Linux, Windows, and macOS machines. (Python 3.5 is supported up to hummingbird-ml==0.2.1.) It is recommended to use a virtual environment (see the python3 venv doc or Using Python environments in VS Code).

Hummingbird requires PyTorch >= 1.4.0. Please see the PyTorch website for instructions on how to install PyTorch for your platform and hardware.

Once PyTorch is installed, you can get Hummingbird from pip with:

pip install hummingbird-ml

If you require the optional dependencies lightgbm and xgboost, you can use:

pip install hummingbird-ml[extra]

See also Troubleshooting for common problems.

Examples

See the notebooks section for examples that demonstrate use and speedups.

In general, Hummingbird syntax is very intuitive and minimal. To run your traditional ML model on DNN frameworks, you only need to import hummingbird.ml and add convert(model, 'dnn_framework') to your code. Below is an example using a scikit-learn random forest model and PyTorch as target framework.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from hummingbird.ml import convert, load

# Create some random data for binary classification
num_classes = 2
X = np.random.rand(100000, 28)
y = np.random.randint(num_classes, size=100000)

# Create and train a model (scikit-learn RandomForestClassifier in this case)
skl_model = RandomForestClassifier(n_estimators=10, max_depth=10)
skl_model.fit(X, y)

# Use Hummingbird to convert the model to PyTorch
model = convert(skl_model, 'pytorch')

# Run predictions on CPU
model.predict(X)

# Run predictions on GPU
model.to('cuda')
model.predict(X)

# Save the model
model.save('hb_model')

# Load the model back
model = load('hb_model')

Documentation

The API documentation is here.

You can also read about Hummingbird in our blog post here.

For more details on the vision and on the technical details related to Hummingbird, please check our papers:

Contributing

We welcome contributions! Please see the guide on Contributing.

Also, see our roadmap of planned features.

Community

Join our community on Gitter!

For more formal enquiries, you can contact us.

Authors

  • Supun Nakandala
  • Matteo Interlandi
  • Karla Saur

License

MIT License

Comments
  • [WIP] Add sklearn's HistGradientBoosting

    Closes #64:

    • This PR adds sklearn's HistGradientBoostingClassifier to hummingbird.

    • It modifies the two functions convert_sklearn_gbdt_classifier() and get_parameters_for_sklearn_common() to handle the HistGradientBoostingClassifier model attributes.

    • It also updates the documentation to include the new HistGradientBoostingClassifier.

    opened by ahmedkrmn 27
  • Support lightgbm >= 3

    setup.py requires lightgbm < 3, I'm using the current version of lightgbm so hummingbird reports that it's not installed. Is there any blocking incompatibility or is it just that the requirement version is out of date?

    opened by memeplex 18
  • AttributeError: 'XGBClassifier' object has no attribute 'raw_operator'

    Code:

    hummingmodel = hummingbird.ml.operator_converters.xgb.convert_sklearn_xgb_classifier(model, 'pytorch',extra_config={"n_features":18})

    Error:

    ---------------------------------------------------------------------------
    AttributeError                            Traceback (most recent call last)
    /var/folders/f2/9tbmpg411hndwc482xn850br0000gn/T/ipykernel_2889/1708670718.py in <module>
          1 # Use Hummingbird to convert the model to PyTorch
    ----> 2 hummingmodel = hummingbird.ml.operator_converters.xgb.convert_sklearn_xgb_classifier(model, 'pytorch',extra_config={"n_features":18})
    
    ~/opt/anaconda3/lib/python3.9/site-packages/hummingbird/ml/operator_converters/xgb.py in convert_sklearn_xgb_classifier(operator, device, extra_config)
        102              Please pass "n_features:N" as extra configuration to the converter or fill a bug report.'
        103         )
    --> 104     tree_infos = operator.raw_operator.get_booster().get_dump()
        105     n_classes = operator.raw_operator.n_classes_
        106 
    
    AttributeError: 'XGBClassifier' object has no attribute 'raw_operator'
    

    XGB Version: 1.6.1 Hummingbird Version: 0.4.7

    Any idea about this issue? What other configurations are required to make this work?

    opened by dintellect 17
  • Add float64 support in hummingbird

    Addresses issue #51

    This PR:

    • Add support for float64 input data based on discussion in issue #51

    It appears that PyTorch expects the inputs and the model parameters (weights) to have the same data type. The weights are float32, while in this failure case the input is float64. To make the types match, it is easier (and computationally more efficient) to downcast the inputs.

    • Add a few tests around the same change.

    Pending:

    • I did validate other supported operators as well (Sklearn GBDT, HGBDT; XGB, etc). Should I add tests for them as well?
    • I feel we can update the notebooks (and other examples) as well removing the casting from float64 to float32. This makes the example simpler and less mysterious. Should I do that?

    Validated that we can now use both float32 and float64 data.

    opened by KranthiGV 16
  • Fixing test flakiness

    Hi,

    The test test_tree_regressors_multioutput_regression is flaky. It failed 41/3000 times that I ran. It seems the absolute difference can be much higher than the current threshold (1e-5).

    Empirically, I observed a maximum absolute difference of up to 3.7. Based on my experiments, the 99th percentile seems to be close to 4.5, so I set the bound accordingly.

    Please let me know if this makes sense. I am assuming there are no bugs in this case. I would be happy to incorporate any other suggestions that you may have.

    Thanks!

    opened by sleepy-owl 14
  • pip install of Version 0.2.0 dosen't work

    I tried to install the new hummingbird 0.2.0 release in a new venv, and the installation failed. Either the hummingbird wheel specifies a PyTorch version whose wheel seems to be broken, or the hummingbird wheel is missing the full PyTorch installation command (which is: pip install torch===1.7.0 torchvision===0.8.1 torchaudio===0.7.0 -f https://download.pytorch.org/whl/torch_stable.html). This exact install command is also not available from the official PyTorch website, which only offers versions 1.7.1 and 1.6.0, so it is not possible to install the required PyTorch version just by looking there.

    Right now you have to find the pip install torch===1.7.0 torchvision===0.8.1 torchaudio===0.7.0 -f https://download.pytorch.org/whl/torch_stable.html command somewhere on the internet and install this specific version before you can install hummingbird-ml.

    Collecting hummingbird-ml
      Downloading hummingbird_ml-0.2.0-py2.py3-none-any.whl (151 kB)
    Collecting numpy<=1.19.4,>=1.15
      Using cached numpy-1.19.4-cp37-cp37m-win_amd64.whl (12.9 MB)
    Collecting onnxconverter-common<=1.7.0,>=1.6.0
      Using cached onnxconverter_common-1.7.0-py2.py3-none-any.whl (64 kB)
    Collecting scikit-learn<=0.23.2,>=0.21.3
      Using cached scikit_learn-0.23.2-cp37-cp37m-win_amd64.whl (6.8 MB)
    Collecting joblib>=0.11
      Downloading joblib-1.0.0-py3-none-any.whl (302 kB)
    Collecting scipy>=0.19.1
      Downloading scipy-1.6.0-cp37-cp37m-win_amd64.whl (32.5 MB)
    Collecting threadpoolctl>=2.0.0
      Using cached threadpoolctl-2.1.0-py3-none-any.whl (12 kB)
    Collecting torch<=1.7.0,>=1.4.*
      Downloading torch-1.7.0-cp37-cp37m-win_amd64.whl (184.0 MB)
    
    ERROR: torch has an invalid wheel, .dist-info directory not found
    
    opened by speedfreakw 13
  • float64 issue

    At the moment, HB only works with float32. You must cast float64 to float32 for correct results. (You will get an error with the gemm algorithm). We need to fix this.

    Ex: in scikit-learn-random-forest-example.ipynb we must cast X as follows: X = X[0:nrows].astype('|f4')

    opened by ksaur 13
  • Add delete location bool param to PyTorchContainer load() call

    What? Adds delete_unzip_location_folder param to load() method of PyTorchSklearnContainer to avoid implicit deletion of model artifact supplied to load() method. The changes in this PR are backward compatible.

    Why? Related issue: https://github.com/microsoft/hummingbird/issues/557

    opened by akshjain83 12
  • Performance Issues Using the TVM Backend

    I am running inference on an XGBoost model using Hummingbird on a desktop CPU target. I've installed Pytorch, Hummingbird and TVM into a conda environment (TVM was built from source and linked to llvm-10). I have trained models serialized into XGBoost JSONs and am creating XGBoost sklearn objects from them. I am able to compile these models using Hummingbird and run them. My python code to run on PyTorch is as follows:

    xgb_model = xgb.XGBRegressor()
    xgb_model.load_model(model_json)
    hb_model = convert(xgb_model, 'pytorch')
    # ...
    pred = hb_model.predict(batch)

    My python code to compile the model using the TVM backend is the following:

    xgb_model = xgb.XGBRegressor()
    xgb_model.load_model(model_json)
    hb_model = convert(xgb_model, 'tvm', test_input=inputs[0:batch_size])
    # ...
    pred = hb_model.predict(batch)

    Is this the right way to compile models (especially using TVM)? Do I need to enable any other TVM features (eg. auto-tuning) through the Hummingbird API? For my models, the performance of the inference with both the PyTorch and the TVM backends are roughly the same which makes me think that I may be using the TVM backend wrong. I was able to verify that the predict methods are computing the correct predictions.

    opened by asprasad 11
  • KernelPCA + PyTorch

    Hello, I'm trying to utilize the GPU with the PyTorch backend to speed up a Kernel PCA operation. However, when I convert to PyTorch, it ends up taking about 9x longer to run the .transform() function. Additionally, I'm not seeing any GPU utilization at all.

    • Sklearn: 0.8 seconds
    • PyTorch + CPU: 7.8 seconds
    • PyTorch + GPU (supposedly, but again, not seeing any GPU utilization): 7.9 seconds

    Would it be possible for you to look into this? Have already checked that CUDA is configured correctly with torch.cuda.is_available(). Thanks!

    bug 
    opened by carolinemckee 11
  • CUDA out of memory

    Hi there,

    When I'm testing Hummingbird with GPU for more and deeper trees (worked fine if I have less trees), I got OOM error:

    "CUDA out of memory. Tried to allocate 5.96 GiB (GPU 0; 11.91 GiB total capacity; 6.47 GiB already allocated; 4.74 GiB free; 6.48 GiB reserved in total by PyTorch)"

    I'm using:

    • CUDA 10.1, PyTorch 1.5.1+cu101
    • RandomForestRegressor from sklearn
    • total number of samples: 800k
    • number of features: 100
    • number of trees: 1000
    • average tree depth: 15
    • average number of nodes per tree: 3700

    Is it expected? Any idea how to make it work for more and deeper trees?

    Thanks!

    opened by zhanjiezhu 11
  • MissingConverter: Unable to find converter <class 'sklearn.preprocessing._encoders.OrdinalEncoder'>

    ---------------------------------------------------------------------------
    MissingConverter                          Traceback (most recent call last)
    /var/folders/f2/9tbmpg411hndwc482xn850br0000gn/T/ipykernel_27005/3005074338.py in <module>
    ----> 1 hb_model = convert(clf, 'torch',X_train[0:1])
    
    ~/opt/anaconda3/lib/python3.9/site-packages/hummingbird/ml/convert.py in convert(model, backend, test_input, device, extra_config)
        442     """
        443     assert constants.REMAINDER_SIZE not in extra_config
    --> 444     return _convert_common(model, backend, test_input, device, extra_config)
        445 
        446 
    
    ~/opt/anaconda3/lib/python3.9/site-packages/hummingbird/ml/convert.py in _convert_common(model, backend, test_input, device, extra_config)
        403         return _convert_sparkml(model, backend_formatted, test_input, device, extra_config)
        404 
    --> 405     return _convert_sklearn(model, backend_formatted, test_input, device, extra_config)
        406 
        407 
    
    ~/opt/anaconda3/lib/python3.9/site-packages/hummingbird/ml/convert.py in _convert_sklearn(model, backend, test_input, device, extra_config)
        106     # We modify the scikit learn model during translation.
        107     model = deepcopy(model)
    --> 108     topology = parse_sklearn_api_model(model, extra_config)
        109 
        110     # Convert the Topology object into a PyTorch model.
    
    ~/opt/anaconda3/lib/python3.9/site-packages/hummingbird/ml/_parse.py in parse_sklearn_api_model(model, extra_config)
         63     # Parse the input scikit-learn model into a topology object.
         64     # Get the outputs of the model.
    ---> 65     outputs = _parse_sklearn_api(topology, model, inputs)
         66 
         67     # Declare output variables.
    
    ~/opt/anaconda3/lib/python3.9/site-packages/hummingbird/ml/_parse.py in _parse_sklearn_api(topology, model, inputs)
        228     tmodel = type(model)
        229     if tmodel in sklearn_api_parsers_map:
    --> 230         outputs = sklearn_api_parsers_map[tmodel](topology, model, inputs)
        231     else:
        232         outputs = _parse_sklearn_single_model(topology, model, inputs)
    
    ~/opt/anaconda3/lib/python3.9/site-packages/hummingbird/ml/_parse.py in _parse_sklearn_pipeline(topology, model, inputs)
        274     """
        275     for step in model.steps:
    --> 276         inputs = _parse_sklearn_api(topology, step[1], inputs)
        277     return inputs
        278 
    
    ~/opt/anaconda3/lib/python3.9/site-packages/hummingbird/ml/_parse.py in _parse_sklearn_api(topology, model, inputs)
        228     tmodel = type(model)
        229     if tmodel in sklearn_api_parsers_map:
    --> 230         outputs = sklearn_api_parsers_map[tmodel](topology, model, inputs)
        231     else:
        232         outputs = _parse_sklearn_single_model(topology, model, inputs)
    
    ~/opt/anaconda3/lib/python3.9/site-packages/hummingbird/ml/_parse.py in _parse_sklearn_column_transformer(topology, model, inputs)
        451                 )
        452         else:
    --> 453             var_out = _parse_sklearn_api(topology, model_obj, transform_inputs)[0]
        454             if model.transformer_weights is not None and name in model.transformer_weights:
        455                 # Create a Multiply node
    
    ~/opt/anaconda3/lib/python3.9/site-packages/hummingbird/ml/_parse.py in _parse_sklearn_api(topology, model, inputs)
        230         outputs = sklearn_api_parsers_map[tmodel](topology, model, inputs)
        231     else:
    --> 232         outputs = _parse_sklearn_single_model(topology, model, inputs)
        233 
        234     return outputs
    
    ~/opt/anaconda3/lib/python3.9/site-packages/hummingbird/ml/_parse.py in _parse_sklearn_single_model(topology, model, inputs)
        250         raise RuntimeError("Parameter model must be an object not a " "string '{0}'.".format(model))
        251 
    --> 252     alias = get_sklearn_api_operator_name(type(model))
        253     this_operator = topology.declare_logical_operator(alias, model)
        254     this_operator.inputs = inputs
    
    ~/opt/anaconda3/lib/python3.9/site-packages/hummingbird/ml/supported.py in get_sklearn_api_operator_name(model_type)
        463     """
        464     if model_type not in sklearn_api_operator_name_map:
    --> 465         raise MissingConverter("Unable to find converter for model type {}.".format(model_type))
        466     return sklearn_api_operator_name_map[model_type]
        467 
    
    MissingConverter: Unable to find a converter for model type <class 'sklearn.preprocessing._encoders.OrdinalEncoder'>.
    It usually means the pipeline being converted contains a
    transformer or a predictor with no corresponding converter implemented.
    Please fill an issue at https://github.com/microsoft/hummingbird.
    

    Which scikit-learn pipeline operators does Hummingbird support?

    enhancement 
    opened by dintellect 3
  • FLOPs counting for the converted model

    Could you please share some suggestions on FLOPs counting for the converted model?

    I have tried:

    • thop: https://github.com/Lyken17/pytorch-OpCounter
    • flop-counter: https://github.com/sovrasov/flops-counter.pytorch
    • pthflops: https://github.com/1adrianb/pytorch-estimate-flops
    • torchprofile: https://github.com/zhijian-liu/torchprofile
    • deepspeed: https://www.deepspeed.ai/tutorials/flops-profiler/

    It seems none of them supports the converted models. Any advice will be highly appreciated, thanks!

    opened by ChuniHiro 1
  • Kernel crashing while converting

    RF was previously defined.

    # Can crash sometimes
    # Convert model to PyTorch with Hummingbird-ml
    RFconv = convert(RF, 'pytorch')
    
    # Save Model Converted
    RFconv.save(os.path.join(ResultsFolder, "RFmodel_V3"))
    

    When I run this on macOS, it crashes the Jupyter kernel; this doesn't happen on Windows.

    opened by EmanuelCastanho 4
  • Need new test dataset for xgb

    Build workflow run

    ==================================== ERRORS ====================================
    _______________ ERROR collecting tests/test_xgboost_converter.py _______________
    ImportError while importing test module '/home/runner/work/hummingbird/hummingbird/tests/test_xgboost_converter.py'.
    Hint: make sure your test modules/packages have valid Python names.
    Traceback:
    /opt/hostedtoolcache/Python/3.9.15/x64/lib/python3.9/importlib/__init__.py:127: in import_module
        return _bootstrap._gcd_import(name[level:], package, level)
    tests/test_xgboost_converter.py:8: in <module>
        from sklearn.datasets import load_boston
    /opt/hostedtoolcache/Python/3.9.15/x64/lib/python3.9/site-packages/sklearn/datasets/__init__.py:156: in __getattr__
        raise ImportError(msg)
    E   ImportError: 
    E   `load_boston` has been removed from scikit-learn since version 1.2.
    
    opened by ksaur 0
  • Should SKLearn operators be assumed to produce a single output?

    See https://github.com/microsoft/hummingbird/blob/main/hummingbird/ml/_parse.py#L256

    Consider models which implement the predict and predict_proba functions. These return both labels and probabilities as outputs. The current logic means that we cannot name the outputs in the hummingbird conversion step (i.e., with the output_names argument to extra_config) and instead have to perform some ONNX graph surgery afterwards.

    opened by stillmatic 0
Releases(v0.4.7)
  • v0.4.7(Nov 29, 2022)

    What's Changed

    • This patch release fixes a bug in ONNX conversion and allows support of varying batch sizes by @stillmatic in https://github.com/microsoft/hummingbird/pull/654
    • Fixes deprecations in https://github.com/microsoft/hummingbird/pull/655

    New Contributors

    • Thank you to @stillmatic for catching and fixing this bug (https://github.com/microsoft/hummingbird/issues/653) and for the additional maintenance work!

    Full Changelog: https://github.com/microsoft/hummingbird/compare/v0.4.6...v0.4.7

  • v0.4.6(Nov 10, 2022)

    What's Changed

    Features

    • Adds Lasso, Ridge and ElasticNet by @fd0r in https://github.com/microsoft/hummingbird/pull/625
    • Added support for more decision conditions in trees and ONNX conversion by @grafail in https://github.com/microsoft/hummingbird/pull/631
    • Add support for Tweedie, Poisson and Gamma regressors by @interesaaat in https://github.com/microsoft/hummingbird/pull/650

    Maintenance and fixes

    • Remove pinned version for onnxconverter-common by @interesaaat in https://github.com/microsoft/hummingbird/pull/618
    • Bump action/cache to v3 by @mshr-h in https://github.com/microsoft/hummingbird/pull/619
    • Fix deprecation warnings for sklearn/scipy by @mshr-h in https://github.com/microsoft/hummingbird/pull/610
    • deprecating email due to spammers.... by @ksaur in https://github.com/microsoft/hummingbird/pull/621
    • updating ubuntu version by @ksaur in https://github.com/microsoft/hummingbird/pull/623
    • Fix linear models conversion when fit_intercept set to False by @RomanBredehoft in https://github.com/microsoft/hummingbird/pull/630
    • Corrects some typos by @RomanBredehoft in https://github.com/microsoft/hummingbird/pull/628
    • allow derived types of DataFrame by @liangfu in https://github.com/microsoft/hummingbird/pull/637
    • [TVM] Unify load params interface by @liangfu in https://github.com/microsoft/hummingbird/pull/640
    • Update TVM to 0.10 by @mshr-h in https://github.com/microsoft/hummingbird/pull/642
    • Update to pytorch 1.13 by @interesaaat in https://github.com/microsoft/hummingbird/pull/646
    • Update actions/checkout and actions/setup-python by @mshr-h in https://github.com/microsoft/hummingbird/pull/647
    • update codecov/codecov-action to v3 by @mshr-h in https://github.com/microsoft/hummingbird/pull/648

    New Contributors

    • @fd0r made their first contribution in https://github.com/microsoft/hummingbird/pull/625
    • @RomanBredehoft made their first contribution in https://github.com/microsoft/hummingbird/pull/630
    • @grafail made their first contribution in https://github.com/microsoft/hummingbird/pull/631
    • @liangfu made their first contribution in https://github.com/microsoft/hummingbird/pull/637

    Special Thanks

    • Thank you to @mshr-h for the continued support!!

    Full Changelog: https://github.com/microsoft/hummingbird/compare/v0.4.5...v0.4.6

  • v0.4.5(Aug 5, 2022)

    What's Changed

    • Update _decomposition_implementations.py by @interesaaat in https://github.com/microsoft/hummingbird/pull/578
    • revise example in readme by @vumichien in https://github.com/microsoft/hummingbird/pull/579
    • Fix problem with new onnx and protobuf by @interesaaat in https://github.com/microsoft/hummingbird/pull/582
    • Bump TVM to v0.8 by @mshr-h in https://github.com/microsoft/hummingbird/pull/581
    • Fix the things broken by SKL==1.1.1 by @ksaur in https://github.com/microsoft/hummingbird/pull/588
    • Use onnxmltools>=1.6.0,<=1.11.0 instead of onnxmltools>=1.6.0 by @mshr-h in https://github.com/microsoft/hummingbird/pull/592
    • Deprecating Python3.7; updating sklearn-onnx version by @ksaur in https://github.com/microsoft/hummingbird/pull/593
    • deprecate torch1.7, push to macOS11 by @ksaur in https://github.com/microsoft/hummingbird/pull/594
    • Fix doc gen by @ksaur in https://github.com/microsoft/hummingbird/pull/596
    • Fix broken link by @mshr-h in https://github.com/microsoft/hummingbird/pull/599
    • Fixing documentation generation bug by @ksaur in https://github.com/microsoft/hummingbird/pull/598
    • Update Dockerfile by @mshr-h in https://github.com/microsoft/hummingbird/pull/601
    • Use a Microsoft compliant image for docker by @interesaaat in https://github.com/microsoft/hummingbird/pull/602
    • testing torch==1.12 by @ksaur in https://github.com/microsoft/hummingbird/pull/603
    • Use n_features_in_ instead of n_features_ by @mshr-h in https://github.com/microsoft/hummingbird/pull/604
    • Use python -m pip instead of the pip executable by @mshr-h in https://github.com/microsoft/hummingbird/pull/605
    • check for Sklearn model NotFitted before conversion by @SangamSwadiK in https://github.com/microsoft/hummingbird/pull/607
    • Upgrade prophet to v1.1 by @mshr-h in https://github.com/microsoft/hummingbird/pull/608
    • Add Python 3.10 by @mshr-h in https://github.com/microsoft/hummingbird/pull/586
    • Bump TVM to v0.9 by @mshr-h in https://github.com/microsoft/hummingbird/pull/609
    • Pinning onnxconverter-common to avoid dep by @ksaur in https://github.com/microsoft/hummingbird/pull/614
    • Use pre-installed LLVM for building TVM on macOS-11 by @mshr-h in https://github.com/microsoft/hummingbird/pull/615

    New Contributors

    • @vumichien made their first contribution in https://github.com/microsoft/hummingbird/pull/579
    • @mshr-h made their first contribution in https://github.com/microsoft/hummingbird/pull/581
    • @SangamSwadiK made their first contribution in https://github.com/microsoft/hummingbird/pull/607

    Special thanks

    Extra special thanks to @mshr-h for the work maintaining our version dependencies and pipeline!

    Full Changelog: https://github.com/microsoft/hummingbird/compare/v0.4.4...v0.4.5

  • v0.4.4(Apr 25, 2022)

    This minor release includes bug fixes for performance and external dependency updates.

    What's Changed

    • Verified compatibility with Torch 1.11 released March10 by @ksaur in https://github.com/microsoft/hummingbird/pull/572
    • Fix xgboost tests for xgb > 1.6.0 by @interesaaat in https://github.com/microsoft/hummingbird/pull/576
    • Fix for KernelPCA using GPU by @interesaaat in https://github.com/microsoft/hummingbird/pull/575

    Full Changelog: https://github.com/microsoft/hummingbird/compare/v0.4.3...v0.4.4

    New Contributors

    • Thanks to @carolinemckee for the bug report on KPCA #574
  • v0.4.3(Mar 11, 2022)

    This minor release includes bug fixes and external dependency updates.

    What's Changed

    • Fixed a problem with pandas with the latest xgb models in https://github.com/microsoft/hummingbird/pull/562
    • Minor changes to tests related to verifying that new versions of torch, scikit-learn, and Python3.9 work in #554, #560
    • allow tree_implementation="gemm" with onnx backend by @jfrery in https://github.com/microsoft/hummingbird/pull/566

    New Contributors

    • @jfrery made their first contribution in https://github.com/microsoft/hummingbird/pull/566
    • @shubh0508 contributed a bug fix in https://github.com/microsoft/hummingbird/pull/568

    Full Changelog: https://github.com/microsoft/hummingbird/compare/v0.4.2...v0.4.3

  • v0.4.2(Dec 14, 2021)

    This minor release includes a new operator, some improvements, and some fixes due to external dependency updates.

    New Operator:

    • Added support for PLSRegressor (#549)

    Improvements:

    • Better delete (with location) for saved models (#558)
    • Doc updates: Installation instructions for Fedora-based distros (#543) and readme update (#550)

    External dependency management:

    • Hummingbird now works with scikit-learn==1.0.x (#545) and the current version of scipy (scipy==1.7.x) (#552)
    • Note that there is currently an open issue (#556) with Onnxruntime and torch==1.10.x on Linux/Mac that causes the tests to hang (for static length dimensions) and fail (for dynamic dimensions). This will be fixed in a subsequent release.

    Credits:

    Thanks to @akshjain83 and @bibhabasumohapatra for their contributions!

  • v0.4.1(Aug 31, 2021)

  • v0.4.0(Jun 22, 2021)

    This release includes integration with Prophet, bug/usability fixes, and versioning fixes.

    We are excited to announce trend prediction support for Prophet! See our notebook for examples.

    New features:

    • Prophet integration (trend prediction) (#519)

    Bug fixes/Usability fixes:

    • Fix several numerical precision issues in tree-based models (#511)
    • Better assertions and error messages (#514/#521)

    Versioning fixes:

    • Remove constraint on dependencies version (#523)

    Special thanks:

    • Extra special thanks to @xadupre for helping us resolve library dependency issues
    • Thanks to @pnhathuy07 for helping us clean up our asserts
    • Thanks to @scnakandala for the ongoing contributions
  • v0.3.1(Apr 24, 2021)

    This patch release includes several improvements related to load/store of a model:

    Improvements:

    • Better error messages on load/save problems (#499)
    • Auto-clean temp directory on load/save (#502)
    • Using pickle instead of dill for model load/save to enable the use of Spark broadcast to share the model (#498)

    Other Notes:

    It seems the most recent release of libomp broke the pipeline for macOS with Python 3.6 and 3.7. We fixed this by pinning to an older version of libomp (#500). While not directly related to Hummingbird, this information may be useful to macOS users on older versions of Python who want to use the LightGBM Hummingbird converters.

    Credits:

    Thanks to @dbanda for the feedback and testing on our load/store.

  • v0.3.0(Apr 13, 2021)

    This release includes many new operators, version upgrades, and minor bug fixes.

    New Features:

    • ONNXML imputer (#459)
    • SKL Bagging Classifier/Regressor (#490)
    • SKL GridSearchCV and RandomizedSearchCV (#476)
    • SKL KMeans (#472)
    • SKL MeanShift (#473)
    • SKL StackingClassifier and StackingRegressor (#471)
    • SKL RidgeCV and LinearSVR (#470)
    • New example notebooks (#462) (#461)

    Versioning changes:

    • Bumped numpy from 1.19.4 to <=1.20.*
    • Bumped torch to 1.8.1

    Notable bug fixes:

    • Fixed an error when a pandas DataFrame is passed as input to predict but no test_input is provided (#487)

    Credits:

    Thanks to our first-time contributor @rathijit for fixes to the README.

  • v0.2.2(Mar 9, 2021)

    This release includes several bug fixes, version upgrades, and minor feature upgrades.

    Versioning changes:

    • Upgrading to Torch 1.8.0
    • Deprecating Python 3.5

    Notable bug fixes:

    • Fix save to pytorch for onnx models with trees (#425)
    • Fix bug with degenerate trees (#426)
    • Fix bloated models when saving (#429)

    Also included are fixes to test flakiness and error messages.

    Feature upgrades:

    • Add support for modified huber loss in SGD (#415)
    • Simplify the discretizer logic (#448)

    Credits:

    Thanks to our first-time contributors @qin-xiong and @sleepy-owl.

  • v0.2.1(Jan 4, 2021)

  • v0.2.0(Dec 29, 2020)

    Announcing: TVM Support

    • This release adds support for TVM (#236), giving us our fastest speeds yet!

    New Features

    • Adding save/load features for models (#399)
    • BatchContainer for batch by batch prediction use case (#377)
    • Native support for string features (#396)
    • Add batch_benchmark option to do benchmark on a single batch (#369)
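
As a rough illustration of the batch-by-batch use case, the sketch below splits the input into fixed-size slices, predicts on each slice, and concatenates the results. `predict_in_batches` and `toy_predict` are hypothetical helpers written for this example; they are not part of the Hummingbird API, and the actual BatchContainer interface may differ.

```python
def predict_in_batches(predict_fn, rows, batch_size):
    """Apply predict_fn to consecutive slices of rows and concatenate the results."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    results = []
    for start in range(0, len(rows), batch_size):
        # The last slice may be shorter than batch_size.
        batch = rows[start:start + batch_size]
        results.extend(predict_fn(batch))
    return results


def toy_predict(batch):
    """Hypothetical stand-in for a model: label a row 1 if its sum is positive."""
    return [1 if sum(row) > 0 else 0 for row in batch]


data = [[1.0, -2.0], [0.5, 0.5], [-1.0, 0.2], [3.0, 1.0], [0.0, -0.1]]
batched = predict_in_batches(toy_predict, data, batch_size=2)
print(batched)  # [0, 1, 0, 1, 0]
```

Because each row is scored independently, predicting in batches gives the same result as predicting on the full input at once, while keeping the working set for each call bounded by the batch size.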

    New Operators

    • Binarizer for ONNX-ML (#353)
    • Feature Vectorizer for ONNX-ML (#385)
    • Label Encoder for ONNX-ML and SKL (#374)

    Credits

    Thank you to @scnakandala and @masahi for their ongoing contributions!

  • v0.1.0(Oct 30, 2020)

    Announcing: Integration with PySpark.ML (pyspark)

    In this release, we added PySpark.ML support, which will open new doors for collaboration in the Spark space! (#310)

    So far, we support Bucketizer, VectorAssembler, and LogisticRegressionModel. We look forward to adding more operators soon!

    Announcing: Pandas Dataframes Support

    This release also adds support for Pandas Dataframes both at conversion time and inference time (#300).

    New Features

    • We added benchmarks (#328, #330, #331) from our OSDI paper
    • We also have a variety of features and improvements to the user experience:
      • Added the capability of setting number of threads to the model container (#319)
      • Removed the need for requirements on providing input schemas for ONNX models (#334)
      • Added support for ONNX models with multiple inputs (#339)
      • Added batching to output containers (#323)

    New Operators

    • scikit-learn KNeighbors Classifier (#296) and Regressor (#303)

    Credits

    Thank you to @scnakandala for SparkML and the additional scikit-learn operators! Thank you to @vumichien for contributing the README diagrams.

  • v0.0.6(Sep 10, 2020)

    This release adds a huge batch of new operators to scikit-learn, including pipeline support! It also includes bug fixes.

    New Features

    • Basic support for scikit-learn Pipelines, including FeatureUnion and ColumnTransformer (#251)

    New Operators - scikit-learn

    • Binarizer (#258)
    • KBinsDiscretizer (#285)
    • Matrix Decomposition Operators (#277)
      • FastICA
      • KernelPCA
      • PCA
      • TruncatedSVD
    • MissingIndicator (#268)
    • MLPClassifier (#260)
    • MLPRegressor (#288)
    • Other Classifiers: (#260)
      • BernoulliNB
      • GaussianNB
      • MultinomialNB
    • SelectKBest chi2 support (#262)
    • SelectPercentile (#263)
    • SimpleImputer (#267)
    • PolynomialFeatures (#269)

    Other updates:

    • We now support xgboost>=0.90 (#253)
    • Added optimized_execution to torchscript backend (#276)
    • We now allow older versions of scikit-learn for compatibility reasons (#274)

    Bug fixes:

    • Logistic Regression with the lbfgs option (#261)
    • Empty trees and multiclass (#265)
    • Fix isolation forest for sklearn <= 0.21 (#290)

    Credits

    A big thank you to our contributors: @scnakandala and @zhanjiezhu. Extra big thanks to @scnakandala for all of the scikit-learn converters!

  • v0.0.5(Aug 21, 2020)

    This release adds TorchScript as a backend, removes the problematic auto-installation of pytorch, improves the syntax of the ONNX converter, enhances the notebooks, and upgrades third-party library versions. Hummingbird now provides the same Sklearn API across backends for executing inference.

    Announcing: Integration with TorchScript

    This release adds TorchScript as a backend (#225).

    Users can convert models with:

    hummingbird.ml.convert(model, "torchscript", X)
    

    Installation changes:

    After several reports from users across multiple platforms (in terms of both OS and underlying hardware), we changed the Hummingbird installer to require users to first install pytorch before installing Hummingbird (#246). This allows users to select the right pytorch version for a specific platform and should simplify the installation process and remove issues caused by having the wrong pytorch version installed.

    ONNX API

    For ONNX, we changed the API to have a more seamless experience, allowing users to interact with ONNX models in a consistent way with other models in Hummingbird (#231).

    Instead of the user having to instantiate the ONNX runtime session:

    -        session = ort.InferenceSession(onnx_model.SerializeToString())
    -        onnx_pred = session.run(output_names, inputs)
    

    The user can now just call predict, predict_proba, transform, etc. as with other Hummingbird conversions.

    +        onnx_pred = onnx_model.predict(X)
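
The idea behind this change can be sketched in a few lines. The following is a minimal illustration, not Hummingbird's actual implementation: `FakeSession` and `ModelContainer` are hypothetical names, and the fake session simply returns row sums so that the example runs without ONNX Runtime installed.

```python
class FakeSession:
    """Hypothetical stand-in for an inference session (not onnxruntime)."""

    def run(self, output_names, inputs):
        # Take the single input and return one output per requested name;
        # each output here is just the list of row sums.
        rows = next(iter(inputs.values()))
        return [[sum(row) for row in rows] for _ in output_names]


class ModelContainer:
    """Hypothetical wrapper that hides session handling behind predict()."""

    def __init__(self, session, input_name="input", output_names=("output",)):
        self._session = session
        self._input_name = input_name
        self._output_names = list(output_names)

    def predict(self, X):
        # The caller never builds the input dictionary or names the outputs.
        outputs = self._session.run(self._output_names, {self._input_name: X})
        return outputs[0]


container = ModelContainer(FakeSession())
print(container.predict([[1, 2], [3, 4]]))  # [3, 7]
```

Keeping the session inside the container is what lets inference code written against `predict` stay the same regardless of which backend produced the model.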
    

    New Operators

    • OneHotEncoder - integers (#197)
    • Support for Tweedie distribution in LGBM (#242)

    Miscellaneous

    • The target opset for ONNX is now 11 (#214)
    • The target pytorch version is now 1.6.0; on Python 3.5 it remains at 1.5.1 for compatibility reasons (#213)
    • Docs are now auto-generated (#223)

    Credits

    Thanks to @KranthiGV for the updated LGBM ONNX notebook example

  • v0.0.4(Jul 21, 2020)

    This release adds several new operators to both scikit-learn and ONNX.

    New Features

    • float 64 support [#186]
    • Better windows installation support [#179]

    New Operators - scikit-learn

    • IsolationForest [#191]
    • LGBMRanker [#173]
    • OneHotEncoder [#193]
    • Scaler [#171]
    • XGBRanker [#189]

    New Operators - ONNX

    • ArrayFeatureExtractor [#198]
    • Linear Classifier/Regressor [#190, #194]
    • Normalizer [#188]
    • Scaler [#196]

    Credits

    This release would not have been possible without the following contributors: @ahmedkrmn, @KranthiGV, @TuanNguyen27, @zhanjiezhu

  • v0.0.3(Jun 19, 2020)

    This release adds several new cool features and bug fixes to Hummingbird!

    API Changes

    When selecting the backend to use for conversion, use torch instead of pytorch: we renamed the backend to match the module name. [#142]

    New Operators

    • HistGradientBoostingRegressor [#135]
    • LinearRegression [#140]
    • LinearSVC [#140]
    • LogisticRegression [#140]
    • LogisticRegressionCV [#140]
    • Normalizer [#126]

    New Features

    • A transform method was added to the PyTorch container to match the transformer API of Sklearn. [#148]
    • Support for ONNX models as input (at the moment this feature only works in combination with the lightgbm_converter in ONNXMLTOOLS) [#142]
    • Generation of ONNX models as output (at the moment this feature only works when an ONNX model is passed as input) [#142]

    Credits

    This release would not have been possible without the following contributors: @ahmedkrmn, @jspisak, and @TuanNguyen27.

  • v0.0.2(Jun 10, 2020)

    This release adds several new operators, an updated API, and contains several documentation fixes.

    New Operators

    • DecisionTreeRegressor [#102]
    • ExtraTreesRegressor [#91]
    • GradientBoostingRegressor [#88]
    • HistGradientBoostingClassifier [#87]

    Credits

    Special thanks to the following contributors: @KranthiGV (DecisionTreeRegressor), @mmbhatk (ExtraTreesRegressor), @bfgray3 (GradientBoostingRegressor), and @ahmedkrmn (HistGradientBoostingClassifier).

  • v0.0.1(May 7, 2020)

Owner
Microsoft
Open source projects and samples from Microsoft