Detecting silent model failure. NannyML estimates performance with an algorithm called Confidence-Based Performance Estimation (CBPE), developed by its core contributors. It is the only open-source algorithm capable of fully capturing the impact of data drift on performance.

Overview

💡 What is NannyML?

NannyML is an open-source Python library that allows you to estimate post-deployment model performance (without access to targets), detect data drift, and intelligently link data drift alerts back to changes in model performance. Built for data scientists, NannyML has an easy-to-use interface and interactive visualizations, is completely model-agnostic, and currently supports all tabular classification use cases.

The core contributors of NannyML have researched and developed a novel algorithm for estimating model performance: confidence-based performance estimation (CBPE). The Nansters also invented a new approach to detect multivariate data drift using PCA-based data reconstruction.

If you like what we are working on, be sure to become a Nanster yourself, join our community Slack, and support us with a GitHub star ⭐.

☔ Why use NannyML?

NannyML closes the loop with performance monitoring and post-deployment data science, empowering data scientists to quickly understand and automatically detect silent model failure. By using NannyML, data scientists can finally maintain complete visibility and trust in their deployed machine learning models. This brings the following benefits:

  • End sleepless nights caused by not knowing your model performance 😴
  • Analyse data drift and model performance over time
  • Discover the root cause of why your models are not performing as expected
  • No alert fatigue! React only when necessary if model performance is impacted
  • Painless setup in any environment

🧠 GO DEEP

NannyML Resources | Description
☎️ NannyML 101 | New to NannyML? Start here!
🔮 Performance estimation | How the magic works.
🌍 Real world example | Take a look at a real-world example of NannyML.
🔑 Key concepts | Glossary of key concepts we use.
🔬 Technical reference | Monitor the performance of your ML models.
🔎 Blog | Thoughts on post-deployment data science from the NannyML team.
📬 Newsletter | All things post-deployment data science. Subscribe to see the latest papers and blogs.
💎 New in v0.4.0 | New features, bug fixes.
🧑‍💻 Contribute | How to contribute to the NannyML project and codebase.
Join slack | Need help with your specific use case? Say hi on slack!

🔱 Features

1. Performance estimation and monitoring

When the actual outcome of your deployed prediction models is delayed, or even when post-deployment target labels are completely absent, you can use NannyML's CBPE algorithm to estimate model performance. This algorithm requires the predicted probabilities of your machine learning model and leverages probability calibration to estimate any traditional binary classification metric (ROC AUC, precision, recall, F1, etc.). Rather than estimating the performance of future model predictions, CBPE estimates the expected model performance of the predictions made at inference time.
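To make the idea concrete, here is a minimal sketch of how calibrated probabilities can yield expected metrics without any labels. It illustrates the principle only and is not NannyML's implementation: the helper name `estimate_metrics` and the 0.5 decision threshold are assumptions made for this example, the scores are assumed to be perfectly calibrated already, and ROC AUC is left out for brevity.

```python
def estimate_metrics(y_pred_proba, threshold=0.5):
    """Estimate confusion-matrix-based metrics from calibrated scores alone.

    Hypothetical helper for illustration. Each score p is read as the
    probability that the true label is 1, so a positive prediction counts
    as p of a true positive and (1 - p) of a false positive.
    """
    if not y_pred_proba:
        raise ValueError("at least one score is required")
    tp = fp = fn = tn = 0.0
    for p in y_pred_proba:
        if p >= threshold:  # predicted positive
            tp += p
            fp += 1.0 - p
        else:  # predicted negative
            fn += p
            tn += 1.0 - p
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / len(y_pred_proba)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up scores for one chunk of predictions.
scores = [0.9, 0.8, 0.3, 0.1]
estimates = estimate_metrics(scores)
```

Because the estimate depends entirely on how trustworthy the probabilities are, the calibration step mentioned above is what makes this kind of calculation meaningful.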

NannyML can also track the realised performance of your machine learning model once targets are available.

2. Data drift detection

To detect multivariate feature drift NannyML uses PCA-based data reconstruction. Changes in the resulting reconstruction error are monitored over time and data drift alerts are logged when the reconstruction error in a certain period exceeds a threshold. This threshold is calculated based on the reconstruction error observed in the reference period.
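The following is a rough sketch of the reconstruction-error idea using plain NumPy. It is not NannyML's code: the function names, the single retained component, the chunking, and the "mean plus three standard deviations" threshold are all assumptions chosen for illustration. The synthetic data flips the correlation between two features while leaving each feature's own distribution unchanged, which is the kind of drift a per-feature test would not see.

```python
import numpy as np


def fit_pca(reference, n_components):
    """Return the mean and the top principal directions of the reference data."""
    mean = reference.mean(axis=0)
    # Rows of vt are principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(reference - mean, full_matrices=False)
    return mean, vt[:n_components]


def reconstruction_error(chunk, mean, components):
    """Mean Euclidean distance between rows and their PCA reconstruction."""
    centered = chunk - mean
    reconstructed = centered @ components.T @ components
    return float(np.linalg.norm(centered - reconstructed, axis=1).mean())


rng = np.random.default_rng(0)


def make_data(n, correlated=True):
    """Two features that are positively (or, after drift, negatively) correlated."""
    x = rng.normal(size=n)
    noise = rng.normal(scale=0.1, size=n)
    y = x + noise if correlated else -x + noise
    return np.column_stack([x, y])


reference = make_data(5000)
mean, components = fit_pca(reference, n_components=1)

# Threshold derived from reference chunks (assumed rule: mean + 3 std).
ref_errors = [reconstruction_error(c, mean, components) for c in np.array_split(reference, 10)]
upper = np.mean(ref_errors) + 3 * np.std(ref_errors)

same = reconstruction_error(make_data(500), mean, components)
drifted = reconstruction_error(make_data(500, correlated=False), mean, components)
```

Here `drifted` lands far above `upper`, while `same` is expected to stay close to the reference errors.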

NannyML utilises statistical tests to detect univariate feature drift. The Kolmogorov–Smirnov test is used for continuous features and the 2-sample chi-squared test for categorical features. The results of these tests are tracked over time, properly corrected to counteract multiplicity and overlaid on the temporal feature distributions. (It is also possible to visualise the test-statistics over time, to get a notion of the drift magnitude.)

NannyML uses the same statistical tests to detect model output drift.
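As an illustration of what the two test statistics measure, here is a small pure-Python sketch. The function names are hypothetical and the code computes statistics only; in practice, library routines such as `scipy.stats.ks_2samp` and `scipy.stats.chi2_contingency` also return p-values. Which multiplicity correction NannyML applies is not stated here; Bonferroni (dividing the significance level by the number of tests) is one common choice.

```python
from bisect import bisect_right


def ks_statistic(reference, analysis):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap between the two ECDFs."""
    ref, ana = sorted(reference), sorted(analysis)
    gap = 0.0
    # The supremum of the ECDF difference is attained at an observed value.
    for value in ref + ana:
        cdf_ref = bisect_right(ref, value) / len(ref)
        cdf_ana = bisect_right(ana, value) / len(ana)
        gap = max(gap, abs(cdf_ref - cdf_ana))
    return gap


def chi2_statistic(reference, analysis):
    """Chi-squared statistic of the 2 x k table of category counts."""
    n_ref, n_ana = len(reference), len(analysis)
    stat = 0.0
    for category in set(reference) | set(analysis):
        obs_ref, obs_ana = reference.count(category), analysis.count(category)
        pooled_share = (obs_ref + obs_ana) / (n_ref + n_ana)
        for observed, n in ((obs_ref, n_ref), (obs_ana, n_ana)):
            expected = pooled_share * n
            stat += (observed - expected) ** 2 / expected
    return stat


# Made-up samples: a shifted continuous feature and a categorical feature.
ks_shifted = ks_statistic([1, 2, 3, 4], [3, 4, 5, 6])
chi2_shifted = chi2_statistic(["a"] * 10 + ["b"] * 10, ["a"] * 20)
```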

Target distribution drift is monitored by calculating the mean occurrence of positive events in combination with the 2-sample chi-squared test. Bear in mind that this operation requires the presence of actuals.

3. Intelligent alerting

Because NannyML can estimate performance, it is possible to weed out data drift alerts that do not impact expected performance, combatting alert fatigue. Besides linking data drift issues to drops in performance, it is also possible to prioritise alerts according to other criteria using NannyML's Ranker.
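Ranking by alert count can be pictured with the short sketch below. It is a stand-alone illustration, not the Ranker's implementation: the function name, the input format (one boolean alert flag per chunk), and the feature names are all made up for the example.

```python
def rank_by_alert_count(alerts_per_feature, only_drifting=False):
    """Rank features by how many chunks raised an alert, most alerts first."""
    counts = {name: sum(flags) for name, flags in alerts_per_feature.items()}
    if only_drifting:
        # Drop features that never raised an alert.
        counts = {name: n for name, n in counts.items() if n > 0}
    # Sort by descending count, breaking ties alphabetically.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical per-chunk alert flags for three features.
alerts = {
    "distance_from_office": [False, False, True, True, True],
    "salary_range": [False, False, False, True, True],
    "tenure": [False, False, False, False, False],
}
ranking = rank_by_alert_count(alerts)
drifting_only = rank_by_alert_count(alerts, only_drifting=True)
```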

🚀 Getting started

Install NannyML

From PyPI:

pip install nannyml

Here be dragons! Use the latest development version of NannyML at your own risk:

python -m pip install git+https://github.com/NannyML/nannyml

Quick Start

The following snippet is based on our latest release.

import pandas as pd
import nannyml as nml

# Load dummy data
reference, analysis, analysis_target = nml.load_synthetic_binary_classification_dataset()
data = pd.concat([reference, analysis], ignore_index=True)

# Extract meta data
metadata = nml.extract_metadata(data = reference, model_type='classification_binary', exclude_columns=['identifier'])
metadata.target_column_name = 'work_home_actual'

# Choose a chunker or set a chunk size
chunk_size = 5000

# Estimate model performance
estimator = nml.CBPE(model_metadata=metadata, metrics=['roc_auc'], chunk_size=chunk_size)
estimator.fit(reference)
estimated_performance = estimator.estimate(data=data)

figure = estimated_performance.plot(metric='roc_auc', kind='performance')
figure.show()

# Detect multivariate feature drift
multivariate_calculator = nml.DataReconstructionDriftCalculator(model_metadata=metadata, chunk_size=chunk_size)
multivariate_calculator.fit(reference_data=reference)
multivariate_results = multivariate_calculator.calculate(data=data)

figure = multivariate_results.plot(kind='drift')
figure.show()

# Detect univariate feature drift
univariate_calculator = nml.UnivariateStatisticalDriftCalculator(model_metadata=metadata, chunk_size=chunk_size)
univariate_calculator.fit(reference_data=reference)
univariate_results = univariate_calculator.calculate(data=data)

# Rank features based on number of alerts
ranker = nml.Ranker.by('alert_count')
ranked_features = ranker.rank(univariate_results, model_metadata=metadata, only_drifting = False)

for feature in ranked_features.feature:
    figure = univariate_results.plot(kind='feature_distribution', feature_label=feature)
    figure.show()

📖 Documentation

🦸 Contributing and Community

We want to build NannyML together with the community! The easiest way to contribute at the moment is to propose new features or log bugs under issues. For more information, have a look at how to contribute.

🙋 Get help

The best place to ask for help is in the community Slack. Feel free to join and ask questions or raise issues. Someone will definitely respond to you.

🥷 Stay updated

If you want to stay up to date with recent changes to the NannyML library, you can subscribe to our release notes. For thoughts on post-deployment data science from the NannyML team, feel free to visit our blog. You can also sign up for our newsletter, which brings together the best papers, articles, news, and open-source libraries highlighting the ML challenges after deployment.

📄 License

NannyML is distributed under an Apache License Version 2.0. A complete version can be found here. All contributions will be distributed under this license.

Comments
  • Can't load dataset

    • nannyml version: 0.4.0
    • Python version: 3.8.0
    • Operating System: 5.10.102.1-microsoft-standard-WSL2 ; Ubuntu 18.04.6 LTS

    Description

    I'm trying to walk through the Quickstart guide and getting the following error: module 'nannyml' has no attribute 'load_synthetic_binary_classification_dataset'

    What I Did

    import pandas as pd
    import nannyml as nml
    from IPython.display import display
    reference, analysis, analysis_target = nml.load_synthetic_binary_classification_dataset()
    display(analysis.head())
    display(reference.head())
    
    opened by nlp-sid 10
  • Suggested changes to the documentation

    Some minor corrections / suggestions for the documentation:

    (0) A general comment: as far as I can tell, nowhere in the documentation is it explicitly stated that the model is not required to do the performance prediction (although it is implicit). Also, it's always assumed that the model is a machine learning one, but in theory the software is model-agnostic, as long as the model outputs conform to the expected formats, right?

    (1) https://docs.nannyml.com/latest/quick.html

    • replace it's with its (2 occurrences)
    • "This is why on the synthetic dataset it is provided in a separate object." --> "This is why in the synthetic dataset it is provided in a separate object."
    • Some words / phrases are capitalized in the middle of a sentence, seemingly at random, e.g. "Model Monitoring" or "Machine Learning".
    • In the first plot, I feel it would be important to also show the actual model performance (ROC AUC in this case). This is probably THE most crucial thing a potential user wants to see here: are the predictions correct?
    • Final sentence: "This drift is responsible for the potential negative impact in performance that we observed." The actual ROC AUC is never actually shown, so the reader has no idea what the change in performance really is (just has to trust it is true).

    (2) Glossary (https://docs.nannyml.com/latest/glossary.html#glossary)

    • In the entry for concept drift, it should probably be stated that the term is sometimes used (by others) with a broader definition that also includes things like label shift (?).
    • In the predicted scores entry, "calues" should be "values".
    • Univariate Drift Detection and Multivariate Drift Detection entries: "of our model" is superfluous and probably misleading in this context.
    • The entry for Model "Definition of a model." is a circular definition.
    • CBPE (Confidence-Based Perofmance Estimation): change "Perofmance" to "Performance"

    (3) https://docs.nannyml.com/latest/guides/data_drift.html

    • "instannce" --> "instance"
    • "consice" --> "concise"
    • The plot "Distribution over time for y_pred_proba" has the x and y axis labels swapped.
    • "occurance" --> "occurrence"
    • The section "Drift detection for model targets" seems to just end without much of a conclusion.

    (4) https://docs.nannyml.com/latest/guides/performance_estimation.html

    • The y-axis title on last plot is partially cropped; it would also be useful to change the x-axis to time, and add the analysis period line to make this plot easier to compare to the previous one that has the performance prediction.
    documentation 
    opened by humphrey-and-the-machine 9
  • Add bootstrapping options to chunk methods

    It would be nice if you'd allow bootstrapping (resampling with replacement) instead of non-overlapping chunks for the CBPE estimate. Ideally, something like this: https://stats.stackexchange.com/questions/96739/what-is-the-632-rule-in-bootstrapping
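For readers unfamiliar with the request, the sketch below shows resampling with replacement to put a percentile interval around a metric. It is a generic illustration with hypothetical names and made-up data, not a proposal for how NannyML's chunkers would expose it. On the linked ".632" question: a resample of size n contains on average about 1 - 1/e ≈ 63.2% of the distinct original rows.

```python
import random


def bootstrap_interval(values, metric, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for `metric` computed over `values`."""
    rng = random.Random(seed)
    n = len(values)
    # Each resample draws n items with replacement from the original data.
    stats = sorted(
        metric([values[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_resamples)
    )
    lower = stats[int((alpha / 2) * n_resamples)]
    upper = stats[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


def mean(xs):
    return sum(xs) / len(xs)


# 1 = correct prediction, 0 = incorrect (made-up data with 80% accuracy).
correct = [1] * 80 + [0] * 20
low, high = bootstrap_interval(correct, mean)
```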

    enhancement stale 
    opened by ai-noahdolev 8
  • Confidence bounds on CBPE plot go above 1.0 when ROC-AUC is 1.0

    • nannyml version: 0.2.0

    Description The confidence bounds of the CBPE plot go above 1.0 when ROC-AUC is one. They should be cut off at 1.0 as a ROC-AUC of over 1.0 is impossible.
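One way to express the requested cut-off is to clip the plotted bounds to the metric's valid range. The helper below is a hypothetical sketch, not the fix that was applied in NannyML.

```python
def clip_bounds(estimate, half_width, lower=0.0, upper=1.0):
    """Return confidence bounds around `estimate`, clipped to [lower, upper]."""
    low = max(lower, estimate - half_width)
    high = min(upper, estimate + half_width)
    return low, high


# An estimated ROC AUC of 1.0 with a made-up half-width of 0.03.
bounds = clip_bounds(1.0, 0.03)
```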

    image

    bug good first issue 
    opened by hakimelakhrass 8
  • Pandas data type 'string' not understood

    Describe the bug Running the Quickstart results in an error

    To Reproduce Steps to reproduce the behavior: Running:

    import pandas as pd
    import nannyml as nml
    from IPython.display import display
    
    # Load synthetic data
    reference, analysis, analysis_target = nml.load_synthetic_binary_classification_dataset()
    display(reference.head())
    display(analysis.head())
    
    # Choose a chunker or set a chunk size
    chunk_size = 5000
    
    # initialize, specify required data columns, fit estimator and estimate
    estimator = nml.CBPE(
       y_pred_proba='y_pred_proba',
       y_pred='y_pred',
       y_true='work_home_actual',
       timestamp_column_name='timestamp',
       metrics=['roc_auc'],
       chunk_size=chunk_size,
    )
    estimator = estimator.fit(reference)
    estimated_performance = estimator.estimate(analysis)
    
    # Show results
    figure = estimated_performance.plot(kind='performance', metric='roc_auc', plot_reference=True)
    figure.show()
    
    # Define feature columns
    feature_column_names = [
        col for col in reference.columns if col not in [
            'timestamp', 'y_pred_proba', 'period', 'y_pred', 'work_home_actual', 'identifier'
        ]]
    
    # Let's initialize the object that will perform the Univariate Drift calculations
    univariate_calculator = nml.UnivariateStatisticalDriftCalculator(
        feature_column_names=feature_column_names,
        timestamp_column_name='timestamp',
        chunk_size=chunk_size
    )
    univariate_calculator = univariate_calculator.fit(reference)
    univariate_results = univariate_calculator.calculate(analysis)
    # Plot drift results for all model inputs
    for feature in univariate_calculator.feature_column_names:
        figure = univariate_results.plot(
            kind='feature_drift',
            metric='statistic',
            feature_column_name=feature,
            plot_reference=True
        )
        figure.show()
    
    # Rank features based on number of alerts
    ranker = nml.Ranker.by('alert_count')
    ranked_features = ranker.rank(univariate_results, only_drifting = False)
    display(ranked_features)
    
    calc = nml.StatisticalOutputDriftCalculator(
        y_pred='y_pred',
        y_pred_proba='y_pred_proba',
        timestamp_column_name='timestamp'
    )
    calc.fit(reference)
    results = calc.calculate(analysis)
    
    figure = results.plot(kind='prediction_drift', plot_reference=True)
    figure.show()
    
    # Let's initialize the object that will perform Data Reconstruction with PCA
    rcerror_calculator = nml.DataReconstructionDriftCalculator(feature_column_names=feature_column_names, timestamp_column_name='timestamp', chunk_size=chunk_size).fit(reference_data=reference)
    # let's see Reconstruction error statistics for all available data
    rcerror_results = rcerror_calculator.calculate(analysis)
    figure = rcerror_results.plot(kind='drift', plot_reference=True)
    figure.show()
    

    Gives the following error:

    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    ~\anaconda3\lib\site-packages\nannyml\base.py in fit(self, reference_data, *args, **kwargs)
         94             self._logger.debug(f"fitting {str(self)}")
    ---> 95             return self._fit(reference_data, *args, **kwargs)
         96         except InvalidArgumentsException:
    
    ~\anaconda3\lib\site-packages\nannyml\drift\model_inputs\univariate\statistical\calculator.py in _fit(self, reference_data, *args, **kwargs)
        105         self.previous_reference_data = reference_data.copy()
    --> 106         self.previous_reference_results = self._calculate(self.previous_reference_data).data
        107 
    
    ~\anaconda3\lib\site-packages\nannyml\drift\model_inputs\univariate\statistical\calculator.py in _calculate(self, data, *args, **kwargs)
        116 
    --> 117         self.continuous_column_names, self.categorical_column_names = _split_features_by_type(
        118             data, self.feature_column_names
    
    ~\anaconda3\lib\site-packages\nannyml\base.py in _split_features_by_type(data, feature_column_names)
        229 
    --> 230     categorical_column_names = [col for col in feature_column_names if _column_is_categorical(data[col])]
        231 
    
    ~\anaconda3\lib\site-packages\nannyml\base.py in <listcomp>(.0)
        229 
    --> 230     categorical_column_names = [col for col in feature_column_names if _column_is_categorical(data[col])]
        231 
    
    ~\anaconda3\lib\site-packages\nannyml\base.py in _column_is_categorical(column)
        235 def _column_is_categorical(column: pd.Series) -> bool:
    --> 236     return column.dtype in ['object', 'string', 'category', 'bool']
        237 
    
    TypeError: data type 'string' not understood
    
    During handling of the above exception, another exception occurred:
    
    CalculatorException                       Traceback (most recent call last)
    <ipython-input-1-9ae82d7fa4d4> in <module>
         39     chunk_size=chunk_size
         40 )
    ---> 41 univariate_calculator = univariate_calculator.fit(reference)
         42 univariate_results = univariate_calculator.calculate(analysis)
         43 # Plot drift results for all model inputs
    
    ~\anaconda3\lib\site-packages\nannyml\base.py in fit(self, reference_data, *args, **kwargs)
         99             raise
        100         except Exception as exc:
    --> 101             raise CalculatorException(f"failed while fitting {str(self)}.\n{exc}")
        102 
        103     def calculate(self, data: pd.DataFrame, *args, **kwargs) -> AbstractCalculatorResult:
    
    CalculatorException: failed while fitting <nannyml.drift.model_inputs.univariate.statistical.calculator.UnivariateStatisticalDriftCalculator object at 0x0000022BBF196A30>.
    data type 'string' not understood
    

    Expected behavior The quickstart code runs without a problem.

    Additional context

    The user who had that issue was running python 3.8 on windows through a pycharm environment.

    I couldn't reproduce the error when I tried on my machine. Moreover when I guided the user to set up a new conda environment the error went away.

    However, maybe the way string type is defined here could be changed similar to suggestions such as these to cover more cases? I'd hold off on that until we see more users having the issue, since in this case a misconfigured environment is more likely the problem than a library compatibility issue.

    bug stale 
    opened by nikml 6
  • Example plots not showing

    Description

    I was trying to reproduce the example on the main readme. Plots are not showing. See below:

    image

    image

    I tried this with both versions 0.4.0/0.4.1

    If it helps, I'm using:

    jupyterlab==3.4.2
    pandas==1.4.1
    

    What I Did

    Example on readme, as is:

    import pandas as pd
    import nannyml as nml
    
    # Load synthetic data
    reference, analysis, analysis_target = nml.load_synthetic_binary_classification_dataset()
    data = pd.concat([reference, analysis], ignore_index=True)
    
    # Extract meta data
    metadata = nml.extract_metadata(data = reference, model_name='wfh_predictor', model_type='classification_binary', exclude_columns=['identifier'])
    metadata.target_column_name = 'work_home_actual'
    
    # Choose a chunker or set a chunk size
    chunk_size = 5000
    
    # Estimate model performance
    estimator = nml.CBPE(model_metadata=metadata, metrics=['roc_auc'], chunk_size=chunk_size)
    estimator.fit(reference)
    estimated_performance = estimator.estimate(data=data)
    
    figure = estimated_performance.plot(metric='roc_auc', kind='performance')
    figure.show()
    
    # Detect multivariate feature drift
    multivariate_calculator = nml.DataReconstructionDriftCalculator(model_metadata=metadata, chunk_size=chunk_size)
    multivariate_calculator.fit(reference_data=reference)
    multivariate_results = multivariate_calculator.calculate(data=data)
    
    figure = multivariate_results.plot(kind='drift')
    figure.show()
    
    # Detect univariate feature drift
    univariate_calculator = nml.UnivariateStatisticalDriftCalculator(model_metadata=metadata, chunk_size=chunk_size)
    univariate_calculator.fit(reference_data=reference)
    univariate_results = univariate_calculator.calculate(data=data)
    
    # Rank features based on number of alerts
    ranker = nml.Ranker.by('alert_count')
    ranked_features = ranker.rank(univariate_results, model_metadata=metadata, only_drifting = False)
    
    for feature in ranked_features.feature:
        figure = univariate_results.plot(kind='feature_distribution', feature_label=feature)
        figure.show()
    

    Thank you!

    opened by IgnacioPascale 6
  • extra dependencies not being installed

    Hi!

    When following the instructions and running poetry install -E test -E doc -E dev, extra packages such as tox are not installed. I think this is related to closed issue 67.

    This is related to this Poetry behaviour. Extra packages should come from dependencies and not dev-dependencies, otherwise they are not installed.

    I have tried to move the dev-dependencies packages within the dependencies section in pyproject.toml and after having updated the poetry.lock it works fine.

    Do you want me to send a PR with this? If you think it would be better to somewhat separate "dev" dependencies from other dependencies, an option would be to use Poetry dependency groups with something such as a [tool.poetry.group.test.dependencies] section. I can do that in the PR if you think it is better.

    question 
    opened by rfrenoy 6
  • Update univariate comparison

    • Changed confidence intervals to 95% instead of 99.99%
    • Changed figure sizes and font sizes to make the figures more readable
    • Added plots that zoom into the behaviour of methods on smaller shifts for the shifting mean and shifting SD experiments
    • Split apart uniform distribution in the categorical section
    opened by cartgr 4
  • Running nml.CBPE getting typeerror

    I got TypeError: __new__() missing 1 required positional argument: 'model_metadata' when running this chunk of code:

    estimator = nml.CBPE(
        y_pred_proba='y_pred_proba',
        y_pred='y_pred',
        y_true='y',
        timestamp_column_name='timestamp',
        metrics=['roc_auc', 'f1'],
        chunk_size=5000,
        problem_type='classification_binary',
    )

    NannyML version is 0.4.1

    What does the error describe here? Thanks for help

    opened by langmusi 4
  • nannyml can confuse months with days on some rows of a dataset

    Describe the bug NannyML gets the month and day values wrong on some rows: a date of 01-09-2018 (1st September 2018) can be interpreted as 09-01-2018 (9th January 2018).

    To Reproduce Install NannyML and run the following code in a jupyter notebook:

    import wget
    from pathlib import Path
    import pandas as pd
    import nannyml as nml
    
    
    url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/00618/Steel_industry_data.csv'
    download_folder = Path.home().joinpath("Downloads")
    filename = wget.download(url, out=str(download_folder))
    
    data = pd.read_csv(filename, header=0)
    display(data.head())
    
    features_selected = list(data.columns)[2:]
    
    data['partition'] = 'train'
    ind_ = data.shape[0]//3
    data.loc[ind_:2*ind_, 'partition'] = 'reference'
    data.loc[2*ind_: , 'partition'] = 'analysis'
    
    reference = data.loc[data.partition == 'reference', :].reset_index(drop=True)
    analysis = data.loc[data.partition == 'analysis', :].reset_index(drop=True)
    
    calc = nml.UnivariateStatisticalDriftCalculator(
        feature_column_names=[features_selected[0]],
        timestamp_column_name='date'
    )
    calc.fit(reference)
    results = calc.calculate(analysis)
    
    drift_fig = results.plot(kind='feature_drift', feature_column_name=features_selected[0], plot_reference=True)
    drift_fig.show()
    
    display(results.data)
    display(analysis)
    
    

    A partial screenshot of the results.data can be seen below:

    Screenshot 2022-08-30 at 15-28-31 NP18 UCI Ste… (2) - JupyterLab

    A partial screenshot of the first and last line of the analysis dataframe can be seen below:

    Screenshot 2022-08-30 at 15-27-04 NP18 UCI Ste… (2) - JupyterLab

    By comparing the two screenshots you can see that some dates have been altered.

    Expected behavior The dates would be recognized correctly. The misidentification of dates can be seen by comparing the start and end dates of the results data and the start and end dates of the analysis dataset.
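The ambiguity itself is easy to demonstrate with the standard library. The snippet below is only a sketch of the failure mode: the dash-separated format is taken from the example date in this report and may not match the CSV's actual layout, and the helper name is hypothetical. If the dates are parsed with pandas, passing an explicit `format=` (or `dayfirst=True`) to `pd.to_datetime` before handing the data to NannyML is one way to avoid per-row guessing.

```python
from datetime import datetime

raw = "01-09-2018"  # intended as 1 September 2018

# The same string is valid under both conventions, with different meanings.
day_first = datetime.strptime(raw, "%d-%m-%Y")
month_first = datetime.strptime(raw, "%m-%d-%Y")


def parses_month_first(text):
    """Return True if `text` is a valid month-first date."""
    try:
        datetime.strptime(text, "%m-%d-%Y")
        return True
    except ValueError:
        return False


# A day above 12 can only be read day-first, so a parser that guesses
# row by row may mix the two conventions within one column.
unambiguous = parses_month_first("13-09-2018")
```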

    Screenshots & scripts The problem is also visible on the following univariate drift plot:

    newplot(24)

    stale 
    opened by nikml 4
  • The example from Quick Start does not work

    • nannyml version: 0.3.2
    • Python version: 3.10.4
    • Operating System: macos 12.0.1

    Hello

    1. reference, analysis, analysis_target = nml.load_synthetic_binary_classification_dataset() does not work. error msg: AttributeError: module 'nannyml' has no attribute 'load_synthetic_binary_classification_dataset' Looks like instead the method load_synthetic_sample() must be used

    2. metadata = nml.extract_metadata(data = reference, model_name='wfh_predictor', model_type='classification_binary', exclude_columns=['identifier']) also does not work. error msg: TypeError: extract_metadata() got an unexpected keyword argument 'model_type'

    opened by cat-zeppelin 4
  • Improve readability of sequence checking conditionals

    Issue here

    Summary

    • Improve readability of sequence checking conditionals by simplifying them

    Testing Approach

    • Since this is basically a refactor, I just ran poetry run tox to check if there were any regressions. image

    Note

    Trying to make contributing to open source stick. Feel free to disregard.

    opened by jrggementiza 1
  • Improve readability of sequence checking conditionals

    Motivation

    • Was poking around the codebase when I noticed sequence checking could use improvement in terms of code readability

    Solution

    Recommended Changes

    • We can simplify the following code fragments:
    - if len(some_sequence) > 0:
    + if some_sequence:
          do_something()
    
    - if len(some_other_sequence) == 0:
    + if not some_other_sequence:
          do_something_else()
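A quick check of the equivalence for built-in sequences is sketched below (the `describe` helper is hypothetical). Two caveats apply before making the change everywhere: `None` passes `if not x` but would have raised on `len(x)`, and a NumPy array with more than one element raises `ValueError` when used in a boolean context, so the rewrite is only safe where the value is known to be a plain list, tuple, or string.

```python
def describe(sequence):
    """Use truthiness instead of an explicit length check."""
    # Empty built-in sequences are falsy, non-empty ones are truthy.
    return "non-empty" if sequence else "empty"


samples = [[], [0], (), (None,), "", "x"]
# For built-in sequences the two spellings always agree.
agreement = [(len(s) > 0) == bool(s) for s in samples]
labels = [describe(s) for s in samples]
```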
    

    Additional Context

    Note

    Trying to make contributing to open source stick. Feel free to disregard.

    enhancement 
    opened by jrggementiza 0
  • Use np.histogram_bin_edges to compute bin edges for ECE

    What is this PR for ?

    This PR is intended to improve the binning used in the calibration step by using np.histogram_bin_edges instead of the current _get_bin_index_edges function.

    It also adds a new calibration_bin_count parameter to CBPE to be able to tune the number of bins used in the needs_calibration function.

    Please note that the default value for bin_count is still 10. However, it would probably be better to switch to auto or fd, as these would generally work better.
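For context, the snippet below compares a fixed bin count with the Freedman-Diaconis (`fd`) rule in `np.histogram_bin_edges`. The data are made up, and restricting the `range` to [0, 1] is an assumption for probability-like inputs; neither detail is taken from the PR itself.

```python
import numpy as np

rng = np.random.default_rng(0)
# Made-up predicted probabilities, skewed towards 0.
y_pred_proba = rng.beta(2, 5, size=1000)

# Ten equal-width bins over [0, 1] give eleven edges.
fixed_edges = np.histogram_bin_edges(y_pred_proba, bins=10, range=(0.0, 1.0))

# 'fd' picks the bin width from the data's interquartile range and sample size.
fd_edges = np.histogram_bin_edges(y_pred_proba, bins="fd", range=(0.0, 1.0))
```

With a data-driven rule the number of bins adapts to the sample, which is the trade-off the PR description weighs against a fixed default.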

    How was it tested?

    New tests added in tests/test_calibration.py to validate the new behaviour. poetry run tox is passing.

    opened by Jebq 2
  • NannyML (quickstart code) silently fails to create a univariate drift plot when too many features are selected

    Describe the bug

    When MANY features are passed to the univariate drift calculator and we try to plot them all at once, plotting fails silently. The machine keeps computing but, presumably, produces nothing.

    To Reproduce

    Posting example code since a full reproducible example relies on internal compute resources.

    
    # load data and create reference and analysis dataframes
    # in this cases fabert openml dataset
    
    chunker = nml.SizeBasedChunker(chunk_size=_suggested_chunk_size)
    
    
    univariate_calculator = nml.UnivariateDriftCalculator(
        column_names=feature_column_names,
        continuous_methods=['jensen_shannon'],
        categorical_methods=['jensen_shannon'],
        chunker=chunker,
    )
    univariate_calculator = univariate_calculator.fit(reference)
    univariate_results = univariate_calculator.calculate(analysis)
    

    In our case, 800 features were selected. Trying to create one plot with:

    figure2 = univariate_results.filter(
        period='all',
        column_names=univariate_results.column_names, 
        methods=['jensen_shannon']).plot(kind='drift')
    

    fails, and the JupyterLab notebook keeps running while, presumably, doing nothing.

    Expected behavior A plot would be created.

    Additional context If instead we try to create plots one by one, things work fine:

    for ftr in univariate_results.column_names:
        _fgr = univariate_results.filter(
            period='all',
            column_names=[ftr], 
            methods=['jensen_shannon']
        ).plot(kind='drift')
        _fgr.write_image(f"{_figure_folder}/drift-{ftr}.svg")
    
    bug 
    opened by nikml 1
  • Inappropriate Error when a string is given for drift methods

    # missing code that creates relevant dataframes
    feature_column_names = train.drop(['fraud_reported', 'timestamp'],axis = 1).columns
    
    univariate_calculator = nml.UnivariateDriftCalculator(
        column_names=list(feature_column_names),
        continuous_methods='jensen_shannon')
    
    univariate_calculator = univariate_calculator.fit(test)
    univariate_results = univariate_calculator.calculate(val)
    
    ---------------------------------------------------------------------------
    InvalidArgumentsException                 Traceback (most recent call last)
    Cell In[23], line 7
          1 feature_column_names = train.drop(['fraud_reported', 'timestamp'],axis = 1).columns
          3 univariate_calculator = nml.UnivariateDriftCalculator(
          4     column_names=list(feature_column_names),
          5     continuous_methods='kolmogorov_smirnov')
    ----> 7 univariate_calculator = univariate_calculator.fit(test)
          8 univariate_results = univariate_calculator.calculate(val)
         10 # for column_name in univariate_calculator.continuous_column_names:
         11 #     figure = univariate_results.plot(
         12 #         kind='drift',
       (...)
         16 #     )
         17 #     figure.show()
    
    File ~/.conda/envs/nanny/lib/python3.10/site-packages/nannyml/base.py:138, in AbstractCalculator.fit(self, reference_data, *args, **kwargs)
        136 try:
        137     self._logger.debug(f"fitting {str(self)}")
    --> 138     return self._fit(reference_data, *args, **kwargs)
        139 except InvalidArgumentsException:
        140     raise
    
    File ~/.conda/envs/nanny/lib/python3.10/site-packages/nannyml/usage_logging.py:189, in log_usage.<locals>.logging_decorator.<locals>.logging_wrapper(*args, **kwargs)
        187 finally:
        188     if runtime_exception is not None:
    --> 189         raise runtime_exception
        190     else:
        191         return res
    
    File ~/.conda/envs/nanny/lib/python3.10/site-packages/nannyml/usage_logging.py:142, in log_usage.<locals>.logging_decorator.<locals>.logging_wrapper(*args, **kwargs)
        139 runtime_exception, res = None, None
        140 try:
        141     # run original function
    --> 142     res = func(*args, **kwargs)
        143 except BaseException as exc:
        144     runtime_exception = exc
    
    File ~/.conda/envs/nanny/lib/python3.10/site-packages/nannyml/drift/univariate/calculator.py:109, in UnivariateDriftCalculator._fit(self, reference_data, *args, **kwargs)
        104 self.continuous_column_names, self.categorical_column_names = _split_features_by_type(
        105     reference_data, self.column_names
        106 )
        108 for column_name in self.continuous_column_names:
    --> 109     self._column_to_models_mapping[column_name] += [
        110         MethodFactory.create(key=method, feature_type=FeatureType.CONTINUOUS, chunker=self.chunker).fit(
        111             reference_data[column_name]
        112         )
        113         for method in self.continuous_method_names
        114     ]
        116 for column_name in self.categorical_column_names:
        117     self._column_to_models_mapping[column_name] += [
        118         MethodFactory.create(key=method, feature_type=FeatureType.CATEGORICAL, chunker=self.chunker).fit(
        119             reference_data[column_name]
        120         )
        121         for method in self.categorical_method_names
        122     ]
    
    File ~/.conda/envs/nanny/lib/python3.10/site-packages/nannyml/drift/univariate/calculator.py:110, in <listcomp>(.0)
        104 self.continuous_column_names, self.categorical_column_names = _split_features_by_type(
        105     reference_data, self.column_names
        106 )
        108 for column_name in self.continuous_column_names:
        109     self._column_to_models_mapping[column_name] += [
    --> 110         MethodFactory.create(key=method, feature_type=FeatureType.CONTINUOUS, chunker=self.chunker).fit(
        111             reference_data[column_name]
        112         )
        113         for method in self.continuous_method_names
        114     ]
        116 for column_name in self.categorical_column_names:
        117     self._column_to_models_mapping[column_name] += [
        118         MethodFactory.create(key=method, feature_type=FeatureType.CATEGORICAL, chunker=self.chunker).fit(
        119             reference_data[column_name]
        120         )
        121         for method in self.categorical_method_names
        122     ]
    
    File ~/.conda/envs/nanny/lib/python3.10/site-packages/nannyml/drift/univariate/methods.py:152, in MethodFactory.create(cls, key, feature_type, **kwargs)
        149     raise InvalidArgumentsException(f"cannot create method given a '{type(key)}'. Please provide a string.")
        151 if key not in cls.registry:
    --> 152     raise InvalidArgumentsException(
        153         f"unknown method key '{key}' given. "
        154         "Should be one of ['kolmogorov_smirnov', 'jensen_shannon', 'wasserstein', 'chi2', "
        155         "'jensen_shannon', 'l_infinity', 'hellinger']."
        156     )
        158 if feature_type not in cls.registry[key]:
        159     raise InvalidArgumentsException(f"method {key} does not support {feature_type.value} features.")
    
    InvalidArgumentsException: unknown method key 'k' given. Should be one of ['kolmogorov_smirnov', 'jensen_shannon', 'wasserstein', 'chi2', 'jensen_shannon', 'l_infinity', 'hellinger'].
    
    

    The problem seems to be that the user incorrectly gave a string instead of a list of strings. However, instead of getting a proper error saying that a list is expected, they get an error about an unknown method 'k'.

    I'm assuming (because I didn't dig into the code) that the string is turned into a list and the first letter is then matched against our string options.

    P.S. Alternatively, we could also accept a string here and handle it appropriately. Nice to have, but less important than an error that correctly points out what went wrong.
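
The reported behaviour is easy to reproduce in plain Python: iterating over a string yields its characters, so the first "method" taken from 'kolmogorov_smirnov' is 'k'. Below is a minimal sketch of the friendlier handling suggested above. The helper name normalize_methods is hypothetical and not part of NannyML; it only illustrates wrapping a bare string in a list and rejecting anything else with a clear message.

```python
from typing import List, Union


def normalize_methods(methods: Union[str, List[str]]) -> List[str]:
    """Return a list of method names (hypothetical helper, not NannyML code)."""
    if isinstance(methods, str):
        # Treat a bare string as one method name instead of
        # iterating over it character by character.
        return [methods]
    if isinstance(methods, (list, tuple)) and all(isinstance(m, str) for m in methods):
        return list(methods)
    raise TypeError(
        f"expected a string or a list of strings, got {type(methods).__name__}"
    )


# Iterating over a string yields single characters, which explains the 'k'.
first_key = [method for method in "kolmogorov_smirnov"][0]
print(first_key)  # k

print(normalize_methods("kolmogorov_smirnov"))  # ['kolmogorov_smirnov']
```

Whether to accept the string or to raise is a design choice; either way the message should name the offending argument rather than one of its characters.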

    bug 
    opened by nikml 1
Releases(v0.8.1)
  • v0.8.1(Dec 1, 2022)

    Changed

    • Thorough refactor of the nannyml.drift.ranker module. The abstract base class and factory have been dropped in favor of a more flexible approach.
    • Thorough refactor of our Plotly-based plotting modules. These have been rewritten from scratch to make them more modular and composable. This will allow us to deliver more powerful and meaningful visualizations faster.

    Added

    • Added a new univariate drift method: the Hellinger distance, used for continuous variables.
    • Added an extensive write-up on when to use which univariate drift method.
    • Added a new way to rank the results of univariate drift calculation. The CorrelationRanker ranks columns based on the correlation between the drift value and the change in realized or estimated performance. Read all about it in the ranking documentation.
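
The idea behind correlation-based ranking can be sketched in plain Python. This is not the CorrelationRanker implementation: the function names, the choice of Pearson correlation and the ranking by absolute value are assumptions made for illustration, and the numbers are made up.

```python
import math
from typing import Dict, List, Sequence, Tuple


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no correlation signal
    return cov / math.sqrt(var_x * var_y)


def rank_by_correlation(
    drift_per_column: Dict[str, Sequence[float]],
    performance_change: Sequence[float],
) -> List[Tuple[str, float]]:
    """Sort columns by |correlation| between per-chunk drift and performance change."""
    scores = [
        (column, abs(pearson(values, performance_change)))
        for column, values in drift_per_column.items()
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Made-up per-chunk values: one drift series per column, one performance series.
performance_change = [0.0, 0.01, 0.05, 0.10]
drift = {
    "salary": [0.20, 0.19, 0.21, 0.20],
    "age": [0.10, 0.12, 0.30, 0.50],
}
ranking = rank_by_correlation(drift, performance_change)
print([column for column, _ in ranking])  # 'age' comes first
```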

    Fixed

    • Disabled usage logging for our GitHub workflows
    • Allow passing a single string to the metrics parameter of the result.filter() function, as per special request.
  • v0.8.0(Nov 24, 2022)

    Changed

    • Updated mypy to a new version, immediately resulting in some new checks that failed.

    Added

    • Added new univariate drift methods: the Wasserstein distance for continuous variables, and the L-Infinity distance for categorical variables.
    • Added usage logging to our key functions. Check out the docs to find out more on what, why, how, and how to disable it if you want to.
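
As an illustration of the categorical method mentioned above, the L-Infinity distance between two samples can be taken as the largest absolute difference in relative frequency over all categories. The sketch below follows that textbook definition; the function name is hypothetical and this is not NannyML's code.

```python
from collections import Counter
from typing import Hashable, Sequence


def l_infinity_distance(reference: Sequence[Hashable], analysis: Sequence[Hashable]) -> float:
    """Largest absolute difference in category frequency (both samples non-empty)."""
    ref_counts, ana_counts = Counter(reference), Counter(analysis)
    ref_total, ana_total = len(reference), len(analysis)
    categories = set(ref_counts) | set(ana_counts)
    # Counter returns 0 for categories missing from one of the samples.
    return max(
        abs(ref_counts[c] / ref_total - ana_counts[c] / ana_total)
        for c in categories
    )


reference = ["red"] * 5 + ["blue"] * 5
analysis = ["red"] * 8 + ["blue"] * 2
distance = l_infinity_distance(reference, analysis)
print(round(distance, 3))  # roughly 0.3: 'red' moves from 50% to 80%
```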

    Fixed

    • Fixed and updated various parts of the docs, reported at warp speed! Thanks @NeoKish!
    • Fixed mypy issues concerning 'implicit optionals'.
  • v0.7.0(Nov 7, 2022)

    Changed

    • Updated the handling of "leftover" observations when using the SizeBasedChunker and CountBasedChunker. The parameter for tweaking that behavior has been renamed to incomplete and can be set to keep, drop or append. Default behavior for both is now to append leftover observations to the last full chunk.
    • Refactored the nannyml.drift module. The intermediate structural level (model_inputs, model_outputs, targets) has been removed and turned into a single unified UnivariateDriftCalculator. The old built-in statistics have been re-implemented as Methods, allowing us to add new methods to detect univariate drift.
    • Simplified a lot of the codebase (but also complicated some bits) by storing results internally as multilevel-indexed DataFrames. This means we no longer have to 'convey information' by encoding data column names and method names in the names of result columns. We've introduced a new paradigm to deal with results. Drill down to the data you really need by using the filter method, which returns a new Result instance, with a smaller 'scope'. Then turn this Result into a DataFrame using the to_df method.
    • Changed the structure of the pyproject.toml file due to a Poetry upgrade to version 1.2.1.
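
The three options for leftover observations described in the first item above can be sketched as follows. The function chunk_by_size is a hypothetical stand-in, not the SizeBasedChunker itself; what happens when there is no full chunk at all is an assumption of this sketch.

```python
from typing import List, Sequence


def chunk_by_size(rows: Sequence, chunk_size: int, incomplete: str = "append") -> List[list]:
    """Split rows into chunks of chunk_size, handling leftovers per `incomplete`."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if incomplete not in {"keep", "drop", "append"}:
        raise ValueError("incomplete must be 'keep', 'drop' or 'append'")

    full_count = len(rows) // chunk_size
    chunks = [list(rows[i * chunk_size:(i + 1) * chunk_size]) for i in range(full_count)]
    leftover = list(rows[full_count * chunk_size:])

    if not leftover or incomplete == "drop":
        return chunks
    if incomplete == "append" and chunks:
        chunks[-1].extend(leftover)  # merge leftovers into the last full chunk
    else:
        # 'keep', or 'append' with no full chunk to append to (assumption).
        chunks.append(leftover)
    return chunks


rows = list(range(7))
print(chunk_by_size(rows, 3, "keep"))    # [[0, 1, 2], [3, 4, 5], [6]]
print(chunk_by_size(rows, 3, "drop"))    # [[0, 1, 2], [3, 4, 5]]
print(chunk_by_size(rows, 3, "append"))  # [[0, 1, 2], [3, 4, 5, 6]]
```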

    Added

    • Expanded the nannyml.io module with new Writer implementations: DatabaseWriter that exports data into multiple tables in a relational database and the PickleFileWriter which stores the pickled Results on local/remote/cloud disk.
    • Added a new univariate drift detection method based on the Jensen-Shannon distance. Used within the UnivariateDriftCalculator.
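
For readers unfamiliar with the metric, the Jensen-Shannon distance between two discrete distributions is the square root of the Jensen-Shannon divergence. The sketch below uses base-2 logarithms so the result lies between 0 and 1. It assumes both inputs are already probability vectors over the same bins; how NannyML bins continuous values is not covered here.

```python
import math
from typing import Sequence


def _kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_distance(p: Sequence[float], q: Sequence[float]) -> float:
    """Square root of the Jensen-Shannon divergence of two probability vectors."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of bins")
    # The mixture m is positive wherever p or q is, so the ratios are defined.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    divergence = 0.5 * _kl_divergence(p, m) + 0.5 * _kl_divergence(q, m)
    return math.sqrt(max(divergence, 0.0))  # guard against tiny negative rounding


print(jensen_shannon_distance([0.5, 0.5], [0.5, 0.5]))  # 0.0
print(jensen_shannon_distance([1.0, 0.0], [0.0, 1.0]))  # 1.0
```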

    Fixed

    • Added lightgbm installation instructions to our installation guide.
  • v0.6.3(Sep 22, 2022)

    Changed

    • dependencybot dependency updates
    • stalebot setup

    Fixed

    • CBPE now uses uncalibrated y_pred_proba values to calculate realized performance. Fixed for both binary and multiclass use cases (#98)
    • Fix an issue where reference data was rendered incorrectly on joy plots
    • Updated the 'California Housing' example docs, thanks for the help @NeoKish
    • Fix lower confidence bounds and thresholds under zero for regression cases. When the lower limit is set to 0, the lower threshold will not be plotted. (#127)
  • v0.6.2(Sep 16, 2022)

    Changed

    • Made the timestamp_column_name parameter, previously required by all calculators and estimators, optional. The main consequence is that plots now have a chunk-index-based x-axis when no timestamp column name is given. You also cannot chunk by period when the timestamp column name is not specified.

    Fixed

    • Added missing s3fs dependency
    • Fixed outdated plotting kind constants in the runner (used by CLI)
    • Fixed some missing images and incorrect version numbers in the README, thanks @NeoKish!

    Added

    • Added a lot of additional tests, mainly concerning plotting and the Runner class
  • v0.6.1(Sep 9, 2022)

    Changed

    • Use the problem_type parameter to determine the correct graph to output when plotting model output drift

    Fixed

    • Showing the wrong plot title for DLE estimation result plots, thanks @NeoKish
    • Fixed incorrect plot kinds in some error feedback for the model output drift calculator
    • Fixed missing problem_type argument in the Quickstart guide
    • Fix incorrect visualization of confidence bands on reference data in DLE and CBPE result plots
  • v0.6.0(Sep 8, 2022)

    Added

    • Added support for regression problems across all calculators and estimators. In some cases a problem_type parameter is now required during calculator/estimator initialization; this is a breaking change. Read more about using regression in our tutorials and about our new performance estimation for regression using the Direct Loss Estimation (DLE) algorithm.

    Changed

    • Improved tox running speed by skipping some unnecessary package installations. Thanks @baskervilski!

    Fixed

    • Fixed an issue where some Pandas column datatypes were not recognized as continuous by NannyML, causing them to be dropped in calculations. Thanks for reporting @Dbhasin1!
    • Fixed an issue where some helper columns for visualization crept into the stored reference results. Good catch @Dbhasin1!
    • Fixed an issue where a Reader instance would raise a WriteException. Thanks for those eagle eyes @baskervilski!
  • v0.5.3(Aug 30, 2022)

    Changed

    • We've completely overhauled the way we determine the "stability" of our estimations. We've moved on from determining a minimum Chunk size to estimating the sampling error for an operation on a Chunk.
      • A sampling error value will be provided per metric per Chunk in the result data for reconstruction error multivariate drift calculator, all performance calculation metrics and all performance estimation metrics.
      • Confidence bounds are now also based on this sampling error and will display a range around an estimation +/- 3 times the sampling error in CBPE and reconstruction error multivariate drift calculator. Be sure to check out our in-depth documentation on how it works or dive right into the implementation.
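
The rule stated above (estimate plus or minus three times the sampling error) amounts to the following arithmetic. The function name and the example values are hypothetical; how the sampling error itself is obtained for each metric is described in the documentation and is not reproduced here.

```python
from typing import Tuple

SAMPLING_ERROR_MULTIPLIER = 3  # the "+/- 3 times" factor from the release note


def confidence_band(estimate: float, sampling_error: float) -> Tuple[float, float]:
    """Return (lower, upper) bounds around a per-chunk estimate."""
    margin = SAMPLING_ERROR_MULTIPLIER * sampling_error
    return estimate - margin, estimate + margin


# Made-up values: an estimated metric of 0.90 with a sampling error of 0.01.
lower, upper = confidence_band(0.90, 0.01)
print(round(lower, 2), round(upper, 2))  # 0.87 0.93
```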

    Fixed

    • Fixed issue where an outdated version of Numpy caused Pandas to fail reading string columns in some scenarios (#93). Thank you, @Bernhard and @Gabriel for the investigative work!
  • v0.5.2(Aug 17, 2022)

  • v0.5.1(Aug 16, 2022)

    Added

    • Added simple CLI implementation to support automation and MLOps toolchain use cases. Supports reading/writing to cloud storage using S3, GCS, ADL, ABFS and AZ protocols. Containerized version available at dockerhub.

    Changed

    • make clean now also clears __pycache__
    • Fixed some inconsistencies in docstrings (they still need some additional love though)
  • v0.5.0(Jul 7, 2022)

  • v0.4.1(May 19, 2022)

    Added

    • Added limited support for regression use cases: create or extract RegressionMetadata and use it for drift detection. Performance estimation and calculation require more research.

    Changed

    • DefaultChunker splits into 10 chunks of equal size.
    • SizeBasedChunker no longer drops incomplete last chunk by default, but this is now configurable behavior.
  • v0.4.0(May 13, 2022)

    Added

    • Added support for new metrics in the Confidence Based Performance Estimator (CBPE). It now estimates roc_auc, f1, precision, recall, specificity and accuracy.
    • Added support for multiclass classification. This includes
      • Specifying multiclass classification metadata + support in automated metadata extraction (by introducing a model_type parameter).
      • Support for all CBPE metrics.
      • Support for realized performance calculation using the PerformanceCalculator.
      • Support for all types of drift detection (model inputs, model output, target distribution).
      • A new synthetic toy dataset.

    Changed

    • Removed the identifier property from the ModelMetadata class. Joining analysis data and analysis target values should be done upfront or index-based.
    • Added an exclude_columns parameter to the extract_metadata function. Use it to specify the columns that should not be considered as model metadata or features.
    • All fit methods now return the fitted object. This allows chaining Calculator/Estimator instantiation and fitting into a single line.
    • Custom metrics are no longer supported in the PerformanceCalculator. Only the predefined metrics remain supported.
    • Big documentation revamp: we've tweaked overall structure, page structure and incorporated lots of feedback.
    • Improvements to consistency and readability for the 'hover' visualization in the step plots, including consistent color usage, conditional formatting, icon usage etc.
    • Improved indication of "realized" and "estimated" performance in all CBPE step plots (changes to hover, axes and legends)
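
Returning the fitted object from fit, as noted in the list above, is what makes one-line chaining possible. A minimal sketch with a toy class (ToyCalculator is hypothetical and not part of NannyML):

```python
from typing import Sequence


class ToyCalculator:
    """Toy stand-in that stores the mean of the reference data when fitted."""

    def __init__(self) -> None:
        self.reference_mean = None

    def fit(self, reference_data: Sequence[float]) -> "ToyCalculator":
        self.reference_mean = sum(reference_data) / len(reference_data)
        return self  # returning self enables chaining

    def calculate(self, data: Sequence[float]) -> float:
        """Difference between the mean of `data` and the reference mean."""
        if self.reference_mean is None:
            raise RuntimeError("call fit() before calculate()")
        return sum(data) / len(data) - self.reference_mean


# Instantiation and fitting in a single line.
calculator = ToyCalculator().fit([1.0, 2.0, 3.0])
print(calculator.calculate([3.0, 4.0, 5.0]))  # 2.0
```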

    Fixed

    • Updated homepage in project metadata
    • Added missing metadata modification to the quickstart
    • Perform some additional checks on reference data during preprocessing
    • Various documentation suggestions (#58)
  • v0.3.2(May 3, 2022)

  • v0.3.0(Apr 8, 2022)

    Added

    • Added support for both predicted labels and predicted probabilities in ModelMetadata.
    • Support for monitoring model performance metrics using the PerformanceCalculator.
    • Support for monitoring target distribution using the TargetDistributionCalculator

    Changed

    • Plotting will default to using step plots.
    • Restructured the nannyml.drift package and subpackages. Breaking changes!
    • Metadata completeness check will now fail when there are features of FeatureType.UNKNOWN.
    • Chunk date boundaries are now calculated differently for a PeriodBasedChunker, using the theoretical period for boundaries as opposed to the observed boundaries within the chunk observations.
    • Updated version of the black pre-commit hook due to breaking changes in its click dependency.
    • The minimum chunk size will now be provided by each individual calculator / estimator / metric, allowing for each of them to warn the end user when chunk sizes are suboptimal.
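
The difference between observed and theoretical chunk boundaries mentioned above can be illustrated for a monthly period. This is only a sketch: both function names are hypothetical, all timestamps are assumed to fall in the same month, and the exact end-of-period convention used by the PeriodBasedChunker is not taken from the source.

```python
import calendar
from datetime import datetime
from typing import Sequence, Tuple


def observed_boundaries(timestamps: Sequence[datetime]) -> Tuple[datetime, datetime]:
    """Boundaries taken from the first and last observation in the chunk."""
    return min(timestamps), max(timestamps)


def theoretical_month_boundaries(timestamps: Sequence[datetime]) -> Tuple[datetime, datetime]:
    """Boundaries of the calendar month that contains the chunk's observations."""
    first = min(timestamps)
    last_day = calendar.monthrange(first.year, first.month)[1]
    start = datetime(first.year, first.month, 1)
    end = datetime(first.year, first.month, last_day, 23, 59, 59, 999999)
    return start, end


timestamps = [datetime(2022, 3, 5, 10, 0), datetime(2022, 3, 27, 18, 30)]
print(observed_boundaries(timestamps))           # 5 March 10:00 to 27 March 18:30
print(theoretical_month_boundaries(timestamps))  # 1 March 00:00 to end of 31 March
```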

    Fixed

    • Restrict version of the scipy dependency to be >=1.7.3, <1.8.0. Planned to be relaxed ASAP.
    • Deal with missing values in chunks causing NaN values when concatenating.
    • Crash when estimating CBPE without a target column present
    • Incorrect label in ModelMetadata printout