Bias and Fairness Audit Toolkit

Overview

Aequitas is an open-source bias audit toolkit for data scientists, machine learning researchers, and policymakers to audit machine learning models for discrimination and bias, and to make informed and equitable decisions around developing and deploying predictive tools.

Visit the Aequitas project website

Try out the Aequitas web application

Try out our interactive Colab notebook using the COMPAS dataset.

Documentation

You can find the toolkit documentation here.

For usage examples of the Python library, see our demo notebook from the KDD 2020 hands-on tutorial. Alternatively, have a look at the COMPAS notebook, which uses Aequitas on the ProPublica COMPAS Recidivism Risk Assessment dataset.

Installation

Aequitas is compatible with: Python 3.6+

Install Aequitas using pip:

pip install aequitas

If pip fails, try installing master from source:

git clone https://github.com/dssg/aequitas.git
cd aequitas
python setup.py install

(Note: be mindful of the Python version you use to run setup.py.)

You may then import the aequitas module from Python:

import aequitas

...or execute the auditor from the command line:

aequitas-report

...or launch the Web front-end from the command line (localhost):

python -m serve

Containerization

To build a Docker container of Aequitas:

docker build -t aequitas .

...or simply via manage:

manage container build

The Docker image's container defaults to launching the development Web server, though this can be overridden via the Docker "command" and/or "entrypoint".

To run such a container, supporting the Web server, on-the-fly:

docker run -p 5000:5000 -e "HOST=0.0.0.0" aequitas

...or, manage a development container via manage:

manage container [create|start|stop]

To contact the team, please email us at [aequitas at uchicago dot edu]

Aequitas Group Metrics

Below are descriptions of the absolute bias metrics calculated by Aequitas.

Metric | Formula | Description
--- | --- | ---
Predicted Positive | PP_g | The number of entities within a group where the decision is positive, i.e., score = 1.
Total Predictive Positive | K = sum of PP_g over all groups of an attribute | The total number of entities predicted positive across groups defined by a given attribute.
Predicted Negative | PN_g | The number of entities within a group where the decision is negative, i.e., score = 0.
Predicted Prevalence | PPrev_g = PP_g / \|g\| | The fraction of entities within a group which were predicted as positive.
Predicted Positive Rate | PPR_g = PP_g / K | The fraction of the entities predicted as positive that belong to a certain group.
False Positive | FP_g | The number of entities of the group with score = 1 and label_value = 0.
False Negative | FN_g | The number of entities of the group with score = 0 and label_value = 1.
True Positive | TP_g | The number of entities of the group with score = 1 and label_value = 1.
True Negative | TN_g | The number of entities of the group with score = 0 and label_value = 0.
False Discovery Rate | FDR_g = FP_g / PP_g | The fraction of false positives of a group within the predicted positive of the group.
False Omission Rate | FOR_g = FN_g / PN_g | The fraction of false negatives of a group within the predicted negative of the group.
False Positive Rate | FPR_g = FP_g / LN_g | The fraction of false positives of a group within the labeled negative (LN_g) of the group.
False Negative Rate | FNR_g = FN_g / LP_g | The fraction of false negatives of a group within the labeled positive (LP_g) of the group.
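To make the definitions above concrete, here is a minimal sketch that computes several of these metrics per group from raw (score, label_value, group) rows. It uses only the Python standard library and is not how Aequitas computes them internally; the helper name group_metrics and the toy rows are made up for illustration, and scores and labels are assumed to be binary.

```python
from collections import defaultdict

def group_metrics(rows):
    """rows: iterable of (score, label_value, group) with binary score and label."""
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
    for score, label, group in rows:
        if score == 1 and label == 1:
            counts[group]["tp"] += 1
        elif score == 1 and label == 0:
            counts[group]["fp"] += 1
        elif score == 0 and label == 0:
            counts[group]["tn"] += 1
        else:  # score == 0 and label == 1 under the binary assumption
            counts[group]["fn"] += 1

    # Total predicted positive across all groups of the attribute
    k = sum(c["tp"] + c["fp"] for c in counts.values())

    metrics = {}
    for group, c in counts.items():
        pp = c["tp"] + c["fp"]   # predicted positive
        pn = c["tn"] + c["fn"]   # predicted negative
        ln = c["fp"] + c["tn"]   # labeled negative
        lp = c["tp"] + c["fn"]   # labeled positive
        metrics[group] = {
            "pp": pp,
            "pn": pn,
            "pprev": pp / (pp + pn),
            "ppr": pp / k if k else None,
            "fdr": c["fp"] / pp if pp else None,
            "for": c["fn"] / pn if pn else None,
            "fpr": c["fp"] / ln if ln else None,
            "fnr": c["fn"] / lp if lp else None,
        }
    return metrics

# Made-up toy data: (score, label_value, group)
rows = [
    (1, 1, "A"), (1, 0, "A"), (0, 0, "A"), (0, 1, "A"),
    (1, 0, "B"), (1, 0, "B"), (0, 0, "B"), (1, 1, "B"),
]
metrics = group_metrics(rows)
print(metrics["A"]["fpr"], metrics["B"]["fpr"])
```

Rates whose denominator is zero are returned as None here; that is a choice made for this sketch only.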

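Each bias disparity for a given group is calculated as the ratio of the group's metric to the same metric for the reference group:

Disparity(group) = metric(group) / metric(reference group)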
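As a small illustration, assume a disparity is the ratio between a group's metric and the reference group's value of the same metric. The sketch below applies that ratio to a dictionary of False Positive Rates. The helper name disparities, the group names, and the rates are all hypothetical.

```python
def disparities(metric_by_group, ref_group):
    """Divide every group's metric by the reference group's metric."""
    ref_value = metric_by_group[ref_group]
    if ref_value == 0:
        raise ValueError("reference group metric is zero; ratio is undefined")
    return {group: value / ref_value for group, value in metric_by_group.items()}

# Made-up False Positive Rates per group
fpr = {"A": 0.20, "B": 0.40, "C": 0.25}
fpr_disparity = disparities(fpr, ref_group="A")
print(fpr_disparity)
```

The reference group always gets a disparity of 1, so values above 1 mean the group's rate is higher than the reference group's rate.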

30 Seconds to Aequitas

Python API

Detailed instructions are here.

To get started, preprocess your input data. Input data has slightly different requirements depending on whether you are using Aequitas via the webapp, CLI or Python package. See general input requirements and specific requirements for the web app, CLI, and Python API in the section immediately below.

If you plan to bin or discretize continuous features manually, note that get_crosstabs() expects attribute columns to be of type 'string', so don't forget to recast any 'categorical' type columns!

    from aequitas.preprocessing import preprocess_input_df
    
    # df is your input pandas DataFrame; double-check that categorical columns are of type 'string'
    df['categorical_column_name'] = df['categorical_column_name'].astype(str)
    
    # the second return value is ignored here
    df, _ = preprocess_input_df(df)

The Aequitas Group() class creates a crosstab of your preprocessed data, calculating absolute group metrics from score and label value truth status (true/false positives and true/false negatives).

    from aequitas.group import Group
    
    g = Group()
    xtab, _ = g.get_crosstabs(df)

The Plot() class can visualize a single group metric with plot_group_metric(), or a list of bias metrics with plot_group_metric_all(). Suppose you are interested in False Positive Rate across groups. We can visualize this metric in Aequitas:

    from aequitas.plotting import Plot
    
    aqp = Plot()
    fpr_plot = aqp.plot_group_metric(xtab, 'fpr')

There are some very small groups in this data set, for example 18 and 32 samples in the Native American and Asian population groups, respectively.

Aequitas includes an option to filter out groups under a minimum group size threshold, as very small group size may be a contributing factor in model error rates:

    from aequitas.plotting import Plot
    
    aqp = Plot()
    fpr_plot = aqp.plot_group_metric(xtab, 'fpr', min_group_size=0.05)
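To show what such a filter does, here is a minimal sketch that assumes min_group_size is a proportion of the total sample, so 0.05 keeps only groups that make up at least 5% of all rows. That reading, the helper name filter_small_groups, and the group sizes are assumptions for illustration. How Aequitas treats a group that sits exactly on the threshold is not covered here.

```python
def filter_small_groups(group_sizes, min_group_size):
    """Keep groups whose share of the total sample is at least min_group_size."""
    total = sum(group_sizes.values())
    return {
        group: size
        for group, size in group_sizes.items()
        if size / total >= min_group_size
    }

# Made-up group sizes; C and D are 3% and 2% of the sample
sizes = {"A": 500, "B": 450, "C": 30, "D": 20}
kept = filter_small_groups(sizes, min_group_size=0.05)
print(sorted(kept))
```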

The crosstab dataframe is augmented by every succeeding class with additional layers of information about biases, starting with bias disparities in the Bias() class. There are three get_disparity functions, one for each of the three ways to select a reference group. get_disparity_min_metric() and get_disparity_major_group() methods calculate a reference group automatically based on your data, while the user specifies reference groups for get_disparity_predefined_groups().

    from aequitas.bias import Bias
    
    b = Bias()
    bdf = b.get_disparity_predefined_groups(xtab, 
                        original_df=df, 
                        ref_groups_dict={'race':'Caucasian', 'sex':'Male', 'age_cat':'25 - 45'}, 
                        alpha=0.05, 
                        check_significance=False)

Learn more about reference group selection.
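The three ways of choosing a reference group can be sketched in plain Python. This is only an illustration and it assumes that "major group" means the group with the largest population and that "min metric" means the group with the lowest value of the metric in question. The helper pick_reference_group and the numbers are hypothetical and are not part of the Aequitas API.

```python
def pick_reference_group(strategy, group_sizes, metric_by_group, predefined=None):
    """Return the name of the reference group under one of three strategies."""
    if strategy == "predefined":
        if predefined not in group_sizes:
            raise ValueError(f"unknown group: {predefined}")
        return predefined
    if strategy == "major_group":
        # Largest population becomes the reference
        return max(group_sizes, key=group_sizes.get)
    if strategy == "min_metric":
        # Lowest metric value becomes the reference
        return min(metric_by_group, key=metric_by_group.get)
    raise ValueError(f"unknown strategy: {strategy}")

# Made-up group sizes and False Positive Rates
group_sizes = {"A": 3200, "B": 2100, "C": 400}
fpr = {"A": 0.45, "B": 0.23, "C": 0.30}

ref_predefined = pick_reference_group("predefined", group_sizes, fpr, predefined="C")
ref_major = pick_reference_group("major_group", group_sizes, fpr)
ref_min = pick_reference_group("min_metric", group_sizes, fpr)
print(ref_predefined, ref_major, ref_min)
```

Note that the three strategies can pick three different groups for the same data, which changes every disparity computed against the reference.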

The Plot() class visualizes disparities as treemaps, colored according to the disparity relationship between a given group and the reference group. Use plot_disparity() for a single disparity or plot_disparity_all() for several. Color saturation is determined by a given fairness threshold.

Let's look at False Positive Rate Disparity.

    fpr_disparity = aqp.plot_disparity(bdf, group_metric='fpr_disparity', 
                                       attribute_name='race')

Now you're ready to obtain metric parities with the Fairness() class:

    from aequitas.fairness import Fairness
    
    f = Fairness()
    fdf = f.get_group_value_fairness(bdf)

You now have parity determinations for your models that can be leveraged in model selection! If a specific bias metric for a group falls within a given percentage (based on the fairness threshold) of the reference group, the fairness determination is 'True.'
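As a rough sketch of how such a determination could work, assume a fairness threshold tau between 0 and 1 and call a disparity fair when it lies between tau and 1/tau. Both that rule and the value tau = 0.8 are assumptions made for this example, and is_fair is a hypothetical helper rather than an Aequitas function.

```python
def is_fair(disparity, tau=0.8):
    """True when the disparity lies within [tau, 1 / tau]."""
    if not 0 < tau <= 1:
        raise ValueError("tau must be in (0, 1]")
    return tau <= disparity <= 1 / tau

# Made-up disparities relative to a reference group (whose disparity is 1.0)
fpr_disparity = {"A": 1.0, "B": 2.0, "C": 1.1, "D": 0.5}
fpr_parity = {group: is_fair(value) for group, value in fpr_disparity.items()}
print(fpr_parity)
```

With tau = 0.8 the accepted range is 0.8 to 1.25, so a group whose False Positive Rate is twice the reference group's rate would be flagged as unfair.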

To determine whether group False Positive Rates fall within the "fair" range, use the Plot() class fairness methods:

    fpr_fairness = aqp.plot_fairness_group(fdf, group_metric='fpr', title=True)

To quickly review False Positive Rate Disparity fairness determinations, we can use the Plot() class plot_fairness_disparity() method:

    fpr_disparity_fairness = aqp.plot_fairness_disparity(fdf, group_metric='fpr', attribute_name='race')

Input Data

In general, input data is a single table with the following columns:

  • score
  • label_value (for error-based metrics only)
  • at least one attribute e.g. race, sex and age_cat (attribute categories defined by user)

score | label_value | race | sex | age | income
--- | --- | --- | --- | --- | ---
0 | 1 | African-American | Female | 27 | 18000
1 | 1 | Caucasian | Male | 32 |

Input data for Webapp

The webapp requires a single CSV with columns for a binary score, a binary label_value and an arbitrary number of attribute columns. Each row is associated with a single observation.
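Before uploading, it can help to sanity-check a CSV against these requirements. The following is a minimal standard-library sketch, not part of Aequitas: check_webapp_csv is a hypothetical helper, the sample rows are made up, and the reserved column names are the ones listed in this README.

```python
import csv
import io

# Reserved column names from this README; they are not treated as attributes.
RESERVED = {"id", "model_id", "entity_id", "rank_abs", "rank_pct"}
REQUIRED = ("score", "label_value")

def check_webapp_csv(csv_text):
    """Return the attribute column names, or raise ValueError on bad input."""
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = reader.fieldnames or []
    for col in REQUIRED:
        if col not in columns:
            raise ValueError(f"missing required column: {col}")
    attributes = [c for c in columns if c not in RESERVED and c not in REQUIRED]
    if not attributes:
        raise ValueError("at least one attribute column is required")
    # The header is line 1, so data rows start at line 2
    for line_no, row in enumerate(reader, start=2):
        for col in REQUIRED:
            if row[col] not in ("0", "1"):
                raise ValueError(f"line {line_no}: {col} must be 0 or 1")
    return attributes

sample = (
    "score,label_value,race,sex\n"
    "0,1,African-American,Female\n"
    "1,1,Caucasian,Male\n"
)
attributes = check_webapp_csv(sample)
print(attributes)
```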

score

Aequitas webapp assumes the score column is a binary decision (0 or 1).

label_value

This is the ground truth value of a binary decision. Again, the data must be binary (0 or 1).

attributes (e.g. race, sex, age, income)

Group columns can be categorical or continuous. If categorical, Aequitas will produce crosstabs with bias metrics for each group_level. If continuous, Aequitas will first bin the data into quartiles and then create crosstabs with the newly defined categories.
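To illustrate what quartile binning of a continuous attribute looks like, here is a minimal sketch using the standard library. It is an approximation for illustration only: Aequitas may place bin edges and name bins differently, and quartile_bins, the labels Q1 to Q4, and the ages are made up.

```python
import statistics
from bisect import bisect_left

def quartile_bins(values):
    """Assign each value to one of four quartile bins, returned as strings."""
    cuts = statistics.quantiles(values, n=4)  # three cut points
    labels = ["Q1", "Q2", "Q3", "Q4"]
    # bisect_left counts how many cut points lie strictly below the value
    return [labels[bisect_left(cuts, value)] for value in values]

# Made-up ages
ages = [18, 22, 27, 32, 41, 45, 53, 60]
age_bins = quartile_bins(ages)
print(list(zip(ages, age_bins)))
```

The bins come back as plain strings, which matches the expectation elsewhere in this README that attribute columns hold string values.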

Input data for CLI

The CLI accepts CSV files and accommodates database calls defined in Configuration files.

score

By default, Aequitas CLI assumes the score column is a binary decision (0 or 1). Alternatively, the score column can contain the score (e.g. the output from a logistic regression applied to the data). In this case, the user sets a threshold to determine the binary decision. See configurations for more on thresholds.

label_value

As with the webapp, this is the ground truth value of a binary decision. The data must be binary 0 or 1.

attributes (e.g. race, sex, age, income)

Group columns can be categorical or continuous. If categorical, Aequitas will produce crosstabs with bias metrics for each group value. If continuous, Aequitas will first bin the data into quartiles.

model_id

model_id is an identifier tied to the output of a specific model. With a model_id column you can test the bias of multiple models at once. This feature is available using the CLI or the Python package.

Reserved column names:
  • id
  • model_id
  • entity_id
  • rank_abs
  • rank_pct

Input data for Python API

Python input data can be handled identically to CLI by using preprocess_input_df(). Otherwise, you must discretize continuous attribute columns prior to passing the data to Group().get_crosstabs().

    from aequitas.preprocessing import preprocess_input_df
    # input_data is a placeholder for a DataFrame that matches CLI input data norms.
    df, _ = preprocess_input_df(input_data)

score

By default, Aequitas assumes the score column is a binary decision (0 or 1). If the score column contains a non-binary score (e.g. the output from a logistic regression applied to the data), the user sets a threshold to determine the binary decision. Thresholds are set in a dictionary passed to get_crosstabs() of format {'rank_abs':[300] , 'rank_pct':[1.0, 5.0, 10.0]}. See configurations for more on thresholds.
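One plausible reading of these thresholds is that rank_abs marks the top k entities by score as positive and rank_pct marks the top p percent. The sketch below turns continuous scores into binary decisions under that reading. The interpretation, the handling of ties, the helper names, and the scores are all assumptions for illustration and may not match what get_crosstabs() does.

```python
import math

def top_k_decisions(scores, k):
    """Mark the k highest-scoring entities as 1 and all others as 0."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    decisions = [0] * len(scores)
    for i in order[:k]:
        decisions[i] = 1
    return decisions

def top_pct_decisions(scores, pct):
    """Mark the top pct percent of entities (rounded down) as 1."""
    k = math.floor(len(scores) * pct / 100)
    return top_k_decisions(scores, k)

# Made-up model scores for ten entities
scores = [0.91, 0.15, 0.67, 0.42, 0.88, 0.05, 0.73, 0.30, 0.59, 0.21]
by_abs = top_k_decisions(scores, 3)       # analogous to {'rank_abs': [3]}
by_pct = top_pct_decisions(scores, 20.0)  # analogous to {'rank_pct': [20.0]}
print(by_abs)
print(by_pct)
```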

label_value

This is the ground truth value of a binary decision. The data must be binary (0 or 1).

attributes (e.g. race, sex, age, income)

Group columns can be categorical or continuous. If categorical, Aequitas will produce crosstabs with bias metrics for each group_level. If continuous, Aequitas will first bin the data into quartiles.

If you plan to bin or discretize continuous features manually, note that get_crosstabs() expects attribute columns to be of type 'string'. This excludes the pandas 'categorical' data type, which is the default output of certain pandas discretizing functions. You can recast 'categorical' columns to strings:

    df['categorical_column_name'] = df['categorical_column_name'].astype(str)

model_id

model_id is an identifier tied to the output of a specific model. With a model_id column you can test the bias of multiple models at once. This feature is available using the CLI or the Python package.

Reserved column names:
  • id
  • model_id
  • entity_id
  • rank_abs
  • rank_pct

Development

Provision your development environment via the shell script develop:

./develop

Common development tasks, such as deploying the webapp, may then be handled via manage:

manage --help

Citing Aequitas

If you use Aequitas in a scientific publication, we would appreciate citations to the following paper:

Pedro Saleiro, Benedict Kuester, Abby Stevens, Ari Anisfeld, Loren Hinkson, Jesse London, Rayid Ghani, Aequitas: A Bias and Fairness Audit Toolkit, arXiv preprint arXiv:1811.05577 (2018). (PDF)

   @article{2018aequitas,
     title={Aequitas: A Bias and Fairness Audit Toolkit},
     author={Saleiro, Pedro and Kuester, Benedict and Stevens, Abby and Anisfeld, Ari and Hinkson, Loren and London, Jesse and Ghani, Rayid},
     journal={arXiv preprint arXiv:1811.05577},
     year={2018}
   }