Helper tools for constructing probability distributions from expert-elicited data, for use in Monte Carlo simulations.

Last update: Nov 04, 2022

Elicited

Credit to Brett Hoover, packaging by @magoo

Usage

pip install elicited

import elicited as e

elicited is only a helper for working with numpy and scipy, so your code will need to import those as well.

import numpy as np
from scipy.stats import poisson, zipf, beta, pareto, lognorm

Lognormal

See Occurrence and Applications for examples of lognormal distributions in nature.

Expert: Most customers hold around $20K (mode) but I could imagine a customer with $2.5M (max)

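val_mod = 20000    # "mode": the most typical balance
val_max = 2500000  # "max": the largest balance the expert can imagine

# mean and stdv describe the underlying normal (log scale), hence np.exp(mean) below
mean, stdv = e.elicitLogNormal(val_mod, val_max)
asset_values = lognorm(s=stdv, scale=np.exp(mean))
asset_values.rvs(100)  # draw 100 sample balances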
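One way to see how a mode and a "max" can pin down a lognormal is to treat the max as an upper quantile. The sketch below assumes the max is the 95th percentile. That choice, and the function name `lognormal_from_mode_and_quantile`, are illustrative assumptions and not necessarily what `e.elicitLogNormal` does internally. It uses only the standard library.

```python
import math
from statistics import NormalDist


def lognormal_from_mode_and_quantile(mode, upper, quantile=0.95):
    """Return (mu, sigma) of ln(X) so that X has the given mode and
    P(X <= upper) == quantile. Hypothetical helper for illustration."""
    if not 0 < mode < upper:
        raise ValueError("expected 0 < mode < upper")
    if not 0 < quantile < 1:
        raise ValueError("quantile must be in (0, 1)")
    z = NormalDist().inv_cdf(quantile)
    # mode  = exp(mu - sigma**2)
    # upper = exp(mu + z * sigma)
    # Subtracting the logs gives: sigma**2 + z * sigma - ln(upper / mode) = 0
    log_ratio = math.log(upper / mode)
    sigma = (-z + math.sqrt(z * z + 4.0 * log_ratio)) / 2.0
    mu = math.log(mode) + sigma ** 2
    return mu, sigma


mu, sigma = lognormal_from_mode_and_quantile(20_000, 2_500_000)

# Recover the inputs from the fitted parameters as a sanity check
implied_mode = math.exp(mu - sigma ** 2)
implied_p95 = math.exp(mu + NormalDist().inv_cdf(0.95) * sigma)
```

A pair like this maps onto scipy as lognorm(s=sigma, scale=math.exp(mu)), which has the same form as the lognorm call above.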

Pareto

The 80/20 rule. See Occurrence and Applications.

Expert: The legal costs of an incident could be devastating. Typically costs are almost zero (val_min) but a black swan could be $100M (val_max).

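val_min = 1          # "almost zero" (assumed value for illustration)
val_max = 100000000  # the $100M black swan

b = e.elicitPareto(val_min, val_max)
p = pareto(b, loc=val_min-1., scale=1.)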
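To build intuition for how a minimum and an extreme value can determine a Pareto shape parameter, the sketch below treats val_max as the 95th percentile of a Pareto that starts at val_min. Both the quantile and the helper name `pareto_shape_from_range` are assumptions made for illustration, as is val_min = 1, and `e.elicitPareto` may work differently. Only the standard library is used.

```python
import math
import random


def pareto_shape_from_range(val_min, val_max, quantile=0.95):
    """Shape b such that a Pareto starting at val_min has
    P(X <= val_max) == quantile. Hypothetical helper for illustration."""
    if val_max <= val_min:
        raise ValueError("val_max must exceed val_min")
    if not 0 < quantile < 1:
        raise ValueError("quantile must be in (0, 1)")
    # With loc = val_min - 1 and scale = 1: P(X > x) = (x - val_min + 1) ** -b
    return -math.log(1.0 - quantile) / math.log(val_max - val_min + 1.0)


val_min = 1            # assumed stand-in for "almost zero"
val_max = 100_000_000  # $100M from the expert statement
b = pareto_shape_from_range(val_min, val_max)

# Tail probability implied at val_max (equals 1 - quantile by construction)
tail = (val_max - val_min + 1.0) ** -b

# random.paretovariate(b) samples x >= 1; shift so samples start at val_min
rng = random.Random(0)
costs = [rng.paretovariate(b) + (val_min - 1) for _ in range(5)]
```

A shape value well below 1 signals a very heavy tail, so sample means from such a distribution will be unstable from run to run.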

PERT

See PERT Distribution

Expert: Our customers have anywhere from $500-$6000 (val_min / val_max), but it's most typically around $4500 (val_mod)

val_min = 500
val_mod = 4500
val_max = 6000

PERT_a, PERT_b = e.elicitPERT(val_min, val_mod, val_max)
pert = beta(PERT_a, PERT_b, loc=val_min, scale=val_max-val_min)
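The PERT distribution is a rescaled beta distribution, and its shape parameters have a simple closed form. The sketch below uses the conventional weight of 4 on the mode. `pert_shape_params` is a hypothetical name, and `e.elicitPERT` may differ in its details. It relies only on the standard library.

```python
import random


def pert_shape_params(val_min, val_mod, val_max, lam=4.0):
    """Beta shape parameters for the standard PERT distribution.
    lam=4 is the conventional weight on the mode."""
    if not (val_min <= val_mod <= val_max) or val_min == val_max:
        raise ValueError("expected val_min <= val_mod <= val_max, val_min < val_max")
    width = val_max - val_min
    alpha = 1.0 + lam * (val_mod - val_min) / width
    beta_ = 1.0 + lam * (val_max - val_mod) / width
    return alpha, beta_


val_min, val_mod, val_max = 500, 4500, 6000
alpha, beta_ = pert_shape_params(val_min, val_mod, val_max)

# Closed-form PERT mean: (min + 4 * mode + max) / 6
pert_mean = (val_min + 4 * val_mod + val_max) / 6

# Sample by rescaling a Beta(alpha, beta_) draw from [0, 1] to [min, max]
rng = random.Random(0)
balances = [val_min + (val_max - val_min) * rng.betavariate(alpha, beta_)
            for _ in range(1000)]
```

The rescaling step mirrors the loc=val_min, scale=val_max-val_min arguments passed to scipy's beta above.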

Zipf's

See Applications

Expert: If we get sued, there will only be a few litigants (nMin). Very rarely it could be 30 or more litigants (nMax), maybe once every thousand cases (pMax) it would be more.

nMin = 1
nMax = 30
pMax = 1/1000

Zs = e.elicitZipf(nMin, nMax, pMax, report=True)

litigants = zipf(Zs, nMin-1)

litigants.rvs(100)

Reference: Other Useful Elicitations

These distributions are listed as a courtesy. They are simple enough to parameterize directly from elicited values, without a helper function.

Uniform

A "zero knowledge" distribution where all values within the range are equally likely. Similar to random.uniform(a, b): scipy's uniform is continuous, so samples are floats rather than the whole numbers random.randint(a, b) returns.

Expert: The crowd will be between 50 (min) and 500 (max) due to fire code restrictions and the existing residents in the building.

from scipy.stats import uniform

low = 50    # min
high = 500  # max

width = high - low  # scipy's uniform takes (loc, scale), i.e. (start, width)

crowd_size = uniform(low, width)
crowd_size.rvs(100)

Poisson

Expert: About 3000 customers (average) add a credit card to their account every quarter.

from scipy.stats import poisson
average = 3000
upsells = poisson(average)
upsells.rvs(100)
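Once a distribution is frozen, a Monte Carlo run is mostly a matter of drawing many samples and summarizing them. The sketch below reuses the Poisson example. The trial count, the seed, and the choice of percentiles are illustrative assumptions.

```python
import numpy as np
from scipy.stats import poisson

rng = np.random.default_rng(42)  # seeded so the run is repeatable
trials = 10_000                  # illustrative number of simulated quarters

upsells = poisson(3000)
samples = upsells.rvs(size=trials, random_state=rng)

# Summarize the simulated quarters with a 90% interval and the median
low, median, high = np.percentile(samples, [5, 50, 95])
print(f"5th pct: {low:.0f}, median: {median:.0f}, 95th pct: {high:.0f}")
```

The same pattern applies to any of the frozen distributions above: pass size and random_state to rvs, then summarize the resulting array.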

Owner

Ryan McGeehan
