Helper tools to construct probability distributions from expert-elicited data for use in Monte Carlo simulations.

Last update: Nov 04, 2022

Elicited

Helper tools to construct probability distributions from expert-elicited data for use in Monte Carlo simulations.

Credit to Brett Hoover, packaging by @magoo

Usage

pip install elicited

import elicited as e

elicited is a helper for use alongside numpy and scipy: it returns distribution parameters that you pass to scipy.stats, so you'll need both in your code.

import numpy as np
from scipy.stats import poisson, zipf, beta, pareto, lognorm

Lognormal

See Occurrence and Applications for examples of lognormal distributions in nature.

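Expert: Most customers hold around $20K (val_mod) but I could imagine a customer with $2.5M (val_max)

# Named val_mod / val_max so the built-in max() is not shadowed
val_mod = 20000
val_max = 2500000

mean, stdv = e.elicitLogNormal(val_mod, val_max)
# scipy's lognorm takes s = sigma and scale = exp(mu) of the underlying normal
asset_values = lognorm(s=stdv, scale=np.exp(mean))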
asset_values.rvs(100)
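
For intuition, here is a minimal sketch of how a mode and a maximum could be turned into lognormal parameters. It assumes the stated maximum is read as a high quantile (the 95th percentile here); that reading, the function name, and the default are assumptions for illustration and may not match what elicitLogNormal does internally.

```python
import math
from statistics import NormalDist


def lognormal_from_mode_and_quantile(mode, upper, p=0.95):
    """Return (mu, sigma) of ln(X) so that X has the given mode
    and P(X <= upper) = p. Hypothetical helper, not part of elicited."""
    z = NormalDist().inv_cdf(p)            # standard normal quantile for p
    log_ratio = math.log(upper / mode)     # requires upper > mode
    # mode = exp(mu - sigma^2) and upper = exp(mu + z*sigma) give
    # sigma^2 + z*sigma - log_ratio = 0; take the positive root.
    sigma = (-z + math.sqrt(z * z + 4.0 * log_ratio)) / 2.0
    mu = math.log(mode) + sigma ** 2
    return mu, sigma


mu, sigma = lognormal_from_mode_and_quantile(20_000, 2_500_000)
recovered_mode = math.exp(mu - sigma ** 2)
recovered_upper = math.exp(mu + NormalDist().inv_cdf(0.95) * sigma)
print(mu, sigma, recovered_mode, recovered_upper)
```

The two recovered values let you check that the fitted parameters reproduce the expert's inputs.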

Pareto

The 80/20 rule. See Occurrence and Applications

Expert: The legal costs of an incident could be devastating. Typically costs are almost zero (val_min) but a black swan could be $100M (val_max).

b = e.elicitPareto(val_min, val_max)
p = pareto(b, loc=val_min-1., scale=1.)
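
As a rough illustration of what a shape parameter means for this parameterization: with loc=val_min-1 and scale=1, the survival function at x is (x - val_min + 1) ** (-b). The sketch below picks b so that val_max sits at a chosen quantile. The 95th percentile, the function name, and val_min = 1 (a stand-in for "almost zero") are all assumptions, and elicitPareto may use a different rule.

```python
import math


def pareto_shape_from_quantile(val_min, val_max, p=0.95):
    """Shape b such that P(X <= val_max) = p for a Pareto shifted to
    start at val_min with unit scale. Hypothetical helper."""
    return -math.log(1.0 - p) / math.log(val_max - val_min + 1.0)


val_min = 1.0             # assumed stand-in for "almost zero"
val_max = 100_000_000.0   # the $100M black swan from the expert statement
b = pareto_shape_from_quantile(val_min, val_max)

# Probability of exceeding val_max under the fitted shape
tail = (val_max - val_min + 1.0) ** (-b)
print(b, tail)
```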

PERT

See PERT Distribution

Expert: Our customers have anywhere from $500-$6000 (val_min / val_max), but it's most typically around $4500 (val_mod)

val_min, val_mod, val_max = 500, 4500, 6000  # values from the expert statement above
PERT_a, PERT_b = e.elicitPERT(val_min, val_mod, val_max)
pert = beta(PERT_a, PERT_b, loc=val_min, scale=val_max-val_min)
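
For reference, the textbook PERT distribution maps (min, mode, max) to beta shape parameters as sketched below, using the conventional weight of 4 on the mode. This is the standard formula, not necessarily the exact computation inside elicitPERT; the function name is hypothetical.

```python
def pert_shape_params(val_min, val_mod, val_max, lam=4.0):
    """Standard PERT shape parameters for a beta distribution
    rescaled to [val_min, val_max]. Hypothetical helper."""
    span = val_max - val_min
    alpha = 1.0 + lam * (val_mod - val_min) / span
    beta_ = 1.0 + lam * (val_max - val_mod) / span
    return alpha, beta_


alpha, beta_ = pert_shape_params(500, 4500, 6000)

# Mean and mode of the rescaled beta distribution
pert_mean = 500 + (6000 - 500) * alpha / (alpha + beta_)
pert_mode = 500 + (6000 - 500) * (alpha - 1.0) / (alpha + beta_ - 2.0)
print(alpha, beta_, pert_mean, pert_mode)
```

With these parameters the mode lands on the expert's most typical value and the mean equals (min + 4 * mode + max) / 6.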

Zipf's

See Applications

Expert: If we get sued, there will only be a few litigants (nMin). Very rarely it could be 30 or more litigants (nMax), maybe once every thousand cases (pMax) it would be more.

nMin = 1
nMax = 30
pMax = 1/1000

Zs = e.elicitZipf(nMin, nMax, pMax, report=True)

litigants = zipf(Zs, nMin-1)

litigants.rvs(100)

Reference: Other Useful Elicitations

Listed as a courtesy: these distributions are simple enough to parameterize directly from elicited values, without a helper function.

Uniform

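A "zero knowledge" distribution where all values within the range have equal probability of appearing. Similar to random.uniform(a, b); scipy.stats.uniform is continuous, so samples are floats rather than the integers random.randint(a, b) returns.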

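Expert: The crowd will be between 50 (val_min) and 500 (val_max) due to fire code restrictions and the existing residents in the building.

from scipy.stats import uniform

# Named val_min / val_max so the built-ins min(), max() and range() are not shadowed
val_min = 50
val_max = 500

# scipy's uniform takes (loc, scale) and covers [loc, loc + scale]
crowd_size = uniform(val_min, val_max - val_min)
crowd_size.rvs(100)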

Poisson

Expert: About 3000 Customers (average) add a credit card to their account every quarter.

from scipy.stats import poisson
average = 3000
upsells = poisson(average)
upsells.rvs(100)
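
Frozen scipy distributions like the ones above are the building blocks of a Monte Carlo simulation. The sketch below combines a Poisson count of incidents with a lognormal loss per incident to estimate an annual loss. All parameter values and variable names here are made up for illustration; in practice they would come from an elicitation.

```python
import numpy as np
from scipy.stats import lognorm, poisson

rng = np.random.default_rng(42)  # fixed seed so the run is repeatable
n_trials = 10_000

# Hypothetical parameters, standing in for values an elicitation would supply.
incidents_per_year = poisson(3)                          # assumed average of 3
loss_per_incident = lognorm(s=1.0, scale=np.exp(10.0))   # assumed log-mean 10, log-sd 1

# Draw the number of incidents for every simulated year.
counts = incidents_per_year.rvs(n_trials, random_state=rng)

# Draw one loss per incident, then add them up within each simulated year.
losses = loss_per_incident.rvs(int(counts.sum()), random_state=rng)
year_index = np.repeat(np.arange(n_trials), counts)
annual_loss = np.bincount(year_index, weights=losses, minlength=n_trials)

print("mean annual loss:", annual_loss.mean())
print("95th percentile:", np.percentile(annual_loss, 95))
```

Drawing all losses in one call and summing them with np.bincount avoids a Python loop over trials and handles years with zero incidents.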

Owner

Ryan McGeehan
