Overview

Stability Audit

This repository contains code used to audit the stability of personality predictions made by two algorithmic hiring systems, Humantic AI and Crystal. This codebase supports the 2021 manuscript entitled "External Stability Auditing to Test the Validity of Personality Prediction in AI Hiring," authored by Alene K. Rhea, Kelsey Markey, Lauren D'Arinzo, Hilke Schellmann, Mona Sloane, Paul Squires, and Julia Stoyanovich.

Code

The Jupyter notebook analysis.ipynb reads in the survey and system output data, and performs all stability analysis. The notebook begins with a demographic summarization, and then estimates stability metrics for each facet experiment as described in the manuscript.

Spearman's rank correlation is used to measure rank-order stability, two-tailed Wilcoxon signed rank testing is used to measure locational stability, and normalized L1 distance is used to measure total change across each facet. Medians of each facet treatment are estimated as well. Results are saved to the results directory, organized by metric and by system (Humantic AI and Crystal). Subgroup analysis is performed for rank-order stability and total change. Highlighting is employed to indicate correlations below 0.95 and 0.90, and Wilcoxon p-values below the Bonferroni and Benjamini-Hochberg corrected thresholds. Scatterplots are produced to compare the outputs from each pair of facet treatments. Boxplots illustrate total change. Boxplots comparing relevant subgroup analysis for each facet are produced as well.
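The three metrics and the two corrected thresholds described above can be sketched as follows. This is a minimal illustration, not the code in analysis.ipynb: the function names, the `score_range` of 100, the `alpha` of 0.05, and the choice to normalize L1 distance by the number of traits times the score range are all assumptions made for the example.

```python
import numpy as np
from scipy.stats import spearmanr, wilcoxon


def paired_stability(a, b):
    """Rank-order and locational stability for one trait under two treatments.

    a and b hold one score per participant, in the same participant order.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    rho, _ = spearmanr(a, b)                              # rank-order stability
    _, p_value = wilcoxon(a, b, alternative="two-sided")  # locational stability
    return {
        "spearman_rho": float(rho),
        "wilcoxon_p": float(p_value),
        "median_a": float(np.median(a)),
        "median_b": float(np.median(b)),
    }


def normalized_l1(scores_a, scores_b, score_range=100.0):
    """Total change per participant across all traits.

    Rows are participants, columns are traits. Dividing by
    (number of traits * score_range) is an assumed normalization.
    """
    scores_a = np.asarray(scores_a, dtype=float)
    scores_b = np.asarray(scores_b, dtype=float)
    n_traits = scores_a.shape[1]
    return np.abs(scores_a - scores_b).sum(axis=1) / (n_traits * score_range)


def corrected_thresholds(p_values, alpha=0.05):
    """Bonferroni threshold and Benjamini-Hochberg cutoff for a family of tests."""
    p_sorted = np.sort(np.asarray(p_values, dtype=float))
    m = p_sorted.size
    bonferroni = alpha / m
    # BH: largest sorted p-value p_(k) with p_(k) <= (k / m) * alpha.
    passing = p_sorted[p_sorted <= alpha * np.arange(1, m + 1) / m]
    bh_cutoff = float(passing.max()) if passing.size else 0.0
    return bonferroni, bh_cutoff
```

Under these assumptions, a p-value would be highlighted when it falls below `bonferroni`, or at or below `bh_cutoff`. A `bh_cutoff` of 0.0 means no test in the family passes the Benjamini-Hochberg procedure.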

Data

Survey

Anonymized survey results are saved in data/survey.csv. The columns are described in the table below.

| Column | Type | Description | Values |
| --- | --- | --- | --- |
| Participant_ID | str | Unique ID used to identify the participant. | "ID2" - "ID101" (missing IDs indicate potential subjects who were screened out of participation) |
| gender | str | Participant gender, as reported in the survey. Pre-processed to mask rare responses in order to preserve anonymity. | ["Male" "Female" "Other Gender"] |
| race | str | Participant race, as reported in the survey. Pre-processed to mask rare responses in order to preserve anonymity. Empty entries indicate participants declined to self-identify their race in the survey. | ["Asian" "White" "Other Race" NaN] |
| birth_country | str | Participant birth country, as reported in the survey. Pre-processed to mask rare responses in order to preserve anonymity. Empty entries indicate participants declined to provide their birth country in the survey. | ["China" "India" "USA" "Other Country" NaN] |
| primary_language | str | Primary language of the participant, as reported in the survey. | ["English" "Other Langauge"] |
| resume | bool | Boolean flag indicating whether the participant provided a resume in the survey. | ["True" "False"] |
| linkedin | bool | Boolean flag indicating whether the participant provided a LinkedIn profile in the survey. | ["True" "False"] |
| twitter | bool | Boolean flag indicating whether the participant provided a public Twitter handle in the survey. | ["True" "False"] |
| linkedin_in_orig_resume | bool | Boolean flag indicating whether the participant included a reference to their LinkedIn in the resume they submitted. Empty entries indicate participants did not submit a resume. | ["True" "False" NaN] |
| orig_embed_type | str | Description of the method by which the participant referenced their LinkedIn in their submitted resume. Empty entries indicate the participant did not submit a resume containing a reference to LinkedIn. | ["Full url hyperlinked" "Full url not hyperlinked" "Text hyperlinked" "Other not hyperlinked" NaN] |
| orig_file_type | str | File type of the resume submitted by the participant. Empty entries indicate participants did not submit a resume. | ["pdf" "docx" "txt" NaN] |
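As a rough sketch of how the survey file might be read and summarized with pandas: the function names below are hypothetical and not taken from analysis.ipynb. Only the file path and column names come from the table above.

```python
import pandas as pd

# Demographic columns listed in the survey table above.
DEMOGRAPHIC_COLUMNS = ["gender", "race", "birth_country", "primary_language"]


def load_survey(source="data/survey.csv"):
    """Read the survey table; `source` may be a path or a file-like object."""
    return pd.read_csv(source, dtype={"Participant_ID": str})


def summarize_demographics(survey, columns=DEMOGRAPHIC_COLUMNS):
    """Count responses per category, keeping NaN (declined to answer) as its own row."""
    return {col: survey[col].value_counts(dropna=False) for col in columns}
```

Passing `dropna=False` keeps participants who declined to answer visible in the counts instead of silently dropping them.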

Humantic AI and Crystal Output

Output from Humantic AI and Crystal is saved in the data directory. Each run is saved as a CSV and is named with its Run ID. Tables 3 and 4 in the manuscript (reproduced below) provide details of each run. Each file contains one row for each submitted input. Participant_ID provides a unique key, and output_success is a Boolean flag indicating that the system successfully produced output from the given input. Wherever output_success is true, there will be numeric predictions for each trait. Crystal results contain predictions for DiSC traits, and Humantic AI results contain predictions for DiSC traits and Big Five traits.
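Because Participant_ID is the key and output_success marks usable rows, two runs can be paired for comparison along the following lines. This is a sketch under assumptions: `pair_runs` is a hypothetical helper, `trait_columns` stands in for whatever trait column names the files actually use, and output_success is assumed to load as a Boolean column.

```python
import pandas as pd


def pair_runs(run_a, run_b, trait_columns):
    """Pair two runs by participant, keeping only inputs that succeeded in both."""
    keep = ["Participant_ID"] + list(trait_columns)
    # Assumes output_success has bool dtype, so it can be used directly as a mask.
    ok_a = run_a.loc[run_a["output_success"], keep]
    ok_b = run_b.loc[run_b["output_success"], keep]
    # Inner join: participants missing from either run are dropped.
    return ok_a.merge(ok_b, on="Participant_ID", suffixes=("_a", "_b"))


# Hypothetical usage; the exact file names and trait column names are assumptions:
# paired = pair_runs(pd.read_csv("data/HRi1.csv"),
#                    pd.read_csv("data/HRi2.csv"),
#                    ["trait_x"])
```

The resulting frame has one row per participant, with `_a` and `_b` columns for each trait, which is the paired layout that rank-correlation and signed-rank tests expect.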

| Run ID | System | Description | Run Dates |
| --- | --- | --- | --- |
| HRo1 | Humantic AI | Original Resume | 11/23/2020 - 01/14/2021 |
| HRi1 | Humantic AI | De-Identified Resume | 03/20/2021 - 03/28/2021 |
| HRi2 | Humantic AI | De-Identified Resume | 04/20/2021 - 04/28/2021 |
| HRi3 | Humantic AI | De-Identified Resume | 04/20/2021 - 04/28/2021 |
| HRd1 | Humantic AI | DOCX Resume | 03/20/2021 - 03/28/2021 |
| HRu1 | Humantic AI | URL-Embedded Resume | 04/09/2021 - 04/11/2021 |
| HL1 | Humantic AI | LinkedIn | 11/23/2020 - 01/14/2021 |
| HL2 | Humantic AI | LinkedIn | 08/10/2021 - 08/11/2021 |
| HT1 | Humantic AI | Twitter | 11/23/2020 - 01/14/2021 |
| HT2 | Humantic AI | Twitter | 08/10/2021 - 08/11/2021 |
| CRr1 | Crystal | Raw Text Resume | 03/31/2021 - 04/02/2021 |
| CRr2 | Crystal | Raw Text Resume | 05/01/2021 - 05/03/2021 |
| CRr3 | Crystal | Raw Text Resume | 05/01/2021 - 05/03/2021 |
| CRp1 | Crystal | PDF Resume | 11/23/2020 - 01/14/2021 |
| CL1 | Crystal | LinkedIn | 11/23/2020 - 01/14/2021 |
| CL2 | Crystal | LinkedIn | 09/13/2020 - 09/16/2021 |