This repository contains code used to audit the stability of personality predictions made by two algorithmic hiring systems

Overview

Stability Audit

This repository contains code used to audit the stability of personality predictions made by two algorithmic hiring systems, Humantic AI and Crystal. This codebase supports the 2021 manuscript entitled "External Stability Auditing to Test the Validity of Personality Prediction in AI Hiring," authored by Alene K. Rhea, Kelsey Markey, Lauren D'Arinzo, Hilke Schellmann, Mona Sloane, Paul Squires, and Julia Stoyanovich.

Code

The Jupyter notebook analysis.ipynb reads in the survey and system output data, and performs all stability analysis. The notebook begins with a demographic summarization, and then estimates stability metrics for each facet experiment as described in the manuscript.

Spearman's rank correlation is used to measure rank-order stability, two-tailed Wilcoxon signed rank testing is used to measure locational stability, and normalized L1 distance is used to measure total change across each facet. Medians of each facet treatment are estimated as well. Results are saved to the results directory, organized by metric and by system (Humantic AI and Crystal). Subgroup analysis is performed for rank-order stability and total change. Highlighting is employed to indicate correlations below 0.95 and 0.90, and Wilcoxon p-values below the Bonferroni and Benjamini-Hochberg corrected thresholds. Scatterplots are produced to compare the outputs from each pair of facet treatments. Boxplots illustrate total change. Boxplots comparing relevant subgroup analysis for each facet are produced as well.

Data

Survey

Anonymized survey results are saved in data/survey.csv. Columns described in the table below.

Column Type Description Values
Participant_ID str Unique ID used to identify participant. "ID2" - "ID101" (missing IDs indicate potential subjects were screened out of participation)
gender str Participant gender, as reported in the survey. Pre-processed to mask rare responses in order to preserve anonymity. ["Male" "Female" "Other Gender"]
race str Participant race, as reported in the survey. Pre-processed to mask rare responses in order to preserve anonymity. Empty entries indicates participants declined to self-identify their race in the survey. ["Asian" "White" "Other Race" NaN]
birth_country str Participant birth country, as reported in the survey. Pre-processed to mask rare responses in order to preserve anonymity. Empty entries indicates participants declined to provide their birth country in the survey. ["China" "India" "USA" "Other Country" NaN]
primary_language str Primary language of participant, as reported in the survey. ["English" "Other Langauge"]
resume bool Boolean flag indicating whether participant provided a resume in the survey. ["True" "False"]
linkedin bool Boolean flag indicating whether participant provided a LinkedIn in the survey. ["True" "False"]
twitter bool Boolean flag indicating whether participant provided a public Twitter handle in the survey. ["True" "False"]
linkedin_in_orig_resume bool Boolean flag indicating whether participant included a reference to their LinkedIn in the resume they submitted. Empty entries indicate participants did not submit a resume. ["True" "False" NaN]
orig_embed_type str Description of the method by which the participant referenced their LinkedIn in their submitted resume. Empty entries indicate participant did not submit a resume containing a reference to LinkedIn. ["Full url hyperlinked" "Full url not hyperlinked" "Text hyperlinked" "Other not hyperlinked" NaN]
orig_file_type str Filetype of the resume submitted by the participant. Empty entries indicate participants did not submit a resume. ["pdf" "docx" "txt" NaN]

Humantic AI and Crystal Output

Output from Humantic AI and Crystal is saved in the data directory. Each run is saved as a CSV and is named with its Run ID. Tables 3 and 4 in the manuscript (reproduced below) provide details of each run. Each file contains one row for each submitted input. Participant_ID provides a unique key, and output_success is a Boolean flag indicating that the system successfully produced output from the given input. Wherever output_success is true, there will be numeric predictions for each trait. Crystal results contain predictions for DiSC traits, and Humantic AI results contain predictions for DiSC traits and Big Five traits.

Run ID System Description Run Dates
HRo1 Humantic AI Original Resume 11/23/2020 - 01/14/2021
HRi1 Humantic AI De-Identified Resume 03/20/2021 - 03/28/2021
HRi2 Humantic AI De-Identified Resume 04/20/2021 - 04/28/2021
HRi3 Humantic AI De-Identified Resume 04/20/2021 - 04/28/2021
HRd1 Humantic AI DOCX Resume 03/20/2021 - 03/28/2021
HRu1 Humantic AI URL-Embedded Resume 04/09/2021 - 04/11/2021
HL1 Humantic AI LinkedIn 11/23/2020 - 01/14/2021
HL2 Humantic AI LinkedIn 08/10/2021 - 08/11/2021
HT1 Humantic AI Twitter 11/23/2020 - 01/14/2021
HT2 Humantic AI Twitter 08/10/2021 - 08/11/2021
CRr1 Crystal Raw Text Resume 03/31/2021 - 04/02/2021
CRr2 Crystal Raw Text Resume 05/01/2021 - 05/03/2021
CRr3 Crystal Raw Text Resume 05/01/2021 - 05/03/2021
CRp1 Crystal PDF Resume 11/23/2020 - 01/14/2021
CL1 Crystal LinkedIn 11/23/2020 - 01/14/2021
CL2 Crystal LinkedIn 09/13/2020 - 09/16/2021
Owner
Data, Responsibly
responsible data management: platform and tools
Data, Responsibly
A repo that contains all the mesh keys needed for mesh backend, along with a code example of how to use them in python

Mesh-Keys A repo that contains all the mesh keys needed for mesh backend, along with a code example of how to use them in python Have been seeing alot

Joseph 53 Dec 13, 2022
YOLO-v5 기반 단안 카메라의 영상을 활용해 차간 거리를 일정하게 유지하며 주행하는 Adaptive Cruise Control 기능 구현

자율 주행차의 영상 기반 차간거리 유지 개발 Table of Contents 프로젝트 소개 주요 기능 시스템 구조 디렉토리 구조 결과 실행 방법 참조 팀원 프로젝트 소개 YOLO-v5 기반으로 단안 카메라의 영상을 활용해 차간 거리를 일정하게 유지하며 주행하는 Adap

14 Jun 29, 2022
Linescanning - Package for (pre)processing of anatomical and (linescanning) fMRI data

line scanning repository This repository contains all of the tools used during the acquisition and postprocessing of line scanning data at the Spinoza

Jurjen Heij 4 Sep 14, 2022
Code accompanying our paper Feature Learning in Infinite-Width Neural Networks

Empirical Experiments in "Feature Learning in Infinite-width Neural Networks" This repo contains code to replicate our experiments (Word2Vec, MAML) in

Edward Hu 37 Dec 14, 2022
Simple embedding based text classifier inspired by fastText, implemented in tensorflow

FastText in Tensorflow This project is based on the ideas in Facebook's FastText but implemented in Tensorflow. However, it is not an exact replica of

Alan Patterson 306 Dec 02, 2022
The code for the CVPR 2021 paper Neural Deformation Graphs, a novel approach for globally-consistent deformation tracking and 3D reconstruction of non-rigid objects.

Neural Deformation Graphs Project Page | Paper | Video Neural Deformation Graphs for Globally-consistent Non-rigid Reconstruction Aljaž Božič, Pablo P

Aljaz Bozic 134 Dec 16, 2022
Code for paper "Which Training Methods for GANs do actually Converge? (ICML 2018)"

GAN stability This repository contains the experiments in the supplementary material for the paper Which Training Methods for GANs do actually Converg

Lars Mescheder 885 Jan 01, 2023
Combinatorially Hard Games where the levels are procedurally generated

puzzlegen Implementation of two procedurally simulated environments with gym interfaces. IceSlider: the agent needs to reach and stop on the pink squa

Autonomous Learning Group 3 Jun 26, 2022
Python implementation of a live deep learning based age/gender/expression recognizer

TUT live age estimator Python implementation of a live deep learning based age/gender/smile/celebrity twin recognizer. All components use convolutiona

Heikki Huttunen 80 Nov 21, 2022
[SIGGRAPH 2020] Attribute2Font: Creating Fonts You Want From Attributes

Attr2Font Introduction This is the official PyTorch implementation of the Attribute2Font: Creating Fonts You Want From Attributes. Paper: arXiv | Rese

Yue Gao 200 Dec 15, 2022
Visualizer using audio and semantic analysis to explore BigGAN (Brock et al., 2018) latent space.

BigGAN Audio Visualizer Description This visualizer explores BigGAN (Brock et al., 2018) latent space by using pitch/tempo of an audio file to generat

Rush Kapoor 2 Nov 21, 2022
Supplementary code for the experiments described in the 2021 ISMIR submission: Leveraging Hierarchical Structures for Few Shot Musical Instrument Recognition.

Music Trees Supplementary code for the experiments described in the 2021 ISMIR submission: Leveraging Hierarchical Structures for Few Shot Musical Ins

Hugo Flores García 32 Nov 22, 2022
Space-event-trace - Tracing service for spaceteam events

space-event-trace Tracing service for TU Wien Spaceteam events. This service is

TU Wien Space Team 2 Jan 04, 2022
Source code related to the article submitted to the International Conference on Computational Science ICCS 2022 in London

POTHER: Patch-Voted Deep Learning-based Chest X-ray Bias Analysis for COVID-19 Detection Source code related to the article submitted to the Internati

Tomasz Szczepański 1 Apr 29, 2022
Implement object segmentation on images using HOG algorithm proposed in CVPR 2005

HOG Algorithm Implementation Description HOG (Histograms of Oriented Gradients) Algorithm is an algorithm aiming to realize object segmentation (edge

Leo Hsieh 2 Mar 12, 2022
PyTorch for Semantic Segmentation

PyTorch for Semantic Segmentation This repository contains some models for semantic segmentation and the pipeline of training and testing models, impl

Zijun Deng 1.7k Jan 06, 2023
This is a Keras implementation of a CNN for estimating age, gender and mask from a camera.

face-detector-age-gender This is a Keras implementation of a CNN for estimating age, gender and mask from a camera. Before run face detector app, expr

Devdreamsolution 2 Dec 04, 2021
[ICRA 2022] An opensource framework for cooperative detection. Official implementation for OPV2V.

OpenCOOD OpenCOOD is an Open COOperative Detection framework for autonomous driving. It is also the official implementation of the ICRA 2022 paper OPV

Runsheng Xu 322 Dec 23, 2022
[NeurIPS 2021] "Drawing Robust Scratch Tickets: Subnetworks with Inborn Robustness Are Found within Randomly Initialized Networks" by Yonggan Fu, Qixuan Yu, Yang Zhang, Shang Wu, Xu Ouyang, David Cox, Yingyan Lin

Drawing Robust Scratch Tickets: Subnetworks with Inborn Robustness Are Found within Randomly Initialized Networks Yonggan Fu, Qixuan Yu, Yang Zhang, S

12 Dec 11, 2022
Learning Skeletal Articulations with Neural Blend Shapes

This repository provides an end-to-end library for automatic character rigging and blend shapes generation as well as a visualization tool. It is based on our work Learning Skeletal Articulations wit

Peizhuo 504 Dec 30, 2022