Employee Turnover Analysis

Last update: Feb 13, 2022

Overview

Employee Turnover Analysis

Submission to the DataCamp competition "Can you help reduce employee turnover?" (https://app.datacamp.com/workspace/w/e9089f70-9c9a-4b5e-b1c2-94144ac12cd4)

Background

You work for the human capital department of a large corporation. The Board is worried about the relatively high turnover, and your team must look into ways to reduce the number of employees leaving the company. The team needs to understand better the situation, which employees are more likely to leave, and why. Once it is clear what variables impact employee churn, you can present your findings along with your ideas on how to attack the problem.

Content

The report covers the following questions:

Which department has the highest employee turnover? Which one has the lowest?
Which variables seem to be better predictors of employee departure?
How can the board reduce employee turnover?

Owner

Jannik Wiedenhaupt

MS in Data Science at Columbia University

GitHub Repository

apricot implements submodular optimization for the purpose of selecting subsets of massive data sets to train machine learning models quickly.

Please consider citing the manuscript if you use apricot in your academic work! You can find more thorough documentation here. apricot implements subm

457 Dec 20, 2022

BinTuner is a cost-efficient auto-tuning framework, which can deliver a near-optimal binary code that reveals much more differences than -Ox settings.

BinTuner is a cost-efficient auto-tuning framework, which can deliver a near-optimal binary code that reveals much more differences than -Ox settings. it also can assist the binary code analysis rese

42 Dec 16, 2022

The official pytorch implementation of ViTAE: Vision Transformer Advanced by Exploring Intrinsic Inductive Bias

104 Nov 27, 2022

A collection of robust and fast processing tools for parsing and analyzing web archive data.

ChatNoir Resiliparse A collection of robust and fast processing tools for parsing and analyzing web archive data. Resiliparse is part of the ChatNoir

24 Nov 29, 2022

Investigating EV charging data

Investigating EV charging data Introduction: Got an opportunity to work with a home monitoring technology company over the last 6 months whose goal wa

2 Apr 07, 2022

MeSH2Matrix - A set of Python codes for the generation of biomedical ontologies from the MeSH keywords of the PubMed scholarly publications

A set of Python codes for the generation of biomedical ontologies from the MeSH keywords of the PubMed scholarly publications

6 Nov 30, 2022

Create HTML profiling reports from pandas DataFrame objects

Pandas Profiling Documentation | Slack | Stack Overflow Generates profile reports from a pandas DataFrame. The pandas df.describe() function is great

10k Jan 01, 2023

Additional tools for particle accelerator data analysis and machine information

PyLHC Tools This package is a collection of useful scripts and tools for the Optics Measurements and Corrections group (OMC) at CERN. Documentation Au

3 Apr 13, 2022

Single-Cell Analysis in Python. Scales to >1M cells.

Scanpy – Single-Cell Analysis in Python Scanpy is a scalable toolkit for analyzing single-cell gene expression data built jointly with anndata. It inc

1.4k Jan 05, 2023

Analyse the limit order book in seconds. Zoom to tick level or get yourself an overview of the trading day.

Analyse the limit order book in seconds. Zoom to tick level or get yourself an overview of the trading day. Correlate the market activity with the Apple Keynote presentations.

2 Jan 04, 2022

ETL flow framework based on Yaml configs in Python

ETL framework based on Yaml configs in Python A light framework for creating data streams. Setting up streams through configuration in the Yaml file.

18 Jul 06, 2022

VevestaX is an open source Python package for ML Engineers and Data Scientists.

VevestaX Track failed and successful experiments as well as features. VevestaX is an open source Python package for ML Engineers and Data Scientists.

24 Dec 14, 2022

A pipeline that creates consensus sequences from a Nanopore reads. I

A pipeline that creates consensus sequences from a Nanopore reads. It clusters reads that are similar to each other and creates a consensus that is then identified using BLAST.

2 May 15, 2022

Python scripts aim to use a Random Forest machine learning algorithm to predict the water affinity of Metal-Organic Frameworks

The following Python scripts aim to use a Random Forest machine learning algorithm to predict the water affinity of Metal-Organic Frameworks (MOFs). The training set is extracted from the Cambridge S

1 Jan 09, 2022

Retentioneering: product analytics, data-driven customer journey map optimization, marketing analytics, web analytics, transaction analytics, graph visualization, and behavioral segmentation with customer segments in Python.

What is Retentioneering? Retentioneering is a Python framework and library to assist product analysts and marketing analysts as it makes it easier to

581 Jan 07, 2023

Employee Turnover Analysis

Related tags

Overview

Employee Turnover Analysis

Background

Content

Owner

Jannik Wiedenhaupt

apricot implements submodular optimization for the purpose of selecting subsets of massive data sets to train machine learning models quickly.

BinTuner is a cost-efficient auto-tuning framework, which can deliver a near-optimal binary code that reveals much more differences than -Ox settings.

The official pytorch implementation of ViTAE: Vision Transformer Advanced by Exploring Intrinsic Inductive Bias

A collection of robust and fast processing tools for parsing and analyzing web archive data.

Investigating EV charging data

MeSH2Matrix - A set of Python codes for the generation of biomedical ontologies from the MeSH keywords of the PubMed scholarly publications

Create HTML profiling reports from pandas DataFrame objects

Additional tools for particle accelerator data analysis and machine information

Single-Cell Analysis in Python. Scales to >1M cells.

Analyse the limit order book in seconds. Zoom to tick level or get yourself an overview of the trading day.

ETL flow framework based on Yaml configs in Python

VevestaX is an open source Python package for ML Engineers and Data Scientists.

A pipeline that creates consensus sequences from a Nanopore reads. I

Python scripts aim to use a Random Forest machine learning algorithm to predict the water affinity of Metal-Organic Frameworks

Retentioneering: product analytics, data-driven customer journey map optimization, marketing analytics, web analytics, transaction analytics, graph visualization, and behavioral segmentation with customer segments in Python.

Very basic but functional Kakuro solver written in Python.

Package for decomposing EMG signals into motor unit firings, as used in Formento et al 2021.

fds is a tool for Data Scientists made by DAGsHub to version control data and code at once.

Python Project on Pro Data Analysis Track

Jupyter notebooks for the book "The Elements of Statistical Learning".