Preparing a data preprocessing and feature engineering script for a machine learning pipeline

Last update: Apr 21, 2022

Related tags

Machine Learning Feature-Engineering

Overview

Feature-Engineering

A data preprocessing and feature engineering script needs to be prepared for a machine learning pipeline.

Once the dataset has been passed through this script, it is expected to be ready for modeling.
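As a rough illustration of what such a script could look like, here is a minimal sketch using pandas. The function name `titanic_data_prep`, the imputation choices (median age, most frequent port), the `HasCabin` flag, and the tiny sample frame are all assumptions made for this example; they are not taken from the repository.

```python
import pandas as pd


def titanic_data_prep(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical preprocessing step; every choice below is an assumption."""
    out = df.copy()
    # Fill missing ages with the median age (assumed imputation strategy).
    out["Age"] = out["Age"].fillna(out["Age"].median())
    # Fill missing embarkation ports with the most frequent port.
    out["Embarked"] = out["Embarked"].fillna(out["Embarked"].mode()[0])
    # Record whether a cabin number is present before dropping the raw column.
    out["HasCabin"] = out["Cabin"].notna().astype(int)
    # Drop identifier and free-text columns that are not used directly.
    out = out.drop(columns=["PassengerId", "Name", "Ticket", "Cabin"])
    # One-hot encode the categorical columns; drop_first removes redundant dummies.
    out = pd.get_dummies(out, columns=["Sex", "Embarked"], drop_first=True, dtype=int)
    return out


# Tiny made-up sample that only mimics the column layout of the dataset.
sample = pd.DataFrame(
    {
        "PassengerId": [1, 2, 3, 4],
        "Survived": [0, 1, 1, 0],
        "Pclass": [3, 1, 3, 2],
        "Name": ["A, Mr. X", "B, Mrs. Y", "C, Miss. Z", "D, Mr. W"],
        "Sex": ["male", "female", "female", "male"],
        "Age": [22.0, 38.0, None, 35.0],
        "SibSp": [1, 1, 0, 0],
        "Parch": [0, 0, 0, 0],
        "Ticket": ["T1", "T2", "T3", "T4"],
        "Fare": [7.25, 71.28, 7.92, 8.05],
        "Cabin": [None, "C85", None, None],
        "Embarked": ["S", "C", None, "S"],
    }
)

prepared = titanic_data_prep(sample)
print(prepared.columns.tolist())
```

After this step the frame contains only numeric columns and no missing values, which is the state most scikit-learn style estimators expect.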

Dataset Story

The dataset describes the passengers who were aboard the Titanic when it sank. It consists of 768 observations and 12 variables. The target variable is "Survived": 1 indicates that the passenger survived, 0 indicates that the passenger did not survive.
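The target encoding described above can be checked and separated from the features with a few lines of pandas. This is only a sketch: the helper name `split_target` and the five-row frame are hypothetical and exist purely for illustration.

```python
import pandas as pd


def split_target(df: pd.DataFrame, target: str = "Survived"):
    """Hypothetical helper: validate the binary target and split X from y."""
    values = set(df[target].unique())
    # The target is expected to contain only 0 (did not survive) and 1 (survived).
    if not values <= {0, 1}:
        raise ValueError(f"Unexpected target values: {values}")
    y = df[target]
    X = df.drop(columns=[target])
    return X, y


# Made-up rows, used only to show the mechanics.
df = pd.DataFrame({"Survived": [0, 1, 1, 0, 0], "Pclass": [3, 1, 3, 2, 3]})
X, y = split_target(df)

# With a 0/1 target, the mean is the share of survivors.
survival_rate = y.mean()
print(survival_rate)
```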

Variables

PassengerId: ID of the passenger
Survived: Survival status (0: not survived, 1: survived)
Pclass: Ticket class (1: 1st class (upper), 2: 2nd class (middle), 3: 3rd class (lower))
Name: Name of the passenger
Sex: Gender of the passenger (male, female)
Age: Age in years
SibSp: Number of siblings/spouses aboard the Titanic
- Sibling = Brother, sister, stepbrother, stepsister
- Spouse = Husband, wife (mistresses and fiancés were ignored)
Parch: Number of parents/children aboard the Titanic
- Parent = Mother, father
- Child = Daughter, son, stepdaughter, stepson
- Some children travelled only with a nanny, therefore Parch = 0 for them.
Ticket: Ticket number
Fare: Passenger fare
Cabin: Cabin number
Embarked: Port of embarkation (C = Cherbourg, Q = Queenstown, S = Southampton)
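The variable definitions above suggest a few derived features. The sketch below uses plain Python on a single record so that each definition stays visible; the feature names (`FamilySize`, `IsAlone`, `Title`, `HasCabin`), the helper functions, and the assumption that names follow a "Surname, Title. Given names" pattern are all illustrative choices, not something the repository confirms.

```python
import re


def extract_title(name: str) -> str:
    """Return the honorific between the comma and the first period, if any."""
    match = re.search(r",\s*([^.]+)\.", name)
    return match.group(1).strip() if match else "Unknown"


def derive_features(row: dict) -> dict:
    """Hypothetical feature derivation for one passenger record."""
    # Siblings/spouses plus parents/children plus the passenger themselves.
    family_size = row["SibSp"] + row["Parch"] + 1
    return {
        "FamilySize": family_size,
        # Caveat: children travelling only with a nanny have Parch = 0,
        # so this flag can mark them as alone.
        "IsAlone": int(family_size == 1),
        "Title": extract_title(row["Name"]),
        # 1 if a cabin number was recorded, 0 otherwise.
        "HasCabin": int(bool(row.get("Cabin"))),
    }


# Example record, assumed to follow the column layout listed above.
passenger = {"Name": "Braund, Mr. Owen Harris", "SibSp": 1, "Parch": 0, "Cabin": None}
features = derive_features(passenger)
print(features)
```

The same function could be applied row by row to a pandas DataFrame, although vectorised column operations would usually be faster on the full dataset.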

REFERENCE: Data Science and ML Boot Camp, 2021, Veri Bilimi Okulu (https://www.veribilimiokulu.com/)

Owner

kemalgunay
