The repo for mlbtradetrees.com. Analyze any trade in baseball history!

Overview

MLB Trade Trees

2.0.0 Release: November 24, 2021

www.mlbtradetrees.com allows you to view the trade tree of any player in MLB history.

What is a trade tree?

A trade tree will show you the complete details of a trade made by a team. Let's use Hall Of Fame candidate Cliff Lee for some examples, as he was traded multiple times throughout his career..

Here is the simplest form of his tree: Cliff Lee Phils

Cliff Lee was traded to the Mariners in 2009, and the Phillies received 3 players in return. All players the Phillies received in return either retired or became free agents, ending the tree with them.

Let's take a look at a more complicated example:

Cliff Lee Phils

We can see the Mariners traded away Cliff Lee in 2010, receiving 4 players in return. 2 Players' lines end due to free agency and being picked up on waivers. 2 players' lines continue due to being traded away the next year. Some of those players' lines end however some continue to be traded away, so the tree grows. The tree finally ends in 2014 due to the final player hitting free agency.

Some of these trees can get pretty massive, spanning decades and dozens of trades. An example is Harry Simpson.

The Database

The transaction, team and player databases are thanks to Retrosheet. I will only update transactions when they update the database.

I have made some adjustments to the database that allows the search to go more smoothly:

Transaction database (data/sorted_transactions_final.csv)

  • Nan players involved in trades were changed to "PTBNL/Cash" (player to be named later). Most of the time you see this in a tree, it is a cash transaction.
  • Transactions of players that were released or granted free agency, then signed back with the team as their next transaction were deleted as it caused trees to end prematurely.
  • Franchise tags were added to the database to ensure that a team name change doesn't end a tree.

Team database (data/teams.csv)

  • All teams in the database received a franchise tag if they are part of the same franchise. They received a unique franchise code if they are an independant team.

Player database (data/teams.csv)

  • Nothing changed, just made a copy with the full name to easily get the user input. (static/css/searchable_players.csv)

Installing Locally

If you want to run the website locally:

  • install flask
  • install pandas
  • install JSGlue (allows Jinja to work in a js file)

Run server.py

What am I working on?

Updated Nov. 24 2021

  • Some players don't display properly due to having very old teams not listed in the teams database. Usually these are players before 1920. I just need to update the transactions database to find all teams without the franchise tag.

  • Adding stat support with pybaseball. I'd like to add total war contributed by players in a trade on the tree.

  • Searching for and filtering trees based on team, year, players in a tree, length of trees, etc.

  • Various UI enhancements, like clickable nodes to get a player's tree, collapsable nodes for easier readability.

Retail-Sim is python package to easily create synthetic dataset of retaile store.

Retailer's Sale Data Simulation Retail-Sim is python package to easily create synthetic dataset of retaile store. Simulation Model Simulator consists

Corca AI 7 Sep 30, 2022
Anomaly Detection with R

AnomalyDetection R package AnomalyDetection is an open-source R package to detect anomalies which is robust, from a statistical standpoint, in the pre

Twitter 3.5k Dec 27, 2022
pipeline for migrating lichess data into postgresql

How Long Does It Take Ordinary People To "Get Good" At Chess? TL;DR: According to 5.5 years of data from 2.3 million players and 450 million games, mo

Joseph Wong 182 Nov 11, 2022
Full automated data pipeline using docker images

Create postgres tables from CSV files This first section is only relate to creating tables from CSV files using postgres container alone. Just one of

1 Nov 21, 2021
Py-price-monitoring - A Python price monitor

A Python price monitor This project was focused on Brazil, so the monitoring is

Samuel 1 Jan 04, 2022
Pipeline to convert a haploid assembly into diploid

HapDup (haplotype duplicator) is a pipeline to convert a haploid long read assembly into a dual diploid assembly. The reconstructed haplotypes

Mikhail Kolmogorov 50 Jan 05, 2023
MDAnalysis is a Python library to analyze molecular dynamics simulations.

MDAnalysis Repository README [*] MDAnalysis is a Python library for the analysis of computer simulations of many-body systems at the molecular scale,

MDAnalysis 933 Dec 28, 2022
Office365 (Microsoft365) audit log analysis tool

Office365 (Microsoft365) audit log analysis tool The header describes it all WHY?? The first line of code was written long time before other colleague

Anatoly 1 Jul 27, 2022
ETL pipeline on movie data using Python and postgreSQL

Movies-ETL ETL pipeline on movie data using Python and postgreSQL Overview This project consisted on a automated Extraction, Transformation and Load p

Juan Nicolas Serrano 0 Jul 07, 2021
Developed for analyzing the covariance for OrcVIO

about This repo is developed for analyzing the covariance for OrcVIO environment setup platform ubuntu 18.04 using conda conda env create --file envir

Sean 1 Dec 08, 2021
Pyspark Spotify ETL

This is my first Data Engineering project, it extracts data from the user's recently played tracks using Spotify's API, transforms data and then loads it into Postgresql using SQLAlchemy engine. Data

16 Jun 09, 2022
ASOUL直播间弹幕抓取&&数据分析

ASOUL直播间弹幕抓取&&数据分析(更新中) 这些文件用于爬取ASOUL直播间的弹幕(其他直播间也可以)和其他信息,以及简单的数据分析生成。

159 Dec 10, 2022
Two phase pipeline + StreamlitTwo phase pipeline + Streamlit

Two phase pipeline + Streamlit This is an example project that demonstrates how to create a pipeline that consists of two phases of execution. In betw

Rick Lamers 1 Nov 17, 2021
Very useful and necessary functions that simplify working with data

Additional-function-for-pandas Very useful and necessary functions that simplify working with data random_fill_nan(module_name, nan) - Replaces all sp

Alexander Goldian 2 Dec 02, 2021
AWS Glue ETL Code Samples

AWS Glue ETL Code Samples This repository has samples that demonstrate various aspects of the new AWS Glue service, as well as various AWS Glue utilit

AWS Samples 1.2k Jan 03, 2023
A variant of LinUCB bandit algorithm with local differential privacy guarantee

Contents LDP LinUCB Description Model Architecture Dataset Environment Requirements Script Description Script and Sample Code Script Parameters Launch

Weiran Huang 4 Oct 25, 2022
My solution to the book A Collection of Data Science Take-Home Challenges

DS-Take-Home Solution to the book "A Collection of Data Science Take-Home Challenges". Note: Please don't contact me for the dataset. This repository

Jifu Zhao 1.5k Jan 03, 2023
PyChemia, Python Framework for Materials Discovery and Design

PyChemia, Python Framework for Materials Discovery and Design PyChemia is an open-source Python Library for materials structural search. The purpose o

Materials Discovery Group 61 Oct 02, 2022
A Python adaption of Augur to prioritize cell types in perturbation analysis.

A Python adaption of Augur to prioritize cell types in perturbation analysis.

Theis Lab 2 Mar 29, 2022
An Aspiring Drop-In Replacement for NumPy at Scale

Legate NumPy is a Legate library that aims to provide a distributed and accelerated drop-in replacement for the NumPy API on top of the Legion runtime. Using Legate NumPy you do things like run the f

Legate 502 Jan 03, 2023