Tools for calculating and visualizing Elo-like ratings of MLB teams using Retosheet data

Overview

Overview

This project uses historical baseball games data to calculate an Elo-like rating for MLB teams based on regular season match ups. The Elo rating system was originally developed for ranking chess players but also can be applied to other individual and team sports. In particular it works well for measuring performance of baseball teams because of the large number of games played per season.

Wikipedia has more technical details on the rating system: https://en.m.wikipedia.org/wiki/Elo_rating_system

A similar analysis was done by 538: https://projects.fivethirtyeight.com/complete-history-of-mlb/

This repo consists of three primary pieces of code:

  • Parser.py imports and cleans the game-level data. It performs basic calculations, such as determining game winners.
  • Elo.py calculates Elo ratings for a given set of games data
  • Visualization.R plots the output in a variety of interesting ways

Data

Source: https://www.retrosheet.org/gamelogs/index.html

This data was compiled by Retosheet.org. The only stipulation for use of this data is prominent display of this statement:

The information used here was obtained free of charge from and is copyrighted by Retrosheet. Interested parties may contact Retrosheet at "www.retrosheet.org".

The formats of the data can be found in the Formats.py module. The Parser.py script imports the annual files, applies some cleaning and formatting, and stores the consolidated data in a SQLite database for later use.

Examples

This graph shows the distribution of each team's Elo rating over the course of the decade 2010-2019. The ratings are weighted by in-season days.

Distribution of Elo Ratings

This graph shows the progress of the 5 teams in the AL West over the 5-year period starting in 2014.

American League West Ratings 2014 - 2019

Each team was also ranked by their Elo relative to the rest of the league on each day. The graph below shows the time spent at each rank for the 5 teams in the AL West. The gradient indicates the years that those ranks occured.

American League West Ranks 2010 - 2019

Future improvements

  • Finish tuning parameters for best model performance
  • Expand analysis to earlier time periods (requires handling of teams entering/leaving league)
Owner
Lukas Owens
Lukas Owens
Set of matplotlib operations that are not trivial

Matplotlib Snippets This repository contains a set of matplotlib operations that are not trivial. Histograms Histogram with bins adapted to log scale

Raphael Meudec 1 Nov 15, 2021
DALLE-tools provided useful dataset utilities to improve you workflow with WebDatasets.

DALLE tools DALLE-tools is a github repository with useful tools to categorize, annotate or check the sanity of your datasets. Installation Just clone

11 Dec 25, 2022
Streamlit-template - A streamlit app template based on streamlit-option-menu

streamlit-template A streamlit app template for geospatial applications based on

Qiusheng Wu 41 Dec 10, 2022
Plotting library for IPython/Jupyter notebooks

bqplot 2-D plotting library for Project Jupyter Introduction bqplot is a 2-D visualization system for Jupyter, based on the constructs of the Grammar

3.4k Dec 29, 2022
Cryptocurrency Centralized Exchange Visualization

This is a simple one that uses Grafina to visualize cryptocurrency from the Bitkub exchange. This service will make a request to the Bitkub API from your wallet and save the response to Postgresql. G

Popboon Mahachanawong 1 Nov 24, 2021
Simple and fast histogramming in Python accelerated with OpenMP.

pygram11 Simple and fast histogramming in Python accelerated with OpenMP with help from pybind11. pygram11 provides functions for very fast histogram

Doug Davis 28 Dec 14, 2022
Create artistic visualisations with your exercise data (Python version)

strava_py Create artistic visualisations with your exercise data (Python version). This is a port of the R strava package to Python. Examples Facets A

Marcus Volz 53 Dec 28, 2022
Minimalistic tool to visualize how the routes to a given target domain change over time, feat. Python 3.10 & mermaid.js

Minimalistic tool to visualize how the routes to a given target domain change over time, feat. Python 3.10 & mermaid.js

Péter Ferenc Gyarmati 1 Jan 17, 2022
`charts.css.py` brings `charts.css` to Python. Online documentation and samples is available at the link below.

charts.css.py charts.css.py provides a python API to convert your 2-dimension data lists into html snippet, which will be rendered into charts by CSS,

Ray Luo 3 Sep 23, 2021
🧇 Make Waffle Charts in Python.

PyWaffle PyWaffle is an open source, MIT-licensed Python package for plotting waffle charts. It provides a Figure constructor class Waffle, which coul

Guangyang Li 528 Jan 02, 2023
A Python Binder that merge 2 files with any extension by creating a new python file and compiling it to exe which runs both payloads.

Update ! ANONFILE MIGHT NOT WORK ! About A Python Binder that merge 2 files with any extension by creating a new python file and compiling it to exe w

Vesper 15 Oct 12, 2022
Manim is an animation engine for explanatory math videos.

A community-maintained Python framework for creating mathematical animations.

12.4k Dec 30, 2022
nptsne is a numpy compatible python binary package that offers a number of APIs for fast tSNE calculation.

nptsne nptsne is a numpy compatible python binary package that offers a number of APIs for fast tSNE calculation and HSNE modelling. For more detail s

Biomedical Visual Analytics Unit LUMC - TU Delft 29 Jul 05, 2022
Flexitext is a Python library that makes it easier to draw text with multiple styles in Matplotlib

Flexitext is a Python library that makes it easier to draw text with multiple styles in Matplotlib

Tomás Capretto 93 Dec 28, 2022
Render tokei's output to interactive sunburst chart.

Render tokei's output to interactive sunburst chart.

134 Dec 15, 2022
Param: Make your Python code clearer and more reliable by declaring Parameters

Param Param is a library providing Parameters: Python attributes extended to have features such as type and range checking, dynamically generated valu

HoloViz 304 Jan 07, 2023
BGraph is a tool designed to generate dependencies graphs from Android.bp soong files.

BGraph BGraph is a tool designed to generate dependencies graphs from Android.bp soong files. Overview BGraph (for Build-Graphs) is a project aimed at

Quarkslab 10 Dec 19, 2022
Visualizing weather changes across the world using third party APIs and Python.

WEATHER FORECASTING ACROSS THE WORLD Overview Python scripts were created to visualize the weather for over 500 cities across the world at varying di

G Johnson 0 Jun 12, 2021
Geospatial Data Visualization using PyGMT

Example script to visualize topographic data, earthquake data, and tomographic data on a map

Utpal Kumar 2 Jul 30, 2022
With Holoviews, your data visualizes itself.

HoloViews Stop plotting your data - annotate your data and let it visualize itself. HoloViews is an open-source Python library designed to make data a

HoloViz 2.3k Jan 04, 2023