A program that predicts the NBA MVP based on data from previous years.

Last update: Jan 21, 2022

NBA MVP Predictor

A machine learning model using RandomForest regression that predicts NBA MVPs from player data.

About The Project

This project uses a RandomForest regression model to predict the NBA MVP. Picking an MVP may look like a classification problem rather than a regression problem, but our approach predicts a numerical variable called MVP win share for each player. The player with the highest predicted MVP win share in a season is then predicted to be that season's MVP. Structuring the problem this way lends itself to a regression solution.
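As a minimal sketch of the selection step, the snippet below picks the top-scoring player per season from a table of predictions. The column names (season, player, predicted_share) and the sample rows are made up for illustration and are not taken from the notebook.

```python
import pandas as pd

# Hypothetical predictions: column names and values are illustrative only.
predictions = pd.DataFrame({
    "season": [2011, 2011, 2012, 2012],
    "player": ["Player A", "Player B", "Player C", "Player D"],
    "predicted_share": [0.62, 0.81, 0.40, 0.35],
})


def pick_mvps(df: pd.DataFrame) -> pd.DataFrame:
    """Return, for each season, the row with the highest predicted share."""
    # idxmax gives the index label of the maximum within each season group.
    top_rows = df.groupby("season")["predicted_share"].idxmax()
    return df.loc[top_rows, ["season", "player", "predicted_share"]].reset_index(drop=True)


predicted_mvps = pick_mvps(predictions)
print(predicted_mvps)
```

The regression output is never thresholded. Only the ranking within a season matters, which is why a continuous target works for what is ultimately a single pick per year.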

The model is trained on data from 1980-2010 and is then used to predict the MVPs for the 2011-2021 seasons.
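A year-based split like this could be sketched as follows, using scikit-learn's RandomForestRegressor. The data here is randomly generated, and the feature names, target column name, and hyperparameters are assumptions chosen for illustration, not the notebook's actual setup.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in data: every column and value here is hypothetical.
rng = np.random.default_rng(0)
n_rows = 420
data = pd.DataFrame({
    "season": rng.integers(1980, 2022, size=n_rows),  # 1980..2021 inclusive
    "points_per_game": rng.uniform(5, 35, size=n_rows),
    "team_wins": rng.integers(15, 70, size=n_rows),
})
# Made-up target in [0, 1] so the model has something to fit.
data["mvp_share"] = (data["points_per_game"] / 35) * (data["team_wins"] / 70)

features = ["points_per_game", "team_wins"]

# Split by season rather than at random, so the model never sees later years.
train = data[data["season"].between(1980, 2010)]
test = data[data["season"].between(2011, 2021)]

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(train[features], train["mvp_share"])

# Attach predictions to the held-out seasons.
test = test.assign(predicted_share=model.predict(test[features]))
```

Splitting by season instead of shuffling rows keeps the evaluation honest: every prediction for 2011-2021 comes from a model that only saw earlier seasons.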

Built With

Examples of Graphs Used

Usage

To run this model on your system, download the Jupyter notebook and the data. Then, in the notebook, change the URL assigned to the raw_mvp_data variable to the path where the data is located on your system.
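One way to point the notebook at a local copy is sketched below. It assumes the data is a CSV file read with pandas; the file name mvp_data.csv and the helper load_raw_mvp_data are hypothetical, so adjust both to match your download.

```python
from pathlib import Path

import pandas as pd

# Hypothetical location: replace with wherever you saved the data file.
DATA_PATH = Path("data") / "mvp_data.csv"


def load_raw_mvp_data(path: Path) -> pd.DataFrame:
    """Read the MVP data from a local CSV file, failing early if it is missing."""
    if not path.exists():
        raise FileNotFoundError(f"MVP data not found at {path}")
    return pd.read_csv(path)


# In the notebook, the assignment would then look like:
# raw_mvp_data = load_raw_mvp_data(DATA_PATH)
```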

Results

The model achieved an R^2 value of 0.6127 and got 8 of its 10 predictions correct.
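These two numbers measure different things: R^2 scores the predicted win share values, while the hit rate counts seasons where the predicted MVP matches the actual one. A minimal sketch of both follows, using the standard R^2 definition; the function names and sample inputs are hypothetical and do not reproduce the reported results.

```python
import numpy as np


def r_squared(y_true, y_pred) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)          # unexplained variation
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total variation
    return float(1.0 - ss_res / ss_tot)


def mvp_hit_rate(actual: dict, predicted: dict) -> float:
    """Fraction of seasons where the predicted MVP equals the actual MVP."""
    hits = sum(predicted.get(season) == winner for season, winner in actual.items())
    return hits / len(actual)


# Illustrative inputs only.
print(r_squared([0.1, 0.5, 0.9], [0.2, 0.5, 0.8]))
print(mvp_hit_rate({2011: "Player A", 2012: "Player B"},
                   {2011: "Player A", 2012: "Player C"}))
```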

Acknowledgements

Inspiration from this article: https://towardsdatascience.com/predicting-the-next-nba-mvp-using-machine-learning-62615bfcff75

Owner

Muhammad Rabee
