Project: Netflix Data Analysis and Visualization with Python

MyNetflixDashboard

Table of Contents

  1. General Info
  2. Installation
  3. Demo
  4. Usage and Main Functionalities
  5. Contributing

General Info

This is a compact data visualization project that I worked on for fun and to deepen my knowledge of visualizations and graphs using Python libraries.

From conception and design to every line of code, I built the entire dashboard myself. During this project, I was able to revisit and deepen what I had previously learned in my Data Science course of study. In particular, I was able to familiarize myself with pandas and work on my data visualization skills, which I greatly enjoyed!

The dataset I used for the Netflix data analytics task consists of my personal Netflix data, which I requested through their website. You can get access to your own data through this link. Feel free to download it and use my code to look into your own viewing behaviour :)

Installation

Requirements: Make sure you have Python 3.7+ installed on your computer. You can download the latest version of Python here.

Required packages:

  • pandas
  • dash
  • dash_bootstrap_components
  • plotly.express
  • plotly.graph_objects
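
The list above uses import names. On PyPI the corresponding packages should be pandas, dash, dash-bootstrap-components, and plotly (plotly.express and plotly.graph_objects both ship with plotly), so `pip install pandas dash dash-bootstrap-components plotly` is the likely install command. The following is a small sketch, not part of the original project, for checking which of these modules are available in the current environment; the helper name `missing_modules` is made up for illustration.

```python
from importlib.util import find_spec

# Top-level import names for the packages listed above.
# plotly.express and plotly.graph_objects are submodules of plotly.
REQUIRED_MODULES = ["pandas", "dash", "dash_bootstrap_components", "plotly"]


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are installed.")
```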

Demo

Demo_MyNetflixDashboard_komprimiert.mov

Usage and Main Functionalities

Want to know more about your own Netflix behaviour? To try the dashboard, you can request your own Netflix data. Just follow this link and Netflix will send you your personal data.

Please also refer to the comments within the code itself to get more information on the functionalities of the program.


0. Preparing the data for analysis

  • This part cleans up the original data and prepares it for analysis.
  • In the process, columns that are not needed are dropped.
  • Time data is converted into appropriate time formats and split into several columns. The days of the week are added.
  • In addition, the titles of the movies/series are split (title, season number, episode name).
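
The preparation steps above can be sketched roughly as follows. This is a minimal illustration rather than the project's actual code: the column names ("Profile Name", "Start Time", "Duration", "Title", "Device Type") are assumptions about the layout of the Netflix export, the sample rows are made up, and `prepare` is a hypothetical helper name.

```python
import pandas as pd

# Tiny made-up stand-in for the Netflix viewing export; column names are assumed.
raw = pd.DataFrame(
    {
        "Profile Name": ["Example", "Example", "Example"],
        "Start Time": [
            "2021-03-05 20:15:00",
            "2021-03-06 21:00:00",
            "2022-01-10 19:30:00",
        ],
        "Duration": ["00:45:10", "01:52:00", "00:30:00"],
        "Title": [
            "Dark: Season 1: Secrets",
            "The Irishman",
            "Dark: Season 2: Beginnings and Endings",
        ],
        "Device Type": ["TV", "TV", "Phone"],
    }
)


def prepare(df):
    """Clean the raw viewing data and derive time and title columns."""
    # Drop columns that are not needed for the analysis.
    out = df.drop(columns=["Profile Name", "Device Type"])

    # Convert the time data into proper datetime / timedelta types.
    out["Start Time"] = pd.to_datetime(out["Start Time"])
    out["Duration"] = pd.to_timedelta(out["Duration"])

    # Split the start time into several columns and add the day of the week.
    out["year"] = out["Start Time"].dt.year
    out["month"] = out["Start Time"].dt.month
    out["weekday"] = out["Start Time"].dt.day_name()

    # Split "Show: Season N: Episode" into up to three parts.
    # Movies have no season or episode, so those fields stay empty.
    parts = out["Title"].str.split(": ", n=2, expand=True)
    parts = parts.reindex(columns=range(3))
    out["title"] = parts[0]
    out["season"] = parts[1]
    out["episode"] = parts[2]
    return out


viewing = prepare(raw)
print(viewing[["title", "season", "episode", "weekday", "Duration"]])
```

Splitting on the first two ": " separators is a simplification; titles that contain a colon themselves would need extra handling.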

1. Analysis

  • This part of the code is about analyzing the data.
  • We find out how many movies or series were watched over the entire period. We also count the total number of hours Netflix was watched.
  • A pie chart shows how viewing is distributed across the days of the week.
  • In addition, the top 10 series that were watched the longest (in terms of total duration) are displayed.
  • A line chart shows Netflix viewing behavior over the years, counting the total number of hours Netflix was watched.

NetflixOverTime

2. Dash App Layout

  • Plotly's Dash is then used to create an interactive dashboard of the Netflix data.
  • The individual graphics and texts are arranged in rows and containers.
  • This part also includes a dropdown menu that the user can interact with.

3. App Callback

  • Here we connect an interactive bar chart to the Dash components.
  • The chart shows the total hours of Netflix watched per month and can be filtered by year.

MonthlyViews

Contributing

Your comments, suggestions, and contributions are welcome. Please feel free to contribute pull requests or create issues for bugs and feature requests.

Owner
Kathrin Hälbich
Data Science Student and PR & Marketing Expert