Resources for teaching & learning practical data visualization with Python.

Practical Data Visualization with Python

Overview

All views expressed on this site are my own and do not represent the opinions of any entity with which I have been, am now, or will be affiliated.

This repository contains all materials for a lecture / seminar I teach on practical data visualization with Python. By "practical" I mean that the materials do not focus on one particular library or data visualization method; rather, my goal is to equip the reader with the tools, heuristics, and methods needed to handle a wide variety of data visualization problems.

If you have questions, comments, or suggested alterations to these materials, please open an issue here on GitHub. Also, don't hesitate to reach out via LinkedIn.

Outline of Materials

Below is a brief outline of the four sections of this seminar, along with notebook links and an example visualization from each section. Each section has its own notebook of Python code containing all the materials for that section. Each notebook starts with a few setup steps (mostly package imports and data prep) that are almost identical across notebooks; the content for the section follows directly after. For information about the data used in these materials, see the data_prep_nb.ipynb notebook, the easy-to-view version of which is hosted here.

Section 1: Why We Visualize

Here is the link to the easy-to-view notebook for this section of the material.
Here is the link to the GitHub-hosted notebook for this section of the material.

  1. The power of visual data representation and storytelling.
  2. A few principles and heuristics of visualization.
  3. The building blocks of visualization explored.

Example Visualization from this Section:

Section 2: Overview of Python Visualization Landscape

Here is the link to the easy-to-view notebook for this section of the material.
Here is the link to the GitHub-hosted notebook for this section of the material.

  1. Intro to the visualization ecosystem: Python's Tower of Babel.
  2. Smorgasbord of packages explored through a single example viz.
  3. Quick & dirty (and subjective) heuristics for picking a visualization package.

Example Visualization from this Section:

Section 3: Statistical Visualization in the Wild

Here is the link to the easy-to-view notebook for this section of the material.
Here is the link to the GitHub-hosted notebook for this section of the material.

  1. Example business use case of data visualization:
    1. Observational:
      • mean, median, and variance
      • distributions
    2. Inferential:
      • parametric tests
      • non-parametric tests
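
As a rough illustration of the observational and inferential topics listed above, here is a minimal sketch using only the Python standard library. It is not taken from the section's notebook: the helper names `describe` and `permutation_test`, and the two small data groups, are hypothetical and chosen only for illustration. The permutation test stands in as one example of a non-parametric test; the notebook itself may use different tests and libraries.

```python
import random
import statistics


def describe(values):
    """Return the observational summaries named in the outline."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "variance": statistics.variance(values),  # sample variance (n - 1)
    }


def permutation_test(a, b, n_resamples=10_000, seed=0):
    """Estimate a two-sided p-value for a difference in means by shuffling labels."""
    rng = random.Random(seed)
    observed = abs(statistics.mean(a) - statistics.mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        # Split the shuffled pool into two groups of the original sizes.
        diff = abs(
            statistics.mean(pooled[: len(a)]) - statistics.mean(pooled[len(a):])
        )
        if diff >= observed:
            hits += 1
    return hits / n_resamples


# Hypothetical measurements for two groups (illustration only).
group_a = [12.1, 11.8, 12.5, 13.0, 12.2, 11.9]
group_b = [13.4, 13.9, 12.8, 14.1, 13.6, 13.2]

summary_a = describe(group_a)
summary_b = describe(group_b)
p_value = permutation_test(group_a, group_b, n_resamples=2000)
print(summary_a, summary_b, p_value)
```

A permutation test makes no distributional assumption about the data, which is why it is used here as the non-parametric example; a parametric counterpart such as a t-test would typically come from a third-party package.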

Example Visualization from this Section:

Section 4: Library Deep-Dive (Plotly)

Here is the link to the easy-to-view notebook for this section of the material.
Here is the link to the GitHub-hosted notebook for this section of the material.

  1. Quick and simple data visualizations with Plotly Express.
  2. Additional control and complexity with base Plotly.

Example Visualization from this Section:

Homework Exercises

There is a homework assignment associated with these materials for those interested. Given the open-ended nature of the homework, there is no answer key. That said, if you're working through it and would like some feedback, feel free to reach out to me via LinkedIn.

Here is the link to the easy-to-view homework notebook.
Here is the link to the GitHub-hosted version of the homework notebook.

Setup Instructions

  • clone this repository
  • create a virtual environment using python3 -m venv env
  • activate that virtual environment using source env/bin/activate
  • install needed packages using pip install -r requirements.txt
  • run an instance of JupyterLab out of your virtual env using env/bin/jupyter-lab
  • open and run the four main content notebooks for this course, one for each section:
    • part_1_main_nb.ipynb
    • part_2_main_nb.ipynb
    • part_3_main_nb.ipynb
    • part_4_main_nb.ipynb
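
After completing the steps above, a quick sanity check can confirm that the four notebooks are where you expect them. The sketch below is a hypothetical helper, not part of the repository; it assumes the notebooks sit at the top level of the cloned repository, which the instructions above do not state explicitly.

```python
from pathlib import Path

# The four main notebooks named in the setup instructions.
EXPECTED_NOTEBOOKS = [
    "part_1_main_nb.ipynb",
    "part_2_main_nb.ipynb",
    "part_3_main_nb.ipynb",
    "part_4_main_nb.ipynb",
]


def missing_notebooks(repo_dir="."):
    """Return the expected notebook names that are not found in repo_dir.

    Assumes the notebooks live at the top level of the repository.
    """
    root = Path(repo_dir)
    return [name for name in EXPECTED_NOTEBOOKS if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_notebooks()
    if missing:
        print("Missing notebooks:", ", ".join(missing))
    else:
        print("All four section notebooks found.")
```

Run it from the root of the cloned repository; an empty result means all four files are present.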
Owner
Paul Jeffries
Trained in intl. econ; started in mortgage finance; dabbled in equities & crypto; now working in banking. I enjoy challenging questions regarding value & risk.