Interactive plotting for Pandas using Vega-Lite

Overview

pdvega: Vega-Lite plotting for Pandas Dataframes


pdvega is a library that lets you quickly create interactive Vega-Lite plots from Pandas dataframes, using an API that is nearly identical to Pandas' built-in visualization tools. It is designed for easy use within the Jupyter notebook.

Pandas currently has some basic plotting capabilities based on matplotlib. So, for example, you can create a scatter plot this way:

import numpy as np
import pandas as pd

df = pd.DataFrame({'x': np.random.randn(100), 'y': np.random.randn(100)})
df.plot.scatter(x='x', y='y')

matplotlib scatter output

The goal of pdvega is that any time you use dataframe.plot, you'll be able to replace it with dataframe.vgplot and instead get a similar (but prettier and more interactive) visualization output in Vega-Lite that you can easily export to share or customize:

import pdvega  # import adds vgplot attribute to pandas

df.vgplot.scatter(x='x', y='y')

vega-lite scatter output

The above image is a static screenshot of the interactive output; please see the Documentation for a full set of live usage examples.
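As a rough illustration of how an import can add a plotting attribute to every dataframe, the sketch below uses pandas' public accessor-registration API. This is not necessarily how pdvega itself attaches vgplot; the accessor name demo_vgplot and the class DemoVegaAccessor are hypothetical, and the method returns a hand-built Vega-Lite-style dictionary rather than a rendered chart.

```python
import pandas as pd


# "demo_vgplot" is a hypothetical accessor name chosen for this sketch;
# it is not part of pdvega or pandas.
@pd.api.extensions.register_dataframe_accessor("demo_vgplot")
class DemoVegaAccessor:
    def __init__(self, pandas_obj):
        # pandas passes in the DataFrame the accessor was reached from
        self._df = pandas_obj

    def scatter(self, x, y):
        """Return a minimal Vega-Lite-style spec (a plain dict) for a scatter plot."""
        return {
            # Embed only the two plotted columns as a list of row records
            "data": {"values": self._df[[x, y]].to_dict(orient="records")},
            "mark": "point",
            "encoding": {
                "x": {"field": x, "type": "quantitative"},
                "y": {"field": y, "type": "quantitative"},
            },
        }


demo_df = pd.DataFrame({"x": [0.1, 0.5, 0.9], "y": [1.0, 0.2, 0.7]})
demo_spec = demo_df.demo_vgplot.scatter(x="x", y="y")
```

Once the class is registered, every DataFrame exposes the attribute, which mirrors the way df.plot and df.vgplot are reached in the examples above.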

Installation

You can get started with pdvega using pip:

$ pip install jupyter pdvega
$ jupyter nbextension install --sys-prefix --py vega3

The first line installs pdvega and its dependencies; the second installs the Jupyter extension that allows plots to be displayed in the Jupyter notebook. For more information on installation and dependencies, see the Installation docs.

Why Vega-Lite?

When working with data, one of the biggest challenges is ensuring reproducibility of results. When you create a figure and export it to PNG or PDF, the data become baked into the rendering in a way that is difficult or impossible for others to extract. Vega and Vega-Lite change this: instead of packaging a figure by encoding its pixel values, they package a figure by describing, in a declarative manner, the relationship between data values and visual encodings through a JSON specification.

This means that the Vega-Lite figures produced by pdvega are portable: you can send someone the resulting JSON specification and they can choose whether to render it interactively online, convert it to a PNG or EPS for static publication, or even enhance and extend the figure to learn more about the data.
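To make the idea of a portable specification concrete, here is a minimal hand-written sketch of a Vega-Lite-style scatter spec using only the standard library. It is not the exact output of pdvega, and the variable names are illustrative. It shows that the data values travel inside the JSON and can be recovered by whoever receives it.

```python
import json
import random

random.seed(0)

# Plain row records; calling df.to_dict(orient="records") on a DataFrame
# produces data of the same shape.
records = [{"x": random.gauss(0, 1), "y": random.gauss(0, 1)} for _ in range(5)]

# The spec declares how data fields map to visual channels,
# rather than storing rendered pixels.
spec = {
    "data": {"values": records},
    "mark": "point",
    "encoding": {
        "x": {"field": "x", "type": "quantitative"},
        "y": {"field": "y", "type": "quantitative"},
    },
}

# Serialize to a string, as you would before sending the figure to someone
payload = json.dumps(spec)

# The recipient can parse the spec and read the underlying data back out
received = json.loads(payload)
recovered = received["data"]["values"]
```

The recipient can extract the data, change the mark, or add encodings by editing the parsed dictionary before rendering it, which is not possible with a PNG or PDF export.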

pdvega is a step in bringing this vision of figure portability and reproducibility to the Python world.

Relationship to Altair

Altair is a project that seeks to design an intuitive declarative API for generating Vega-Lite and Vega visualizations, using Pandas dataframes as data sources.

By contrast, pdvega seeks not to design new visualization APIs, but to use the existing DataFrame.plot visualization API and output visualizations with Vega/Vega-Lite rather than with matplotlib.

In this respect, pdvega is quite similar in spirit to the now-defunct mpld3 project, though the scope is smaller and (hopefully) much more manageable.
