Automatically Visualize any dataset, any size with a single line of code. Created by Ram Seshadri. Collaborators Welcome. Permission Granted upon Request.

Overview

AutoViz

banner

Pepy Downloads Pepy Downloads per week Pepy Downloads per month standard-readme compliant Python Versions PyPI Version PyPI License

Automatically Visualize any dataset, any size with a single line of code.

AutoViz performs automatic visualization of any dataset with one line. Give any input file (CSV, txt or json) and AutoViz will visualize it.

Table of Contents

Install

Prerequsites

To clone AutoViz, it's better to create a new environment, and install the required dependencies:

To install from PyPi:

conda create -n <your_env_name> python=3.7 anaconda
conda activate <your_env_name> # ON WINDOWS: `source activate <your_env_name>`
pip install autoviz

To install from source:

cd <AutoViz_Destination>
git clone [email protected]:AutoViML/AutoViz.git
# or download and unzip https://github.com/AutoViML/AutoViz/archive/master.zip
conda create -n <your_env_name> python=3.7 anaconda
conda activate <your_env_name> # ON WINDOWS: `source activate <your_env_name>`
cd AutoViz
pip install -r requirements.txt

Usage

Read this Medium article to know how to use AutoViz.

In the AutoViz directory, open a Jupyter Notebook and use this line to instantiate the library

from autoviz.AutoViz_Class import AutoViz_Class

AV = AutoViz_Class()

Load a dataset (any CSV or text file) into a Pandas dataframe or give the name of the path and filename you want to visualize. If you don't have a filename, you can simply assign the filename argument "" (empty string).

Call AutoViz using the filename (or dataframe) along with the separator and the name of the target variable in the input. AutoViz will do the rest. You will see charts and plots on your screen.

filename = ""
sep = ","
dft = AV.AutoViz(
    filename,
    sep=",",
    depVar="",
    dfte=None,
    header=0,
    verbose=0,
    lowess=False,
    chart_format="svg",
    max_rows_analyzed=150000,
    max_cols_analyzed=30,
)

AV.AutoViz is the main plotting function in AV.

Notes:

  • AutoViz will visualize any sized file using a statistically valid sample.
  • COMMA is assumed as default separator in file. But you can change it.
  • Assumes first row as header in file but you can change it.
  • verbose option
    • if 0, display minimal information but displays charts on your notebook
    • if 1, print extra information on the notebook and also display charts
    • if 2, will not display any charts, it will simply save them in your local machine under AutoViz_Plots directory

API

Arguments

  • filename - Make sure that you give filename as empty string ("") if there is no filename associated with this data and you want to use a dataframe, then use dfte to give the name of the dataframe. Otherwise, fill in the file name and leave dfte as empty string. Only one of these two is needed to load the data set.
  • sep - this is the separator in the file. It can be comma, semi-colon or tab or any value that you see in your file that separates each column.
  • depVar - target variable in your dataset. You can leave it as empty string if you don't have a target variable in your data.
  • dfte - this is the input dataframe in case you want to load a pandas dataframe to plot charts. In that case, leave filename as an empty string.
  • header - the row number of the header row in your file. If it is the first row, then this must be zero.
  • verbose - it has 3 acceptable values: 0, 1 or 2. With zero, you get all charts but limited info. With 1 you get all charts and more info. With 2, you will not see any charts but they will be quietly generated and save in your local current directory under the AutoViz_Plots directory which will be created. Make sure you delete this folder periodically, otherwise, you will have lots of charts saved here if you used verbose=2 option a lot.
  • lowess - this option is very nice for small datasets where you can see regression lines for each pair of continuous variable against the target variable. Don't use this for large data sets (that is over 100,000 rows)
  • chart_format - this can be SVG, PNG or JPG. You will get charts generated and saved in this format if you used verbose=2 option. Very useful for generating charts and using them later.
  • max_rows_analyzed - limits the max number of rows that is used to display charts. If you have a very large data set with millions of rows, then use this option to limit the amount of time it takes to generate charts. We will take a statistically valid sample.
  • max_cols_analyzed - limits the number of continuous vars that can be analyzed

Maintainers

Contributing

See the contributing file!

PRs accepted.

License

Apache License, Version 2.0

DISCLAIMER

This project is not an official Google project. It is not supported by Google and Google specifically disclaims all warranties as to its quality, merchantability, or fitness for a particular purpose.

Owner
AutoViz and Auto_ViML
Automated Machine Learning: Build Variant Interpretable Machine Learning models. Project Created by Ram Seshadri.
AutoViz and Auto_ViML
Material for dataviz course at university of Bordeaux

Material for dataviz course at university of Bordeaux

Nicolas P. Rougier 50 Jul 17, 2022
A curated list of awesome Dash (plotly) resources

Awesome Dash A curated list of awesome Dash (plotly) resources Dash is a productive Python framework for building web applications. Written on top of

Luke Singham 1.7k Dec 26, 2022
Simple python implementation with matplotlib to manually fit MIST isochrones to Gaia DR2 color-magnitude diagrams

Simple python implementation with matplotlib to manually fit MIST isochrones to Gaia DR2 color-magnitude diagrams

Karl Jaehnig 7 Oct 22, 2022
A napari plugin for visualising and interacting with electron cryotomograms.

napari-tomoslice A napari plugin for visualising and interacting with electron cryotomograms. Installation You can install napari-tomoslice via pip: p

3 Jan 03, 2023
Automatically visualize your pandas dataframe via a single print! 📊 💡

A Python API for Intelligent Visual Discovery Lux is a Python library that facilitate fast and easy data exploration by automating the visualization a

Lux 4.3k Dec 28, 2022
A minimalistic wrapper around PyOpenGL to save development time

glpy glpy is pyOpenGl wrapper which lets you work with pyOpenGl easily.It is not meant to be a replacement for pyOpenGl but runs on top of pyOpenGl to

Abhinav 9 Apr 02, 2022
Streaming pivot visualization via WebAssembly

Perspective is an interactive visualization component for large, real-time datasets. Originally developed for J.P. Morgan's trading business, Perspect

The Fintech Open Source Foundation (www.finos.org) 5.1k Dec 27, 2022
HW 2: Visualizing interesting datasets

HW 2: Visualizing interesting datasets Check out the project instructions here! Mean Earnings per Hour for Males and Females My first graph uses data

7 Oct 27, 2021
A filler visualizer built using python

filler-visualizer 42 filler のログをビジュアライズしてスポーツさながら楽しむことができます! Usage (標準入力でvisualizer.pyに渡せばALL OK) 1. 既にあるログをビジュアライズする $ ./filler_vm -t 3 -p1 john_fill

Takumi Hara 1 Nov 04, 2021
This is a sorting visualizer made with Tkinter.

Sorting-Visualizer This is a sorting visualizer made with Tkinter. Make sure you've installed tkinter in your system to use this visualizer pip instal

Vishal Choubey 7 Jul 06, 2022
Interactive chemical viewer for 2D structures of small molecules

👀 mols2grid mols2grid is an interactive chemical viewer for 2D structures of small molecules, based on RDKit. ➡️ Try the demo notebook on Google Cola

Cédric Bouysset 154 Dec 26, 2022
Generate the report for OCULTest.

Sample report generated in this function Usage example from utils.gen_report import generate_report if __name__ == '__main__': # def generate_rep

Philip Guo 1 Mar 10, 2022
Fast scatter density plots for Matplotlib

About Plotting millions of points can be slow. Real slow... 😴 So why not use density maps? ⚡ The mpl-scatter-density mini-package provides functional

Thomas Robitaille 473 Dec 12, 2022
paintable GitHub contribute table

githeart paintable github contribute table how to use: Functions key color select 1,2,3,4,5 clear c drawing mode mode on turn off e print paint matrix

Bahadır Araz 27 Nov 24, 2022
A high-level plotting API for pandas, dask, xarray, and networkx built on HoloViews

hvPlot A high-level plotting API for the PyData ecosystem built on HoloViews. Build Status Coverage Latest dev release Latest release Docs What is it?

HoloViz 694 Jan 04, 2023
Log visualizer for whirl-framework

Lumberjack Log visualizer for whirl-framework Установка pip install -r requirements.txt Как пользоваться python3 lumberjack.py -l путь до лога -o

Vladimir Malinovskii 2 Dec 19, 2022
These data visualizations were created for my introductory computer science course using Python

Homework 2: Matplotlib and Data Visualization Overview These data visualizations were created for my introductory computer science course using Python

Sophia Huang 12 Oct 20, 2022
Automatization of BoxPlot graph usin Python MatPlotLib and Excel

BoxPlotGraphAutomation Automatization of BoxPlot graph usin Python / Excel. This file is an automation of BoxPlot-Graph using python graph library mat

EricAugustin 1 Feb 07, 2022
🎨 Python3 binding for `@AntV/G2Plot` Plotting Library .

PyG2Plot 🎨 Python3 binding for @AntV/G2Plot which an interactive and responsive charting library. Based on the grammar of graphics, you can easily ma

hustcc 990 Jan 05, 2023
Frbmclust - Clusterize FRB profiles using hierarchical clustering, plot corresponding parameters distributions

frbmclust Getting Started Clusterize FRB profiles using hierarchical clustering,

3 May 06, 2022