Data Inspector is an open-source python library that brings 15++ types of different functions to make EDA, data cleaning easier.

Overview

Data Inspector

Author MIT Contributions welcome Stars Downloads

Data Inspector is an open-source python library that brings 15 types of different functions to make EDA, data cleaning easier.

Author: Kazi Amit Hasan

Project Description:

Data Inspector brings 15++ essential exploratory data analysis, data cleaning automations to make a dataset understandable. This is a perfect tool to get started with you data.

Latest Added Feature:

Added regplots in the library

Installation:

pip install data-inspector

Package available at https://pypi.org/project/data-inspector/

Available automation:

  1. Line plot : line_plot(data, x_data, y_data, x_label="", y_label="", title="")
  2. Skew feature: plot_skewed_feature(data, column)
  3. Showing data distribution: show_distribution(data, column)
  4. Scatter plot: plot_scatter(data,x_data, y_data)
  5. Correlation plot: plot_correlation(data)
  6. Create histogram: histogram(data,column, x_label, y_label, title)
  7. Create bar plot: plot_bar(data, column, xlabel, ylabel, title)
  8. Create boxplots of all features: box_plot(data)
  9. Checking dataset's shape: datasetShape(data)
  10. Get dataset's diagnostic plots: diagnostic_plots(data, variable)
  11. Divide numerical and categorical features: divideFeatures(data)
  12. Fill NaN values: fillNan(data, column, value)
  13. Get pearson's correlation between two variables: get_correlation(column_1, column_2, data)
  14. Plotting kde plots: plot_cont_kde(data, var)
  15. Automatic calculating the missing values and their percentage along with visualization : calculating_missing_values(data)
  16. Regression plot with 95% CI : plot_regplot(data,x_data, y_data)

Tutorial:

Link: https://github.com/AmitHasanShuvo/data-inspector/blob/main/notebook/example%20notebook.ipynb
Colab link: https://colab.research.google.com/drive/1mj9gz2XyQprSYdKMUKlKkJ9Qi8XmleHW?usp=sharing

Some visualizations:



How to cite:

@online{data-inspector,
title={data-inspector},
url={https://pypi.org/project/data-inspector/},
urldate = {2021-08-21}, 
publisher={Kazi Amit Hasan}
}

Future Works:

  1. Add some automations for time series data.

How to contribute:

Any contribution would be highly appreciated. Kindly go through the guidelines for contributing in github.

You might also like...
Sphinx-performance - CLI tool to measure the build time of different, free configurable Sphinx-Projects
Sphinx-performance - CLI tool to measure the build time of different, free configurable Sphinx-Projects

CLI tool to measure the build time of different, free configurable Sphinx-Projec

A module filled with many useful functions and modules in various subjects.
A module filled with many useful functions and modules in various subjects.

Usefulpy Check out the Usefulpy site Usefulpy site is not always up to date Download and Import download and install with with pip download usefulpyth

Template repo to quickly make a tested and documented GitHub action in Python with Poetry

Python + Poetry GitHub Action Template Getting started from the template Rename the src/action_python_poetry package. Globally replace instances of ac

Make posters from Markdown files.
Make posters from Markdown files.

MkPosters Create posters using Markdown. Supports icons, admonitions, and LaTeX mathematics. At the moment it is restricted to the specific layout of

A tutorial for people to run synthetic data replica's from source healthcare datasets
A tutorial for people to run synthetic data replica's from source healthcare datasets

Synthetic-Data-Replica-for-Healthcare Description What is this? A tailored hands-on tutorial showing how to use Python to create synthetic data replic

A Python library for setting up projects using tabular data.

A Python library for setting up projects using tabular data. It can create project folders, standardize delimiters, and convert files to CSV from either individual files or a directory.

Source Code for 'Practical Python Projects' (video) by Sunil Gupta
Source Code for 'Practical Python Projects' (video) by Sunil Gupta

Apress Source Code This repository accompanies %Practical Python Projects by Sunil Gupta (Apress, 2021). Download the files as a zip using the green b

Automatically open a pull request for repositories that have no CONTRIBUTING.md file

automatic-contrib-prs Automatically open a pull request for repositories that have no CONTRIBUTING.md file for a targeted set of repositories. What th

The source code that powers readthedocs.org

Welcome to Read the Docs Purpose Read the Docs hosts documentation for the open source community. It supports Sphinx docs written with reStructuredTex

Releases(eda)
  • eda(Aug 19, 2021)

    Data Inspector brings a total of 15 essential exploratory data analysis, data cleaning automations to make a dataset understandable. This is a perfect tool to get started with you data.

    PYPI link: https://pypi.org/project/data-inspector/

    Source code(tar.gz)
    Source code(zip)
Owner
Kazi Amit Hasan
ML Engineer at ACI Limited | Kaggle Competition Expert (x4) | Researcher
Kazi Amit Hasan
Easy OpenAPI specs and Swagger UI for your Flask API

Flasgger Easy Swagger UI for your Flask API Flasgger is a Flask extension to extract OpenAPI-Specification from all Flask views registered in your API

Flasgger 3.1k Jan 05, 2023
Paper and Code for "Curriculum Learning by Optimizing Learning Dynamics" (AISTATS 2021)

Curriculum Learning by Optimizing Learning Dynamics (DoCL) AISTATS 2021 paper: Title: Curriculum Learning by Optimizing Learning Dynamics [pdf] [appen

Tianyi Zhou 15 Dec 06, 2022
[Unofficial] Python PEP in EPUB format

PEPs in EPUB format This is a unofficial repository where I stock all valid PEPs in the EPUB format. Repository Cloning git clone --recursive Mickaël Schoentgen 9 Oct 12, 2022

Bring RGB to life in Neovim

Bring RGB to life in Neovim Change your RGB devices' color depending on Neovim's mode. Fast and asynchronous plugin to live your vim-life to the fulle

Antoine 40 Oct 27, 2022
charcade is a string manipulation library that can animate, color, and bruteforce strings

charcade charcade is a string manipulation library that can animate, color, and bruteforce strings. Features Animating text for CLI applications with

Aaron 8 May 23, 2022
A simple tutorial to get you started with Discord and it's Python API

Hello there Feel free to fork and star, open issues if there are typos or you have a doubt. I decided to make this post because as a newbie I never fo

Sachit 1 Nov 01, 2021
Gaphor is the simple modeling tool

Gaphor Gaphor is a UML and SysML modeling application written in Python. It is designed to be easy to use, while still being powerful. Gaphor implemen

Gaphor 1.3k Jan 03, 2023
OpenAPI Spec validator

OpenAPI Spec validator About OpenAPI Spec Validator is a Python library that validates OpenAPI Specs against the OpenAPI 2.0 (aka Swagger) and OpenAPI

A 241 Jan 05, 2023
Gtech μLearn Sample_bot

Ser_bot Gtech μLearn Sample_bot Do Greet a newly joined member in a channel (random message) While adding a reaction to a message send a message to a

Jerin Paul 1 Jan 19, 2022
Main repository for the Sphinx documentation builder

Sphinx Sphinx is a tool that makes it easy to create intelligent and beautiful documentation for Python projects (or other documents consisting of mul

5.1k Jan 02, 2023
Pyoccur - Python package to operate on occurrences (duplicates) of elements in lists

pyoccur Python Occurrence Operations on Lists About Package A simple python package with 3 functions has_dup() get_dup() remove_dup() Currently the du

Ahamed Musthafa 6 Jan 07, 2023
This is a small project written to help build documentation for projects in less time.

Documentation-Builder This is a small project written to help build documentation for projects in less time. About This project builds documentation f

Tom Jebbo 2 Jan 17, 2022
Some custom tweaks to the results produced by pytkdocs.

pytkdocs_tweaks Some custom tweaks for pytkdocs. For use as part of the documentation-generation-for-Python stack that comprises mkdocs, mkdocs-materi

Patrick Kidger 4 Nov 24, 2022
Highlight Translator can help you translate the words quickly and accurately.

Highlight Translator can help you translate the words quickly and accurately. By only highlighting, copying, or screenshoting the content you want to translate anywhere on your computer (ex. PDF, PPT

Coolshan 48 Dec 21, 2022
🌱 Complete API wrapper of Seedr.cc

Python API Wrapper of Seedr.cc Table of Contents Installation How I got the API endpoints? Start Guide Getting Token Logging with Username and Passwor

Hemanta Pokharel 43 Dec 26, 2022
Quilt is a self-organizing data hub for S3

Quilt is a self-organizing data hub Python Quick start, tutorials If you have Python and an S3 bucket, you're ready to create versioned datasets with

Quilt Data 1.2k Dec 30, 2022
Comprehensive Python Cheatsheet

Comprehensive Python Cheatsheet Download text file, Buy PDF, Fork me on GitHub or Check out FAQ. Contents 1. Collections: List, Dictionary, Set, Tuple

Jefferson 1 Jan 23, 2022
Code and pre-trained models for "ReasonBert: Pre-trained to Reason with Distant Supervision", EMNLP'2021

ReasonBERT Code and pre-trained models for ReasonBert: Pre-trained to Reason with Distant Supervision, EMNLP'2021 Pretrained Models The pretrained mod

SunLab-OSU 29 Dec 19, 2022
NetBox plugin for BGP related objects documentation

Netbox BGP Plugin Netbox plugin for BGP related objects documentation. Compatibility This plugin in compatible with NetBox 2.10 and later. Installatio

Nikolay Yuzefovich 133 Dec 27, 2022
Course materials and handouts for #100DaysOfCode in Python course

#100DaysOfCode with Python course Course details page: talkpython.fm/100days Course Summary #100DaysOfCode in Python is your perfect companion to take

Talk Python 1.9k Dec 31, 2022