This repo contains a simple but effective tool made using python which can be used for quality control in statistical approach.

Overview

📈 Statistical Quality Control 📉

This repo contains a simple but effective tool made using python which can be used for quality control in statistical approach.

What is Statistical Quality Control?

  • statistical quality control is the use of statistical methods in the monitoring and maintaining of the quality of products and services. One method, referred to as acceptance sampling, can be used when a decision must be made to accept or reject a group of parts or items based on the quality found in a sample

  • Statistical quality control can be simply defined as an economic & effective system of maintaining & improving the quality of outputs throughout the whole operating process of specification, production & inspection based on continuous testing with random samples.

Why Statistical Quality Control?, what makes it important?

  • Statistical quality control techniques are extremely important for operating the estimable variations embedded in almost all manufacturing processes. Such variations arise due to raw material, consistency of product elements, processing machines, techniques deployed and packaging applications

  • SQC serves as a medium allowing manufacturers to attain maximum benefits by following controlled testing of manufactured products. Using this procedure, a manufacturing team can investigate the range of products with certain values that can be expected to reside under some existing conditions.

This statistical Quality Control can be easily implemented in python in few lines of code and graph can be beautifully visualized and analysed using matplotlib library.

For example lets consider a real life problem statement given like this:

  • A quality control inspector at the Cocoa Fizz soft drink company has taken ten samples with four observations each of the volume of bottles filled. The data and the computed means are shown in the table, use this information to develop control limits of three standard deviations for the bottling operation.

Data can be taken taken into an excel sheet like this:

After appending the data into excel sheet just hit run, statistical calculation will be done and you're greeted with this two graphs one is X-chat and the other one is R-chart.The x-bar and R-chart are quality control charts used to monitor the mean and variation of a process based on samples taken in a given time.X-bar chart: The mean or average change in process over time from subgroup values. The control limits on the X-Bar brings the sample’s mean and center into consideration.R-chart: The range of the process over the time from subgroups values. This monitors the spread of the process over the time.

Depending upon Data Graphs look like this:

(x-bar control chart)

(r-bar control chart)

From the both X bar and R charts it is clearly evident that the process is almost stable. If by chance the process is unstable that is there are many point in the outer region of quality control you make the process stable by changing the control limits,After the process stabilized, still if any point going out of control limits, it indicates an assignable cause exists in the process that needs to be addressed. This is an ongoing process to monitor the process performance.

Note:

  • Update data in excel before running the script, any number of rown and coloumns can be given.
  • Import used in this project are:
import pandas as pd 
import statistics
from statistics import mean,pstdev
import matplotlib.pyplot as plt
import numpy as np

make sure to install them before hand.

  • Code and logic is xplained in jupyter note book , do check that out
  • If you're interested more on this topic u can refer this PDF

Peace ✌️ .

Owner
SasiVatsal
open source enthusiast.🧑🏼‍💻 Just a teen interest in unix/linux 💻,android📱platforms, intermediate in python, js, c/c++.
SasiVatsal
Statsmodels: statistical modeling and econometrics in Python

About statsmodels statsmodels is a Python package that provides a complement to scipy for statistical computations including descriptive statistics an

statsmodels 8k Dec 29, 2022
A highly efficient and modular implementation of Gaussian Processes in PyTorch

GPyTorch GPyTorch is a Gaussian process library implemented using PyTorch. GPyTorch is designed for creating scalable, flexible, and modular Gaussian

3k Jan 02, 2023
Statistical package in Python based on Pandas

Pingouin is an open-source statistical package written in Python 3 and based mostly on Pandas and NumPy. Some of its main features are listed below. F

Raphael Vallat 1.2k Dec 31, 2022
PATC: Introduction to Big Data Analytics. Practical Data Analytics for Solving Real World Problems

PATC: Introduction to Big Data Analytics. Practical Data Analytics for Solving Real World Problems

1 Feb 07, 2022
Data exploration done quick.

Pandas Tab Implementation of Stata's tabulate command in Pandas for extremely easy to type one-way and two-way tabulations. Support: Python 3.7 and 3.

W.D. 20 Aug 27, 2022
A variant of LinUCB bandit algorithm with local differential privacy guarantee

Contents LDP LinUCB Description Model Architecture Dataset Environment Requirements Script Description Script and Sample Code Script Parameters Launch

Weiran Huang 4 Oct 25, 2022
Finding project directories in Python (data science) projects, just like there R rprojroot and here packages

Find relative paths from a project root directory Finding project directories in Python (data science) projects, just like there R here and rprojroot

Daniel Chen 102 Nov 16, 2022
Performance analysis of predictive (alpha) stock factors

Alphalens Alphalens is a Python Library for performance analysis of predictive (alpha) stock factors. Alphalens works great with the Zipline open sour

Quantopian, Inc. 2.5k Jan 09, 2023
This repository contains some analysis of possible nerdle answers

Nerdle Analysis https://nerdlegame.com/ This repository contains some analysis of possible nerdle answers. Here's a quick overview: nerdle.py contains

0 Dec 16, 2022
Pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).

AWS Data Wrangler Pandas on AWS Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretMana

Amazon Web Services - Labs 3.3k Jan 04, 2023
Falcon: Interactive Visual Analysis for Big Data

Falcon: Interactive Visual Analysis for Big Data Crossfilter millions of records without latencies. This project is work in progress and not documente

Vega 803 Dec 27, 2022
Accurately separate the TLD from the registered domain and subdomains of a URL, using the Public Suffix List.

tldextract Python Module tldextract accurately separates the gTLD or ccTLD (generic or country code top-level domain) from the registered domain and s

John Kurkowski 1.6k Jan 03, 2023
Vectorizers for a range of different data types

Vectorizers for a range of different data types

Tutte Institute for Mathematics and Computing 69 Dec 29, 2022
pyETT: Python library for Eleven VR Table Tennis data

pyETT: Python library for Eleven VR Table Tennis data Documentation Documentation for pyETT is located at https://pyett.readthedocs.io/. Installation

Tharsis Souza 5 Nov 19, 2022
Python script to automate the plotting and analysis of percentage depth dose and dose profile simulations in TOPAS.

topas-create-graphs A script to automatically plot the results of a topas simulation Works for percentage depth dose (pdd) and dose profiles (dp). Dep

Sebastian Schäfer 10 Dec 08, 2022
PostQF is a user-friendly Postfix queue data filter which operates on data produced by postqueue -j.

PostQF Copyright © 2022 Ralph Seichter PostQF is a user-friendly Postfix queue data filter which operates on data produced by postqueue -j. See the ma

Ralph Seichter 11 Nov 24, 2022
Binance Kline Data With Python

Binance Kline Data by seunghan(gingerthorp) reference https://github.com/binance/binance-public-data/ All intervals are supported: 1m, 3m, 5m, 15m, 30

shquant 5 Jul 13, 2022
Analyzing Covid-19 Outbreaks in Ontario

My group and I took Covid-19 outbreak statistics from ontario, and analyzed them to find different patterns and future predictions for the virus

Vishwaajeeth Kamalakkannan 0 Jan 20, 2022
PandaPy has the speed of NumPy and the usability of Pandas 10x to 50x faster (by @firmai)

PandaPy "I came across PandaPy last week and have already used it in my current project. It is a fascinating Python library with a lot of potential to

Derek Snow 527 Jan 02, 2023
Statistical & Probabilistic Analysis of Store Sales, University Survey, & Manufacturing data

Statistical_Modelling Statistical & Probabilistic Analysis of Store Sales, University Survey, & Manufacturing data Statistical Methods for Decision Ma

Avnika Mehta 1 Jan 27, 2022