Complete portable pipeline for masking of Aadhaar Number adhering to Govt. Privacy Guidelines.

Overview

Aadhaar Number Masking Pipeline

Implementation of a complete pipeline that masks the Aadhaar Number in given images to adhere to Govt. of India's Privacy Guidelines for storage of Aadhaar Card images in digital repository. The following project was carried out as an internship for Muthoot Finance. We make use of the open source packages CRAFT text detector | Paper | Pretrained Model | Github Repo provided by Clova AI Research for OSD and combine a heurestic model with pytesseract OCR for masking.

Rohit Ranjan, Ram Sundaram.

Sample Results

teaser

Versions

The search for the best masking pipeline led us to experiment with several different approaches. We have documented our experiments in other branches.

Branch(->model) Speed/Performance Pipeline
main Best performing CRAFT + pytesseract + dimensional heuristics
CNN_OCR->cnn_model Fastest masking CRAFT + LeNet trained by us
CNN_OCR->cnn_model_2 Fastest masking CRAFT + LeNet trained by us
UNET_OCDR Theoretically Fastest but trained model unavailable** UNet

**We proposed and implemented a pipeline which uses a single UNet model for achieving a desirable mask. A single model would have made the inference very fast and real time use capable on mobile devices. Training meant creating a dataset since the company could not legally provide us the needed data. After several trials, we halted work on this model because with barely 150 unique datapoints available, a data hungry UNet Model is simply unsatiable for now.

Datasets

Lenet was trained on our self-created labelled dataset | labels.

Getting started

Install dependencies

Requirements

  • torch
  • opencv-python
  • tesseract-ocr
  • check requirements.txt
pip install -r requirements.txt

Test instructions

  • Clone this repository
git clone https://github.com/thefurorjuror/Aadhaar_Masker.git
  • Run on an image folder
python [folder path to the cloned repo]/masker.py --test_folder=[folder path to test images] --output_folder=[folder path to output images] --cuda=[True/False]
#Example- When one is inside the cloned repo
python masker.py --test_folder=./images/ --output_folder=./output/ --cuda=True

cuda is set to False by default.

Citation

@inproceedings{baek2019character,
  title={Character Region Awareness for Text Detection},
  author={Baek, Youngmin and Lee, Bado and Han, Dongyoon and Yun, Sangdoo and Lee, Hwalsuk},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  pages={9365--9374},
  year={2019}
}

License

Copyright (c) 2021-present Rohit Ranjan & Ram Sundaram.

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
A simple Python app to provide RPC for iTunes and the Music app. MacOS exclusive.

Ongaku You know, ongaku. A port of Ongaku to Python. Why? I don't know. A simple application providing the now playing state from iTunes (or the Music

Deltaion Lee 4 Oct 22, 2022
A discord bot that can detect Nitro Scam Links and delete them to protect other users

A discord bot that can detect Nitro Scam Links and delete them to protect other users. Add it to your server from here.

Kanak Mittal 9 Oct 20, 2022
Currency And Gold Prices - Currency And Gold Prices For Python

Currency_And_Gold_Prices Photos from the app New Update Show range Change better

Ali HemmatNia 4 Sep 19, 2022
Python Wrapper for handling payment requests through the Daraja MPESA API

Python Daraja Description Python Wrapper for handling payment requests through the Daraja MPESA API Contribution Refer to the CONTRIBUTING GUIDE. Usag

William Otieno 18 Dec 14, 2022
iCloudPy is a simple iCloud webservices wrapper library written in Python

iCloudPy 🤟 Please star this repository if you end up using the library. It will help me continue supporting this product. 🙏 iCloudPy is a simple iCl

Mandar Patil 49 Dec 26, 2022
Unofficial GoPro API Library for Python - connect to GoPro via WiFi.

GoPro API for Python Unofficial GoPro API Library for Python - connect to GoPro cameras via WiFi. Compatibility: HERO3 HERO3+ HERO4 (including HERO Se

Konrad Iturbe 1.3k Jan 01, 2023
A little proxy tool based on Tencent Cloud Function Service.

SCFProxy 一个基于腾讯云函数服务的免费代理池。 安装 python3 -m venv .venv source .venv/bin/activate pip3 install -r requirements.txt 项目配置 函数配置 开通腾讯云函数服务 在 函数服务 新建 中使用自定义

Mio 716 Dec 26, 2022
Translates English into Mandalorian (Mando'a) utilizing a "funtranslations" free API

Mandalorian Translator Translates English into Mandalorian (Mando'a) utilizing a "funtranslations" free API About I created this app to experiment wit

Hayden Covington 1 Dec 04, 2021
aws-lambda-scheduler lets you call any existing AWS Lambda Function you have in a future time.

aws-lambda-scheduler aws-lambda-scheduler lets you call any existing AWS Lambda Function you have in the future. This functionality is achieved by dyn

Oğuzhan Yılmaz 57 Dec 17, 2022
An advanced Filter Bot with nearly unlimitted filters!

Unlimited Filter Bot ㅤㅤㅤㅤㅤㅤㅤ ㅤㅤㅤㅤㅤㅤㅤ An advanced Filter Bot with nearly unlimitted filters! Features Nearly unlimited filters Supports all type of fil

1 Nov 20, 2021
📈 A Discord bot for displaying the download stats of a repository made with Python, the Hikari API and PostgreSQL.

📈 axyl-stats axyl-stats is a Discord bot made with Python (with the Hikari API wrapper) and PostgreSQL, used as a download counter for a GitHub repo.

Angelo-F 2 May 14, 2022
NFT Generator - A NFT Generator created using Python

NFT_Generator v1 An NFT Generator created using Python. This NFT Generation tool

3 Dec 02, 2022
The best (and now open source) Discord selfbot.

React Selfbot Yes, for real Why am I making this open source? Because can't stop calling my product a rat, tokenlogger and what else not. But there is

30 Nov 13, 2022
gnosis safe tx builder

Ape Safe: Gnosis Safe tx builder Ape Safe allows you to iteratively build complex multi-step Gnosis Safe transactions and safely preview their side ef

228 Dec 22, 2022
Info & tools for reverse engineering the M6 smart fitness band

m6-reveng This repo contains information and tools for reverse engineering the $7 M6 smart fitness band. Hardware The SoC (system-on-a-chip) is a Teli

41 Dec 26, 2022
Aula-API - a school system widely used in Denmark, as you can see and read about in the python file

Information : Hello, thank you for reading this first of all. This is a Aula-API

Binary.club 2 May 28, 2022
Simple script to ban bots at Twitch chats using a text file as a source.

AUTOBAN 🇺🇸 English version Simple script to ban bots at Twitch chats using a text file as a source. How to use Windows Go to releases for further in

And Paiva 5 Feb 06, 2022
Please Do Not Throw Sausage Pizza Away - Side Scrolling Up The OSI Stack

Please Do Not Throw Sausage Pizza Away - Side Scrolling Up The OSI Stack

John Capobianco 2 Jan 25, 2022
企业微信消息推送的python封装接口,让你轻松用python实现对企业微信的消息推送

👋 corpwechat-bot是一个python封装的企业机器人&应用消息推送库,通过企业微信提供的api实现。 利用本库,你可以轻松地实现从服务器端发送一条文本、图片、视频、markdown等等消息到你的微信手机端,而不依赖于其他的第三方应用,如ServerChan。 如果喜欢该项目,记得给个

Chaopeng 161 Jan 06, 2023