About Library for extract infomation from thai personal identity card.

Overview

ThaiPersonalCardExtract

Downloads PyPI Status license Instragram

Library for extract infomation from thai personal identity card. imprement from easyocr and tesseract

New Feature v1.3.2 🎁

  • Increase performance.
  • Support Thai Government Lottery สกัดข้อมูลจากลอตเตอร์รี่ ใช้ได้ดีกับรูปภาพที่ได้จากเครื่องแสกน (16 Aug. 2021)
  • Refactor Output Structure.
  • Support Thai Driving License (Beta) สามารถสกัดข้อมูลจากภาพถ่ายใบขับขี่ได้บางรูปแบบ เนื่องจาก กรมทางขนส่งทางบก มีรูปแบบบัตรหลากหลายรูปแบบ และแต่ละรูปแบบมีตำแหน่งข้อมูลที่แตกต่างกัน จึงทำให้ประสิทธิภาพต่ำ

Examples

Example image file.

Real image file Real image file Real image file

wrapPerpective image crop.

wrapPerpective image crop wrapPerpective image crop

keypoint of image detected.

keypoint of image detected

Resutls of library extract region of interest

Identification Number

FullNameTH

NameEN

LastNameEN

BirthdayTH

BirthdayEN

Religion

Address

DateOfIssueTH

DateOfIssueEN

DateOfExpiryTH

DateOfExpiryEN

Recommend

  • Image quality lowest should be 600x350
  • Images with minimal reflections should be used. for good results
  • Identity Card should be size in the image about 75%, if the image doesn't cropped that to be left only Identity Card area.
  • For faster, please resize image and usage CUDA GPU.

Installation

Install using pip for stable release,

pip install thai-personal-card-extract

For latest development release,

pip install git+git://github.com/ggafiled/ThaiPersonalCardExtrac.git

Note 1: for Windows, please install tesseract first by following the official instruction here https://medium.com/@navapat.tpb/734dae2fb4d3 On medium website, be sure to setup already.

Note 2: for Linux os, please install tesseract by following the official instruction https://github.com/tesseract-ocr/tesseract

Usage

# With build-in Config Options. 

import ThaiPersonalCardExtract as card
reader = card.PersonalCard(
    lang=card.THAI,
    provider=card.DEFAULT,
    tesseract_cmd="D:/Program Files/Tesseract-OCR/tesseract",
    save_extract_result=True,
    path_to_save="D:/dev/ThaiPersonalCardExtract/examples/extract")
result = reader.extractInfo('examples/card.jpg')
print(result)


# With free-style ตัวอย่างการเรียกใช้งานคลาส PersonalCard เพื่อสกัดข้อมูลบัตรประจำตัวประชาชน 

from ThaiPersonalCardExtract import PersonalCard
reader = PersonalCard(lang="mix", tesseract_cmd="D:/Program Files/Tesseract-OCR/tesseract") # for windows need to pass tesseract_cmd parameter to setup your tesseract command path.
result = reader.extractInfo('examples/card.jpg')
print(result)


# With free-style ตัวอย่างการเรียกใช้งานคลาส DrivingLicense เพื่อสกัดข้อมูลใบอนุญาตขับขี่

from ThaiPersonalCardExtract import DrivingLicense
reader = DrivingLicense(lang="mix", tesseract_cmd="D:/Program Files/Tesseract-OCR/tesseract") # for windows need to pass tesseract_cmd parameter to setup your tesseract command path.
result = reader.extractInfo('examples/card.jpg')
print(result)


# With free-style ตัวอย่างการเรียกใช้งานคลาส ThaiGovernmentLottery เพื่อสกัดข้อมูลลอตเตอร์รี่

from ThaiPersonalCardExtract import ThaiGovernmentLottery
reader = ThaiGovernmentLottery(save_extract_result=True, path_to_save="D:/dev/ThaiPersonalCardExtract/examples/extract/thai_government_lottery") # for windows need to pass tesseract_cmd parameter to setup your tesseract command path.
result = reader.extractInfo("../examples/card7.jpg")
print(result)

Output will be in list format, each item represents result of library can extract, respectively. type of namedtuple ผลลัพธ์ที่ได้จะเป็นประเภท namedtuple สามารถศึกษาเพิ่มเติมเพื่อใช้งานได้จากที่นี่ คลิก

#Output of PersonalCard
    Card(Identification_Number='9999999999999', FullNameTH='นาย อายุมฺมุราเสะ', PrefixTH='นาย', NameTH='อายุมฺมุราเสะ', LastNameTH='อายุมฺมุราเสะ', PrefixEN='.Mr.Shoyo', NameEN='', LastNameEN='Hinatao', BirthdayTH='21 มี.ย. 2539', BirthdayEN='21 Jun..1996', Religion='พุทธ', Address='ท8ปฺ` 99/1 มิซีโฮะ เขตฮานามิกาวา อำเภอชิบ', DateOfIssueTH='11 ส.ค. 2554', DateOfIssueEN='11 Ang. 2021', DateOfExpiryTH='11 ส.ค. 2574', DateOfExpiryEN='11 Aug. 2031,')

#Output of DrivingLicense
    Card(License_Number='98765432', IssueDateTH='ผังทาทม', ExpiryDateTH='', IssueDateEN='14 August 2664', ExpiryDateEN='14 August 2574', NameTH='า? โนบกะ โนบี', NameEN='MRONOREAUMANE', BirthDayTH='', BirthDayEN='wa hs OKRA', Identity_Number='', Province='นคาราชศีมา')

#Output of ThaiGovernmentLottery
    Lottery(LotteryNumber='424603', LessonNumber='08', SetNumber='23', Year='2564') #type namedtuple 
    
 สามารถเข้าถึงตัวแปรได้ตามรูปแบบนี้
 print(result.LotteryNumber)
 print(result.LessonNumber)
 print(result.SetNumber)
 print(result.Year)

For set lang attribute to tha

from ThaiPersonalCardExtract import PersonalCard
reader = PersonalCard(lang="tha", tesseract_cmd="D:/Program Files/Tesseract-OCR/tesseract") # for windows need to pass tesseract_cmd parameter to setup your tesseract command path.
result = reader.extractInfo('examples/card.jpg')
print(result)

Output will be in list format, each item represents result of library can extract, respectively.

{
   "Identification_Number": "9999999999999",
   "FullNameTH": "นาย อายุมฺมุราเสะ",
   "PrefixTH": "นาย",
   "NameTH": "อายุมฺมุราเสะ",
   "LastNameTH": "อายุมฺมุราเสะ",
   "BirthdayTH": "21 มี.ย. 2539",
   "Religion": "พุทธ",
   "Address": "ท๒ 99/1 มิชีโฮะ เขตฮานามิกาวา อำเภอชิบ;",
   "DateOfIssueTH": "11 ส.ค. 2554",
   "DateOfExpiryTH": "11 ส.ค. 2574"
}

And you can set ocr provider following below default #used both easyocr and tesseract **Recommend Or easyocr Or tesseract

from ThaiPersonalCardExtract import PersonalCard
reader = PersonalCard(lang="tha", provider="default", tesseract_cmd="D:/Program Files/Tesseract-OCR/tesseract") # for windows need to pass tesseract_cmd parameter to setup your tesseract command path.
result = reader.extractInfo('examples/card.jpg')
print(result)

Config Options

you can set options to Instance by below keyword

Parameter name Value Type Example
lang String Expected Results Language bash mix #get all area both tha and eng Or bash tha Or bash eng *Default is 'mix' สำหรับ DrivingLicense, PersonalCard
provider String OCR Provider have bash default #used both easyocr and tesseract **Recommend Or bash easyocr Or bash tesseract *Default is 'default' สำหรับ DrivingLicense, PersonalCard
template_threshold Double Rate to cals similarity of template *Default is 0.7
sift_rate Int Feature Keypoint rate *Default is 25,000
tesseract_cmd String Path of your tesseract command **For windows only.
save_extract_result Boolean Set True if you want to save extracted image *Default is False
path_to_save String Path that you given it save extracted image, relative with save_extract_result=True

Donate Me

promptpay

Mr.Nattapol Krobklang

You might also like...
extract gene TSS/TES site form gencode/ensembl/gencode database GTF file and export bed format file.

GetTsite python Package extract gene TSS/TES site form gencode/ensembl/gencode database GTF file and export bed format file. Install $ pip install Get

A functional standard library for Python.

Toolz A set of utility functions for iterators, functions, and dictionaries. See the PyToolz documentation at https://toolz.readthedocs.io LICENSE New

🔩 Like builtins, but boltons. 250+ constructs, recipes, and snippets which extend (and rely on nothing but) the Python standard library. Nothing like Michael Bolton.

Boltons boltons should be builtins. Boltons is a set of over 230 BSD-licensed, pure-Python utilities in the same spirit as — and yet conspicuously mis

Retrying library for Python

Tenacity Tenacity is an Apache 2.0 licensed general-purpose retrying library, written in Python, to simplify the task of adding retry behavior to just

Retrying is an Apache 2.0 licensed general-purpose retrying library, written in Python, to simplify the task of adding retry behavior to just about anything.

Retrying Retrying is an Apache 2.0 licensed general-purpose retrying library, written in Python, to simplify the task of adding retry behavior to just

isort is a Python utility / library to sort imports alphabetically, and automatically separated into sections and by type.
isort is a Python utility / library to sort imports alphabetically, and automatically separated into sections and by type.

isort is a Python utility / library to sort imports alphabetically, and automatically separated into sections and by type. It provides a command line utility, Python library and plugins for various editors to quickly sort all your imports.

A Python library for reading, writing and visualizing the OMEGA Format
A Python library for reading, writing and visualizing the OMEGA Format

A Python library for reading, writing and visualizing the OMEGA Format, targeted towards storing reference and perception data in the automotive context on an object list basis with a focus on an urban use case.

RapidFuzz is a fast string matching library for Python and C++

RapidFuzz is a fast string matching library for Python and C++, which is using the string similarity calculations from FuzzyWuzzy

pydsinternals - A Python native library containing necessary classes, functions and structures to interact with Windows Active Directory.
pydsinternals - A Python native library containing necessary classes, functions and structures to interact with Windows Active Directory.

pydsinternals - Directory Services Internals Library A Python native library containing necessary classes, functions and structures to interact with W

Comments
  • 'PersonalCard' object has no attribute 'extractInfo'

    'PersonalCard' object has no attribute 'extractInfo'


    AttributeError Traceback (most recent call last) /var/folders/cc/d15mjmhn5c5fscqfwq3fql5h0000gp/T/ipykernel_6920/1182933257.py in ----> 1 result = reader.extractInfo('ThaiPersonalCardExtract/examples/extract/image_scan.jpg') 2 print(result)

    AttributeError: 'PersonalCard' object has no attribute 'extractInfo'

    opened by suwika 0
Releases(v1.3.4)
  • v1.3.4(Sep 2, 2021)

    New Feature v1.3.4 🎁

    • Support Thai identity card laser code extract. (02 Sep. 2021)
    • Fix bug dataset folder not import thai_government_lottery resource. (23 Aug. 2021) #1
    • Increase performance.
    • Support Thai Government Lottery สกัดข้อมูลจากลอตเตอร์รี่ ใช้ได้ดีกับรูปภาพที่ได้จากเครื่องแสกน (16 Aug. 2021)
    • Refactor Output Structure.
    • Support Thai Driving License (Beta) สามารถสกัดข้อมูลจากภาพถ่ายใบขับขี่ได้บางรูปแบบ เนื่องจาก กรมทางขนส่งทางบก มีรูปแบบบัตรหลากหลายรูปแบบ และแต่ละรูปแบบมีตำแหน่งข้อมูลที่แตกต่างกัน จึงทำให้ประสิทธิภาพต่ำ
    Source code(tar.gz)
    Source code(zip)
  • v1.3.1(Aug 16, 2021)

    New Feature v1.3.1 🎁 Increase performance. Support Thai Driving License (Beta) สามารถสกัดข้อมูลจากภาพถ่ายใบขับขี่ได้บางรูปแบบ เนื่องจาก กรมทางขนส่งทางบก มีรูปแบบบัตรหลากหลายรูปแบบ และแต่ละรูปแบบมีตำแหน่งข้อมูลที่แตกต่างกัน จึงทำให้ประสิทธิภาพต่ำ Support Thai Government Lottery (16 Aug. 2021)

    Source code(tar.gz)
    Source code(zip)
  • v1.3.0(Aug 14, 2021)

    Increase performance. Support Thai Driving License (Beta) สามารถสกัดข้อมูลจากภาพถ่ายใบขับขี่ได้บางรูปแบบ เนื่องจาก กรมทางขนส่งทางบก มีรูปแบบบัตรหลากหลายรูปแบบ และแต่ละรูปแบบมีตำแหน่งข้อมูลที่แตกต่างกัน จึงทำให้ประสิทธิภาพต่ำ ปรับเปลี่ยนรูปแบบไฟล์ระบบ

    Source code(tar.gz)
    Source code(zip)
  • v1.2.1(Aug 13, 2021)

    New Feature 🎁

    • More arae extract.
    • lang : attribute : get only area of given language.
    • provider : attribute : set ocr provider now support easyocr and tesseract.
    Source code(tar.gz)
    Source code(zip)
  • v1.0-beta(Aug 11, 2021)

Owner
ggafiled
นานๆที อัพ ครับบบ.
ggafiled
✨ Un juste prix totalement fait en Python par moi, et en français.

Juste Prix ❗ Un juste prix totalement fait en Python par moi, et en français. 🔮 Avec l'utilisation du module "random", j'ai pu faire un choix aléatoi

MrGabin 3 Jun 06, 2021
A pythonic dependency injection library.

Pinject Pinject is a dependency injection library for python. The primary goal of Pinject is to help you assemble objects into graphs in an easy, main

Google 1.3k Dec 30, 2022
The Black shade analyser and comparison tool.

diff-shades The Black shade analyser and comparison tool. AKA Richard's personal take at a better black-primer (by stealing ideas from mypy-primer) :p

Richard Si 10 Apr 29, 2022
A quick random name generator

Random Profile Generator USAGE & CREDITS Any public or priavte demonstrative usage of this project is strictly prohibited, UNLESS WhineyMonkey10 (http

2 May 05, 2022
API for obtaining results from the Beery-Bukenica test of the visomotor integration development (VMI) 4th edition.

VMI API API for obtaining results from the Beery-Bukenica test of the visomotor integration development (VMI) 4th edition. Install docker-compose up -

Victor Vargas Sandoval 1 Oct 26, 2021
Casefy (/keɪsfaɪ/) is a lightweight Python package to convert the casing of strings

Casefy (/keɪsfaɪ/) is a lightweight Python package to convert the casing of strings. It has no third-party dependencies and supports Unicode.

Diego Miguel Lozano 12 Jan 08, 2023
🍰 ConnectMP - An easy and efficient way to share data between Processes in Python.

ConnectMP - Taking Multi-Process Data Sharing to the moon 🚀 Contribute · Community · Documentation 🎫 Introduction : 🍤 ConnectMP is the easiest and

Aiden Ellis 1 Dec 24, 2021
Retrying library for Python

Tenacity Tenacity is an Apache 2.0 licensed general-purpose retrying library, written in Python, to simplify the task of adding retry behavior to just

Julien Danjou 4.3k Jan 05, 2023
About Library for extract infomation from thai personal identity card.

ThaiPersonalCardExtract Library for extract infomation from thai personal identity card. imprement from easyocr and tesseract New Feature v1.3.2 🎁 In

ggafiled 26 Nov 15, 2022
A small utility that sorts your files.

FileSorter A small utility that sorts your files. TODO: Scan directory to find files(thanks @corruptmemry for this!) Split extensions to determine fil

2 Jun 16, 2022
Entropy-controlled contexts in Python

Python module ordered ordered module is the opposite to random - it maintains order in the program. import random x = 5 def increase(): global x

HyperC 36 Nov 03, 2022
A sys-botbase client for remote control automation of Nintendo Switch consoles. Based on SysBot.NET, written in python.

SysBot.py A sys-botbase client for remote control automation of Nintendo Switch consoles. Based on SysBot.NET, written in python. Setup: Download the

7 Dec 16, 2022
jsoooooooon derulo - Make sure your 'jason derulo' is featured as the first part of your json data

jsonderulo Make sure your 'jason derulo' is featured as the first part of your json data Install: # python pip install jsonderulo poetry add jsonderul

jesse 3 Sep 13, 2021
This utility lets you draw using your laptop's touchpad on Linux.

FingerPaint This utility lets you draw using your laptop's touchpad on Linux. Pressing any key or clicking the touchpad will finish the drawing

Wazzaps 95 Dec 17, 2022
Dynamic key remapper for Wayland Window System, especially for Sway

wayremap Dynamic keyboard remapper for Wayland. It works on both X Window Manager and Wayland, but focused on Wayland as it intercepts evdev input and

Kay Gosho 50 Nov 29, 2022
Shypan, a simple, easy to use, full-featured library written in Python.

Shypan, a simple, easy to use, full-featured library written in Python.

ShypanLib 4 Dec 08, 2021
An OData v4 query parser and transpiler for Python

odata-query is a library that parses OData v4 filter strings, and can convert them to other forms such as Django Queries, SQLAlchemy Queries, or just plain SQL.

Gorilla 39 Jan 05, 2023
一款不需要买代理来减少扫网站目录被封概率的扫描器,适用于中小规格字典。

PoorScanner使用说明书 -工具在不同环境下可能不怎么稳定,如果有什么问题恳请大家反馈。说明书有什么错误的地方也大家欢迎指正。 更新记录 2021.8.23 修复了云函数主程序 gitee上传文件接口写错了的BUG(之前把自己的上传地址写死进去了,没从配置文件里读) 更新了说明书 PoorS

14 Aug 02, 2022
✨ Un pierre feuille ciseaux totalement fait en Python par moi, et en français.

Pierre Feuille Ciseaux ❗ Un pierre feuille ciseaux totalement fait en Python par moi. 🔮 Avec l'utilisation du module "random", j'ai pu faire un choix

MrGabin 3 Jun 06, 2021
Basic loader is a small tool that will help you generating Cloudflare cookies

Basic Loader Cloudflare cookies loader This tool may help some people getting valide cloudflare cookies Installation 🔌 : pip install -r requirements.

IHateTomLrge 8 Mar 30, 2022