About Library for extract infomation from thai personal identity card.

Overview

ThaiPersonalCardExtract

Downloads PyPI Status license Instragram

Library for extract infomation from thai personal identity card. imprement from easyocr and tesseract

New Feature v1.3.2 🎁

  • Increase performance.
  • Support Thai Government Lottery สกัดข้อมูลจากลอตเตอร์รี่ ใช้ได้ดีกับรูปภาพที่ได้จากเครื่องแสกน (16 Aug. 2021)
  • Refactor Output Structure.
  • Support Thai Driving License (Beta) สามารถสกัดข้อมูลจากภาพถ่ายใบขับขี่ได้บางรูปแบบ เนื่องจาก กรมทางขนส่งทางบก มีรูปแบบบัตรหลากหลายรูปแบบ และแต่ละรูปแบบมีตำแหน่งข้อมูลที่แตกต่างกัน จึงทำให้ประสิทธิภาพต่ำ

Examples

Example image file.

Real image file Real image file Real image file

wrapPerpective image crop.

wrapPerpective image crop wrapPerpective image crop

keypoint of image detected.

keypoint of image detected

Resutls of library extract region of interest

Identification Number

FullNameTH

NameEN

LastNameEN

BirthdayTH

BirthdayEN

Religion

Address

DateOfIssueTH

DateOfIssueEN

DateOfExpiryTH

DateOfExpiryEN

Recommend

  • Image quality lowest should be 600x350
  • Images with minimal reflections should be used. for good results
  • Identity Card should be size in the image about 75%, if the image doesn't cropped that to be left only Identity Card area.
  • For faster, please resize image and usage CUDA GPU.

Installation

Install using pip for stable release,

pip install thai-personal-card-extract

For latest development release,

pip install git+git://github.com/ggafiled/ThaiPersonalCardExtrac.git

Note 1: for Windows, please install tesseract first by following the official instruction here https://medium.com/@navapat.tpb/734dae2fb4d3 On medium website, be sure to setup already.

Note 2: for Linux os, please install tesseract by following the official instruction https://github.com/tesseract-ocr/tesseract

Usage

# With build-in Config Options. 

import ThaiPersonalCardExtract as card
reader = card.PersonalCard(
    lang=card.THAI,
    provider=card.DEFAULT,
    tesseract_cmd="D:/Program Files/Tesseract-OCR/tesseract",
    save_extract_result=True,
    path_to_save="D:/dev/ThaiPersonalCardExtract/examples/extract")
result = reader.extractInfo('examples/card.jpg')
print(result)


# With free-style ตัวอย่างการเรียกใช้งานคลาส PersonalCard เพื่อสกัดข้อมูลบัตรประจำตัวประชาชน 

from ThaiPersonalCardExtract import PersonalCard
reader = PersonalCard(lang="mix", tesseract_cmd="D:/Program Files/Tesseract-OCR/tesseract") # for windows need to pass tesseract_cmd parameter to setup your tesseract command path.
result = reader.extractInfo('examples/card.jpg')
print(result)


# With free-style ตัวอย่างการเรียกใช้งานคลาส DrivingLicense เพื่อสกัดข้อมูลใบอนุญาตขับขี่

from ThaiPersonalCardExtract import DrivingLicense
reader = DrivingLicense(lang="mix", tesseract_cmd="D:/Program Files/Tesseract-OCR/tesseract") # for windows need to pass tesseract_cmd parameter to setup your tesseract command path.
result = reader.extractInfo('examples/card.jpg')
print(result)


# With free-style ตัวอย่างการเรียกใช้งานคลาส ThaiGovernmentLottery เพื่อสกัดข้อมูลลอตเตอร์รี่

from ThaiPersonalCardExtract import ThaiGovernmentLottery
reader = ThaiGovernmentLottery(save_extract_result=True, path_to_save="D:/dev/ThaiPersonalCardExtract/examples/extract/thai_government_lottery") # for windows need to pass tesseract_cmd parameter to setup your tesseract command path.
result = reader.extractInfo("../examples/card7.jpg")
print(result)

Output will be in list format, each item represents result of library can extract, respectively. type of namedtuple ผลลัพธ์ที่ได้จะเป็นประเภท namedtuple สามารถศึกษาเพิ่มเติมเพื่อใช้งานได้จากที่นี่ คลิก

#Output of PersonalCard
    Card(Identification_Number='9999999999999', FullNameTH='นาย อายุมฺมุราเสะ', PrefixTH='นาย', NameTH='อายุมฺมุราเสะ', LastNameTH='อายุมฺมุราเสะ', PrefixEN='.Mr.Shoyo', NameEN='', LastNameEN='Hinatao', BirthdayTH='21 มี.ย. 2539', BirthdayEN='21 Jun..1996', Religion='พุทธ', Address='ท8ปฺ` 99/1 มิซีโฮะ เขตฮานามิกาวา อำเภอชิบ', DateOfIssueTH='11 ส.ค. 2554', DateOfIssueEN='11 Ang. 2021', DateOfExpiryTH='11 ส.ค. 2574', DateOfExpiryEN='11 Aug. 2031,')

#Output of DrivingLicense
    Card(License_Number='98765432', IssueDateTH='ผังทาทม', ExpiryDateTH='', IssueDateEN='14 August 2664', ExpiryDateEN='14 August 2574', NameTH='า? โนบกะ โนบี', NameEN='MRONOREAUMANE', BirthDayTH='', BirthDayEN='wa hs OKRA', Identity_Number='', Province='นคาราชศีมา')

#Output of ThaiGovernmentLottery
    Lottery(LotteryNumber='424603', LessonNumber='08', SetNumber='23', Year='2564') #type namedtuple 
    
 สามารถเข้าถึงตัวแปรได้ตามรูปแบบนี้
 print(result.LotteryNumber)
 print(result.LessonNumber)
 print(result.SetNumber)
 print(result.Year)

For set lang attribute to tha

from ThaiPersonalCardExtract import PersonalCard
reader = PersonalCard(lang="tha", tesseract_cmd="D:/Program Files/Tesseract-OCR/tesseract") # for windows need to pass tesseract_cmd parameter to setup your tesseract command path.
result = reader.extractInfo('examples/card.jpg')
print(result)

Output will be in list format, each item represents result of library can extract, respectively.

{
   "Identification_Number": "9999999999999",
   "FullNameTH": "นาย อายุมฺมุราเสะ",
   "PrefixTH": "นาย",
   "NameTH": "อายุมฺมุราเสะ",
   "LastNameTH": "อายุมฺมุราเสะ",
   "BirthdayTH": "21 มี.ย. 2539",
   "Religion": "พุทธ",
   "Address": "ท๒ 99/1 มิชีโฮะ เขตฮานามิกาวา อำเภอชิบ;",
   "DateOfIssueTH": "11 ส.ค. 2554",
   "DateOfExpiryTH": "11 ส.ค. 2574"
}

And you can set ocr provider following below default #used both easyocr and tesseract **Recommend Or easyocr Or tesseract

from ThaiPersonalCardExtract import PersonalCard
reader = PersonalCard(lang="tha", provider="default", tesseract_cmd="D:/Program Files/Tesseract-OCR/tesseract") # for windows need to pass tesseract_cmd parameter to setup your tesseract command path.
result = reader.extractInfo('examples/card.jpg')
print(result)

Config Options

you can set options to Instance by below keyword

Parameter name Value Type Example
lang String Expected Results Language bash mix #get all area both tha and eng Or bash tha Or bash eng *Default is 'mix' สำหรับ DrivingLicense, PersonalCard
provider String OCR Provider have bash default #used both easyocr and tesseract **Recommend Or bash easyocr Or bash tesseract *Default is 'default' สำหรับ DrivingLicense, PersonalCard
template_threshold Double Rate to cals similarity of template *Default is 0.7
sift_rate Int Feature Keypoint rate *Default is 25,000
tesseract_cmd String Path of your tesseract command **For windows only.
save_extract_result Boolean Set True if you want to save extracted image *Default is False
path_to_save String Path that you given it save extracted image, relative with save_extract_result=True

Donate Me

promptpay

Mr.Nattapol Krobklang

You might also like...
extract gene TSS/TES site form gencode/ensembl/gencode database GTF file and export bed format file.

GetTsite python Package extract gene TSS/TES site form gencode/ensembl/gencode database GTF file and export bed format file. Install $ pip install Get

A functional standard library for Python.

Toolz A set of utility functions for iterators, functions, and dictionaries. See the PyToolz documentation at https://toolz.readthedocs.io LICENSE New

🔩 Like builtins, but boltons. 250+ constructs, recipes, and snippets which extend (and rely on nothing but) the Python standard library. Nothing like Michael Bolton.

Boltons boltons should be builtins. Boltons is a set of over 230 BSD-licensed, pure-Python utilities in the same spirit as — and yet conspicuously mis

Retrying library for Python

Tenacity Tenacity is an Apache 2.0 licensed general-purpose retrying library, written in Python, to simplify the task of adding retry behavior to just

Retrying is an Apache 2.0 licensed general-purpose retrying library, written in Python, to simplify the task of adding retry behavior to just about anything.

Retrying Retrying is an Apache 2.0 licensed general-purpose retrying library, written in Python, to simplify the task of adding retry behavior to just

isort is a Python utility / library to sort imports alphabetically, and automatically separated into sections and by type.
isort is a Python utility / library to sort imports alphabetically, and automatically separated into sections and by type.

isort is a Python utility / library to sort imports alphabetically, and automatically separated into sections and by type. It provides a command line utility, Python library and plugins for various editors to quickly sort all your imports.

A Python library for reading, writing and visualizing the OMEGA Format
A Python library for reading, writing and visualizing the OMEGA Format

A Python library for reading, writing and visualizing the OMEGA Format, targeted towards storing reference and perception data in the automotive context on an object list basis with a focus on an urban use case.

RapidFuzz is a fast string matching library for Python and C++

RapidFuzz is a fast string matching library for Python and C++, which is using the string similarity calculations from FuzzyWuzzy

pydsinternals - A Python native library containing necessary classes, functions and structures to interact with Windows Active Directory.
pydsinternals - A Python native library containing necessary classes, functions and structures to interact with Windows Active Directory.

pydsinternals - Directory Services Internals Library A Python native library containing necessary classes, functions and structures to interact with W

Comments
  • 'PersonalCard' object has no attribute 'extractInfo'

    'PersonalCard' object has no attribute 'extractInfo'


    AttributeError Traceback (most recent call last) /var/folders/cc/d15mjmhn5c5fscqfwq3fql5h0000gp/T/ipykernel_6920/1182933257.py in ----> 1 result = reader.extractInfo('ThaiPersonalCardExtract/examples/extract/image_scan.jpg') 2 print(result)

    AttributeError: 'PersonalCard' object has no attribute 'extractInfo'

    opened by suwika 0
Releases(v1.3.4)
  • v1.3.4(Sep 2, 2021)

    New Feature v1.3.4 🎁

    • Support Thai identity card laser code extract. (02 Sep. 2021)
    • Fix bug dataset folder not import thai_government_lottery resource. (23 Aug. 2021) #1
    • Increase performance.
    • Support Thai Government Lottery สกัดข้อมูลจากลอตเตอร์รี่ ใช้ได้ดีกับรูปภาพที่ได้จากเครื่องแสกน (16 Aug. 2021)
    • Refactor Output Structure.
    • Support Thai Driving License (Beta) สามารถสกัดข้อมูลจากภาพถ่ายใบขับขี่ได้บางรูปแบบ เนื่องจาก กรมทางขนส่งทางบก มีรูปแบบบัตรหลากหลายรูปแบบ และแต่ละรูปแบบมีตำแหน่งข้อมูลที่แตกต่างกัน จึงทำให้ประสิทธิภาพต่ำ
    Source code(tar.gz)
    Source code(zip)
  • v1.3.1(Aug 16, 2021)

    New Feature v1.3.1 🎁 Increase performance. Support Thai Driving License (Beta) สามารถสกัดข้อมูลจากภาพถ่ายใบขับขี่ได้บางรูปแบบ เนื่องจาก กรมทางขนส่งทางบก มีรูปแบบบัตรหลากหลายรูปแบบ และแต่ละรูปแบบมีตำแหน่งข้อมูลที่แตกต่างกัน จึงทำให้ประสิทธิภาพต่ำ Support Thai Government Lottery (16 Aug. 2021)

    Source code(tar.gz)
    Source code(zip)
  • v1.3.0(Aug 14, 2021)

    Increase performance. Support Thai Driving License (Beta) สามารถสกัดข้อมูลจากภาพถ่ายใบขับขี่ได้บางรูปแบบ เนื่องจาก กรมทางขนส่งทางบก มีรูปแบบบัตรหลากหลายรูปแบบ และแต่ละรูปแบบมีตำแหน่งข้อมูลที่แตกต่างกัน จึงทำให้ประสิทธิภาพต่ำ ปรับเปลี่ยนรูปแบบไฟล์ระบบ

    Source code(tar.gz)
    Source code(zip)
  • v1.2.1(Aug 13, 2021)

    New Feature 🎁

    • More arae extract.
    • lang : attribute : get only area of given language.
    • provider : attribute : set ocr provider now support easyocr and tesseract.
    Source code(tar.gz)
    Source code(zip)
  • v1.0-beta(Aug 11, 2021)

Owner
ggafiled
นานๆที อัพ ครับบบ.
ggafiled
Simple web index to use bloom filter for Pwned Passwords

pwbloom Simple web index to use bloom filter for Pwned Passwords The index.py runs a simple CGI web service checking passwords with a bloom filter for

Hanno Böck 4 Nov 23, 2021
A program to convert celcius to faranheit. made with python

Temp-Converter What is Temp-Converter Temp-Converter is little program made with pyhton to convert celcius to faranheit. Needed A python interpreter P

Chandula Janith 0 Nov 27, 2021
A thing to simplify listening for PG notifications with asyncpg

A thing to simplify listening for PG notifications with asyncpg

ANNA 18 Dec 23, 2022
A simulator for xkcd 2529's weirdly concrete problem

What is this? This is a quick hack implementation of a simulator for xkcd 2529's weirdly concrete problem. This is barely tested and I suck at computa

Reuben Steenekamp 6 Oct 27, 2021
A python module for extract domains

A python module for extract domains

Fayas Noushad 4 Aug 10, 2022
A Tool that provides automatic kerning for ligature based OpenType fonts in Microsoft Volt

Kerning A Tool that provides automatic kerning for ligature based OpenType fonts in Microsoft Volt There are three stages of the algorithm. The first

Sayed Zeeshan Asghar 6 Aug 01, 2022
An awesome tool to save articles from RSS feed to Pocket automatically.

RSS2Pocket An awesome tool to save articles from RSS feed to Pocket automatically. About the Project I used to use IFTTT to save articles from RSS fee

Hank Liao 10 Nov 12, 2022
Genart - Generate random art to sell as nfts

Genart - Generate random art to sell as nfts Usage git clone

Will 13 Mar 17, 2022
glip is a module for retrieve ip address like local-ip, global-ip, external-ip as string.

gle_ip_info glip is a module for retrieve ip address like local-ip, global-ip, external-ip as string.

Fatin Shadab 3 Nov 21, 2021
Creating low-level foundations and abstractions for asynchronous programming in Python.

DIY Async I/O Creating low-level foundations and abstractions for asynchronous programming in Python (i.e., implementing concurrency without using thr

Doc Jones 4 Dec 11, 2021
Dependency injection lib for Python 3.8+

PyDI Dependency injection lib for python How to use To define the classes that should be injected and stored as bean use decorator @component @compone

Nikita Antropov 2 Nov 09, 2021
a demo show how to dump lldb info to ida.

用一个demo来聊聊动态trace 这个仓库能做什么? 帮助理解动态trace的思想。仓库内的demo,可操作,可实践。 动态trace核心思想: 动态记录一个函数内每一条指令的执行中产生的信息,并导入IDA,用来弥补IDA等静态分析工具的不足。 反编译看一下 先clone仓库,把hellolldb

25 Nov 28, 2022
Homebase Name Changer for Fortnite: Save the World.

Homebase Name Changer This program allows you to change the Homebase name in Fortnite: Save the World. How to use it? After starting the HomebaseNameC

PRO100KatYT 7 May 21, 2022
one_click_kag_server is a program which tries to fully automate the creation of a King Arthur's Gold server.

one_click_kag_server is a program which tries to fully automate the creation of a King Arthur's Gold server.

Benjamin Gorman 4 Jan 05, 2022
RapidFuzz is a fast string matching library for Python and C++

RapidFuzz is a fast string matching library for Python and C++, which is using the string similarity calculations from FuzzyWuzzy

Max Bachmann 1.7k Jan 04, 2023
Random Name and Slug Generator

Random Name and Slug Generator

Alexander Lukanin 104 Nov 30, 2022
A Container for the Dependency Injection in Python.

Python Dependency Injection library aiodi is a Container for the Dependency Injection in Python. Installation Use the package manager pip to install a

Denis NA 3 Nov 25, 2022
Python lightweight dependency injection library

pythondi pythondi is a lightweight dependency injection library for python Support both sync and async functions Installation pip3 install pythondi Us

Hide 41 Dec 16, 2022
A Python utility belt containing simple tools, a stdlib like feel, and extra batteries. Hashing, Caching, Timing, Progress, and more made easy!

Ubelt is a small library of robust, tested, documented, and simple functions that extend the Python standard library. It has a flat API that all behav

Jon Crall 638 Dec 13, 2022
This is a package that allows you to create a key-value vault for storing variables in a global context

This is a package that allows you to create a key-value vault for storing variables in a global context. It allows you to set up a keyring with pre-defined constants which act as keys for the vault.

Data Ductus 2 Dec 14, 2022