CUTIE (TensorFlow implementation of Convolutional Universal Text Information Extractor)

Overview

CUTIE

TensorFlow implementation of the paper "CUTIE: Learning to Understand Documents with Convolutional Universal Text Information Extractor." Xiaohui Zhao Paper Link


CUTIE 是用于“票据文档” 2D 关键信息提取/命名实体识别/槽位填充 算法。 使用CUTIE前,需先使用OCR算法对“票据文档” 中的文字执行检测和识别,而后将格式化的文本输入入CUTIE网络,具体流程可参照论文。

CUTIE can be considered as one type of 2-Dimensional Key Information Extraction, 2-D NER (Named Entity Recognition) or a 2-Dimensional 2D Slot Filling algorithm. Before training / inference with CUTIE, prepare your structured texts in your scanned document images with any type of OCR algorithm. Refer to the CUTIE paper for details about the procedure.

Results

Result evaluated on 4,484 receipt documents, including taxi receipts, meals entertainment receipts, and hotel receipts, with 9 different key information classes. (AP / softAP)

Method #Params Taxi Hotel
CloudScan - 82.0 / - 60.0 / -
BERT 110M 88.1 / - 71.7 / -
CUTIE 14M 94.0 / 97.3 74.6 / 87.0

Taxi

Hotel

Installation & Usage

pip install -r requirements.txt
  1. Generate your own dictionary with main_build_dict.py / main_data_tokenizer.py
  2. Train your model with main_train_json.py

CUTIE achieves best performance with rows/cols well configured. For more insights, refer to statistics in the file (others/TrainingStatistic.xlsx).

Chart

Others

For information about the input example, refer to issue discussion.

  • Apply any OCR tool that help you detecting and recognizing words in the scanned document image.
  • Label image OCR results with key information class as the .json file in the invoice_data folder. (thanks to @4kssoft)
Owner
Zhao,Xiaohui
Zhao,Xiaohui
Table recognition inside douments using neural networks

TableTrainNet A simple project for training and testing table recognition in documents. This project was developed to make a neural network which reco

Giovanni Cavallin 93 Jul 24, 2022
Handwriting Recognition System based on a deep Convolutional Recurrent Neural Network architecture

Handwriting Recognition System This repository is the Tensorflow implementation of the Handwriting Recognition System described in Handwriting Recogni

Edgard Chammas 346 Jan 07, 2023
STEFANN: Scene Text Editor using Font Adaptive Neural Network

STEFANN: Scene Text Editor using Font Adaptive Neural Network @ The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020.

Prasun Roy 208 Dec 11, 2022
An Implementation of the seglink alogrithm in paper Detecting Oriented Text in Natural Images by Linking Segments

Tips: A more recent scene text detection algorithm: PixelLink, has been implemented here: https://github.com/ZJULearning/pixel_link Contents: Introduc

dengdan 484 Dec 07, 2022
Image augmentation for machine learning experiments.

imgaug This python library helps you with augmenting images for your machine learning projects. It converts a set of input images into a new, much lar

Alexander Jung 13.2k Jan 02, 2023
Single Shot Text Detector with Regional Attention

Single Shot Text Detector with Regional Attention Introduction SSTD is initially described in our ICCV 2017 spotlight paper. A third-party implementat

Pan He 215 Dec 07, 2022
The CIS OCR PostCorrectionTool

The CIS OCR Post Correction Tool PoCoTo Source code for the Java-based PoCoTo client enabling fast interactive batch corrections of complete OCR error

CIS OCR Group 36 Dec 15, 2022
A python scripts that uses 3 different feature extraction methods such as SIFT, SURF and ORB to find a book in a video clip and project trailer of a movie based on that book, on to it.

A python scripts that uses 3 different feature extraction methods such as SIFT, SURF and ORB to find a book in a video clip and project trailer of a movie based on that book, on to it.

tooraj taraz 3 Feb 10, 2022
Markup for note taking

Subtext: markup for note-taking Subtext is a text-based, block-oriented hypertext format. It is designed with note-taking in mind. It has a simple, pe

Gordon Brander 224 Jan 01, 2023
Document Layout Analysis Projects

Layout_Analysis Introduction This is an implementation of RLSA and X-Y Cut with OpenCV Dependencies OpenCV 3.0+ How to use Compile with g++ : g++ -std

22 Dec 08, 2022
An expandable and scalable OCR pipeline

Overview Nidaba is the central controller for the entire OGL OCR pipeline. It oversees and automates the process of converting raw images into citable

81 Jan 04, 2023
A fastai/PyTorch package for unpaired image-to-image translation.

Unpaired image-to-image translation A fastai/PyTorch package for unpaired image-to-image translation currently with CycleGAN implementation. This is a

Tanishq Abraham 120 Dec 02, 2022
Steve Tu 71 Dec 30, 2022
SCOUTER: Slot Attention-based Classifier for Explainable Image Recognition

SCOUTER: Slot Attention-based Classifier for Explainable Image Recognition PDF Abstract Explainable artificial intelligence has been gaining attention

87 Dec 26, 2022
基于图像识别的开源RPA工具,理论上可以支持所有windows软件和网页的自动化

SimpleRPA 基于图像识别的开源RPA工具,理论上可以支持所有windows软件和网页的自动化 简介 SimpleRPA是一款python语言编写的开源RPA工具(桌面自动控制工具),用户可以通过配置yaml格式的文件,来实现桌面软件的自动化控制,简化繁杂重复的工作,比如运营人员给用户发消息,

Song Hui 7 Jun 26, 2022
Awesome Spectral Indices in Python.

Awesome Spectral Indices in Python: Numpy | Pandas | GeoPandas | Xarray | Earth Engine | Planetary Computer | Dask GitHub: https://github.com/davemlz/

David Montero Loaiza 98 Jan 02, 2023
Convolutional Recurrent Neural Network (CRNN) for image-based sequence recognition.

Convolutional Recurrent Neural Network This software implements the Convolutional Recurrent Neural Network (CRNN), a combination of CNN, RNN and CTC l

Baoguang Shi 2k Dec 31, 2022
Simple SDF mesh generation in Python

Generate 3D meshes based on SDFs (signed distance functions) with a dirt simple Python API.

Michael Fogleman 1.1k Jan 08, 2023
Indonesian ID Card OCR using tesseract OCR

KTP OCR Indonesian ID Card OCR using tesseract OCR KTP OCR is python-flask with tesseract web application to convert Indonesian ID Card to text / JSON

Revan Muhammad Dafa 5 Dec 06, 2021
TensorFlow Implementation of FOTS, Fast Oriented Text Spotting with a Unified Network.

FOTS: Fast Oriented Text Spotting with a Unified Network I am still working on this repo. updates and detailed instructions are coming soon! Table of

Masao Taketani 52 Nov 11, 2022