This is the implementation of the paper "Gated Recurrent Convolution Neural Network for OCR"

Last update: Dec 22, 2022

Overview

Gated Recurrent Convolution Neural Network for OCR

This project is an implementation of the GRCNN for OCR. For details, please refer to the paper: https://papers.nips.cc/paper/6637-gated-recurrent-convolution-neural-network-for-ocr.pdf

Update

The journal version of GRCNN has been accepted by T-PAMI 2021, and the code is available at:

https://github.com/Jianf-Wang/GRCNN

Build

The GRCNN is built upon the CRNN. The requirements are:

Ubuntu 14.04
CUDA 7.5
CUDNN 5

To simplify compilation, we provide the dependencies here: https://pan.baidu.com/s/1c21zl1e#list/path=%2F

Alternatively, you can use the nvidia-docker image supplied by @rremani: https://hub.docker.com/r/rremani/cuda_crnn_torch/

After installing the dependencies, go to src/ and execute build_cpp.sh to build the C++ code. If successful, a file named libcrnn.so should be produced in the src/ directory.

Inference

We provide the pretrained model from here. Put the downloaded model file into the model/GRCL/ directory. We also provide the IC03 dataset in the "./data/IC03" directory. Update the image directories listed in "test.txt" to match your local paths. "test_label.txt" contains the ground truth for each image, and "lexicon_50.txt" is the lexicon for IC03.

"src/evaluation.lua": Lexicon-free evaluation

"src/evaluation_lex.lua": Lexicon-based evaluation

The evaluation code will output the recognition accuracy.
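
As a rough illustration of how the two evaluation modes differ, the following Python sketch computes word-level recognition accuracy with and without a lexicon. It is not taken from the repository's Lua scripts. The function names, the case-insensitive comparison, and the per-image lexicon lists are all assumptions made for the example. In lexicon-based mode, each raw prediction is replaced by the lexicon word with the smallest edit distance before it is compared to the ground truth.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def snap_to_lexicon(prediction, lexicon):
    """Return the lexicon word closest to the raw prediction."""
    return min(lexicon,
               key=lambda word: edit_distance(prediction.lower(), word.lower()))


def word_accuracy(predictions, labels, lexicons=None):
    """Fraction of words recognized exactly (case-insensitive, an assumption).

    If `lexicons` is given, it holds one candidate word list per image and
    each prediction is snapped to its nearest lexicon word first.
    """
    correct = 0
    for i, (pred, label) in enumerate(zip(predictions, labels)):
        if lexicons is not None:
            pred = snap_to_lexicon(pred, lexicons[i])
        correct += int(pred.lower() == label.lower())
    return correct / len(labels)


if __name__ == "__main__":
    preds = ["hello", "wor1d"]             # hypothetical raw model outputs
    truth = ["hello", "world"]             # hypothetical ground truth
    lexs = [["hello", "help"], ["world", "ward"]]
    print(word_accuracy(preds, truth))         # lexicon-free
    print(word_accuracy(preds, truth, lexs))   # lexicon-based
```

With these toy inputs, the misread "wor1d" counts as an error in lexicon-free mode but is corrected to "world" in lexicon-based mode, which is why lexicon-based accuracy is typically at least as high as lexicon-free accuracy.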

Train a new model

Follow these steps to train a new model on your own dataset.

Create a new LMDB dataset using src/create_own_dataset.py (run pip install lmdb first).
You can modify the configuration in model/GRCL/GRCL_LSTM_pretrain.lua
Go to src/ and execute th main_train.lua ../model/GRCL/ ../model/saved_model. Model snapshots will be saved into ../model/saved_model.
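
For orientation, here is a minimal Python sketch of what writing such an LMDB dataset can look like. It is not the contents of src/create_own_dataset.py. The key layout ("image-%09d", "label-%09d", "num-samples") follows the convention of the dataset tool in the CRNN project, and it is only an assumption that this repository uses the same layout. The function names and the map_size value are hypothetical.

```python
def build_cache(samples):
    """Turn a list of (image_bytes, label) pairs into LMDB key/value records.

    Keys follow the CRNN dataset convention (assumed, not confirmed for
    this repository): 1-based, zero-padded sample indices.
    """
    cache = {}
    for index, (image_bytes, label) in enumerate(samples, start=1):
        cache[b"image-%09d" % index] = image_bytes           # encoded image file
        cache[b"label-%09d" % index] = label.encode("utf-8")  # transcription
    cache[b"num-samples"] = str(len(samples)).encode("utf-8")
    return cache


def write_lmdb(output_dir, cache, map_size=1 << 30):
    """Write the records to an LMDB environment at output_dir."""
    import lmdb  # third-party dependency: pip install lmdb

    env = lmdb.open(output_dir, map_size=map_size)
    with env.begin(write=True) as txn:  # commits when the block exits
        for key, value in cache.items():
            txn.put(key, value)
    env.close()


if __name__ == "__main__":
    # Hypothetical samples: raw bytes of encoded images plus their labels.
    samples = [(b"<bytes of image 1>", "hello"), (b"<bytes of image 2>", "world")]
    records = build_cache(samples)
    print(sorted(records))
    # write_lmdb("my_dataset_lmdb", records)  # uncomment to write to disk
```

In practice, each image_bytes value would be read from an image file opened in binary mode, and map_size should be set larger than the total size of the dataset.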

Visualization

We visualize the RCNN, DenseNet and GRCNN to verify the dynamic receptive fields in GRCNN for OCR. There are clear gaps between different characters, and for each character, the unrelated parts do not produce a strong signal.

Citation

@inproceedings{jianfeng2017deep,
 author    = {Wang, Jianfeng and Hu, Xiaolin},
 title     = {Gated Recurrent Convolution Neural Network for OCR},
 booktitle = {Advances in Neural Information Processing Systems},
 year      = {2017}
}
