AdvancedEAST is a scene text detection algorithm based primarily on EAST, with significant improvements that make long-text predictions more accurate.

Last update: Dec 29, 2022

Overview

AdvancedEAST

AdvancedEAST is an algorithm for scene image text detection. It is primarily based on EAST: An Efficient and Accurate Scene Text Detector, with significant improvements that make long-text predictions more accurate. If this project is helpful to you, you are welcome to star it. If you have any problems, please contact me.

email: [email protected]
website: https://huoyijie.cn

advantages

written in Keras, easy to read and run
based on EAST, an advanced text detection algorithm
easy to train the model
significant improvements make long-text predictions more accurate (see the 'demo results' section below, and pay attention to the activation image, which starts with yellow grids and ends with green grids)

In my experiments, AdvancedEAST obtained much better prediction accuracy than EAST, especially on long text. This is because EAST calculates the final vertex coordinates as the weighted mean of the vertex coordinates predicted by all pixels, and it is too difficult for a pixel to predict the 2 vertices on the far side of the quadrangle. See the EAST limitations quoted from the original paper below.
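The effect described above can be illustrated with a minimal sketch. It is not code from this project: the function name, the (score, x, y) layout, and the numbers are all made up for illustration. It only shows how a score-weighted mean over all pixels gets pulled away from the true vertex when far-away pixels predict it poorly, while averaging only nearby pixels does not.

```python
def weighted_mean_vertex(predictions):
    """Average per-pixel (x, y) vertex predictions, weighted by each pixel's score.

    predictions: list of (score, x, y) tuples, one per pixel (hypothetical layout).
    """
    total = sum(score for score, _, _ in predictions)
    if total <= 0:
        raise ValueError("at least one pixel needs a positive score")
    x = sum(score * px for score, px, _ in predictions) / total
    y = sum(score * py for score, _, py in predictions) / total
    return x, y


# Made-up predictions of the far-right vertex of a long text line.
# Assume the true vertex is at x = 400.
all_pixels = [
    (0.9, 400.0, 10.0),  # pixel close to the right edge: accurate
    (0.9, 390.0, 10.0),  # pixel in the middle of the line: slightly off
    (0.9, 330.0, 10.0),  # pixel at the far left: far off
]
nearby_only = all_pixels[:1]  # keep only the pixel next to the vertex

x_all, _ = weighted_mean_vertex(all_pixels)
x_nearby, _ = weighted_mean_vertex(nearby_only)
print("all pixels:", x_all, "nearby pixels only:", x_nearby)
```

With these numbers, the mean over all pixels lands well to the left of the true vertex, which is the kind of error that grows with text length.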

project files

config file: cfg.py, control parameters
pre-process data: preprocess.py, resize image
label data: label.py, produce label info
define network: network.py
define loss function: losses.py
execute training: advanced_east.py and data_generator.py
predict: predict.py and nms.py

For a description of the post-processing procedure, see 'Post-processing (with schematic diagrams)'.

network arch

AdvancedEast

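Network output: the output layer consists of a 1-channel score map, indicating whether a pixel is inside a text box; a 2-channel vertex code, indicating whether a pixel is a boundary pixel of the text box and whether it belongs to the head or the tail; and a 4-channel geo, holding the coordinates of the 2 vertices that a boundary pixel can predict. All pixels together form the shape of the text box, and only the boundary pixels are used to regress the vertex coordinates. Boundary pixels are defined as all pixels inside the yellow and green boxes. The two vertices at the ends of the head or tail short side are predicted by the weighted average of the predictions of all the corresponding boundary pixels. The head and tail boundary pixels each predict 2 vertices, giving 4 vertex coordinates in total.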
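The output layout described above can be sketched roughly as follows. This is an illustrative sketch, not the logic in predict.py or nms.py: the function name decode_quad, the 0.5 threshold, the weighting by score times boundary probability, the convention that a high second vertex-code value means "tail", and the use of absolute coordinates for geo are all assumptions made to keep the example short.

```python
def decode_quad(pixels, threshold=0.5):
    """Recover 4 vertices from per-pixel 7-value predictions.

    Each pixel is (score, is_boundary, is_tail, x1, y1, x2, y2):
    1 score value, 2 vertex-code values, 4 geo values.
    Absolute geo coordinates and "is_tail >= threshold means tail"
    are assumptions of this sketch.
    """
    sums = {"head": [0.0] * 5, "tail": [0.0] * 5}  # weight, x1, y1, x2, y2
    for score, is_boundary, is_tail, x1, y1, x2, y2 in pixels:
        if score < threshold or is_boundary < threshold:
            continue  # only boundary pixels inside text regress vertices
        acc = sums["tail" if is_tail >= threshold else "head"]
        weight = score * is_boundary  # hypothetical weighting scheme
        acc[0] += weight
        for i, value in enumerate((x1, y1, x2, y2), start=1):
            acc[i] += weight * value
    quad = []
    for side in ("head", "tail"):
        weight, *coords = sums[side]
        if weight == 0:
            return None  # one end of the text box has no boundary pixels
        x1, y1, x2, y2 = (c / weight for c in coords)
        quad += [(x1, y1), (x2, y2)]
    return quad


# Made-up predictions: two head pixels, one interior pixel, one tail pixel.
pixels = [
    (0.9, 0.9, 0.1, 0.0, 0.0, 0.0, 10.0),
    (0.9, 0.9, 0.1, 2.0, 0.0, 2.0, 10.0),
    (0.9, 0.1, 0.0, 50.0, 50.0, 50.0, 50.0),  # interior pixel, ignored
    (0.9, 0.9, 0.9, 100.0, 0.0, 100.0, 10.0),
]
quad = decode_quad(pixels)
print(quad)
```

The interior pixel contributes to the text region but not to the vertices, so each short side is estimated only by the pixels closest to it.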

Introduction to the principle (with schematic diagrams)

East

setup

python 3.6.3+
tensorflow-gpu 1.5.0+ (or tensorflow 1.5.0+)
keras 2.1.4+
numpy 1.14.1+
tqdm 4.19.7+
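As a convenience, the minimum versions listed above could be checked with a small helper like the one below. This is a hypothetical sketch and not part of the project: parse_version, outdated, and the MINIMUMS table are illustrative names, and the installed versions are passed in by hand rather than detected.

```python
# Minimum versions copied from the setup list above.
MINIMUMS = {
    "tensorflow": "1.5.0",
    "keras": "2.1.4",
    "numpy": "1.14.1",
    "tqdm": "4.19.7",
}


def parse_version(text):
    """Turn '1.14.1' into (1, 14, 1); parsing stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def outdated(installed):
    """Return the sorted names in `installed` that are below the listed minimum."""
    return sorted(
        name
        for name, minimum in MINIMUMS.items()
        if name in installed
        and parse_version(installed[name]) < parse_version(minimum)
    )


# Example with made-up installed versions.
print(outdated({"numpy": "1.13.3", "keras": "2.2.0"}))
```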

training

download the tianchi ICPR dataset, link: https://pan.baidu.com/s/1NSyc-cHKV3IwDo6qojIrKA password: ye9y
prepare training data: make a data root dir (icpr), then copy the images and the txts to the root dir. For data format details, refer to 'ICPR MTWI 2018 挑战赛二：网络图像的文本检测' (ICPR MTWI 2018 Challenge 2: Text Detection in Web Images), Link
modify config params in cfg.py, see default values.
python preprocess.py, resize images to 256*256, 384*384, 512*512, 640*640, 736*736; training on each size in turn can speed up the training process.
python label.py
python advanced_east.py, training entry point
python predict.py -p demo/001.png, to predict
download the pretrained model (for testing), link: https://pan.baidu.com/s/1KO7tR_MW767ggmbTjIJpuQ password: kpm2
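The resize step in the list above implies that the labelled text-box vertices have to be rescaled together with the image. The sketch below shows that arithmetic under a simplifying assumption: the image is stretched directly to a square target size, which may differ from what preprocess.py actually does. The function name scale_quad and the sample box are hypothetical.

```python
TRAIN_SIZES = (256, 384, 512, 640, 736)  # sizes listed in the training steps


def scale_quad(quad, orig_size, target):
    """Rescale 4 (x, y) vertices from an image of orig_size=(width, height)
    to a target x target image. Plain stretching is an assumption here."""
    width, height = orig_size
    sx, sy = target / width, target / height
    return [(x * sx, y * sy) for x, y in quad]


# Made-up text box on a 1000 x 500 image.
quad = [(100.0, 50.0), (100.0, 150.0), (900.0, 150.0), (900.0, 50.0)]
for size in TRAIN_SIZES:
    print(size, scale_quad(quad, (1000, 500), size))
```

The same box is expressed once per training size, so labels stay aligned with the resized images whichever size is being trained.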

demo results

compared with EAST based on VGG16

As you can see, although the text area prediction is very accurate, the vertex coordinates are not accurate enough.

License

The code is released under the MIT License.

references

Introduction to the principle (with schematic diagrams)

Post-processing (with schematic diagrams)

A Simple RaspberryPi Car Project: https://github.com/huoyijie/raspberrypi-car


Owner

huoyijie

Smart computer vision application
