This is a TensorFlow re-implementation of PSENet: Shape Robust Text Detection with Progressive Scale Expansion Network.

Last update: Dec 30, 2022

Overview

PSENet: Shape Robust Text Detection with Progressive Scale Expansion Network

Introduction

This is a TensorFlow re-implementation of PSENet: Shape Robust Text Detection with Progressive Scale Expansion Network.

Thanks to the author (@whai362) for the awesome work!

Installation

Any TensorFlow version above 1.0 should work.
Python 2 or Python 3 will both work.

Download

A model trained on ICDAR 2015 (training set) + ICDAR 2017 MLT (training set):

Baidu Yun (extraction code: pffd)

Google Drive

This model is not as good as the one in the paper; it is only a reference. You can fine-tune it, or optimize further based on this code.

Database	Precision (%)	Recall (%)	F-measure (%)
ICDAR 2015(val)	74.61	80.93	77.64

Train

To train the model, provide a dataset path. Inside that path, each image needs its own ground-truth (gt) text file, and the gt text file must have the same name as its image file.
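Before training, it can help to confirm that every image has a matching gt file. The following is a minimal sketch, not code from this repository: the helper name find_unpaired_images, the list of image extensions, and the .txt extension for gt files are all assumptions made for illustration.

```python
import os

# Assumed image extensions; adjust to match your dataset.
IMAGE_EXTS = (".jpg", ".jpeg", ".png")


def find_unpaired_images(data_dir):
    """Return image file names in data_dir that lack a same-named .txt file."""
    names = sorted(os.listdir(data_dir))
    present = set(names)
    missing = []
    for name in names:
        stem, ext = os.path.splitext(name)
        # An image "img_1.jpg" is expected to come with "img_1.txt" (assumption).
        if ext.lower() in IMAGE_EXTS and stem + ".txt" not in present:
            missing.append(name)
    return missing
```

An empty list from find_unpaired_images("./data/ocr/icdar2015/") would mean every image has a gt file with the same base name.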

Then run train.py like:

python train.py --gpu_list=0 --input_size=512 --batch_size_per_gpu=8 --checkpoint_path=./resnet_v1_50/ \
--training_data_path=./data/ocr/icdar2015/

If you have more than one GPU, you can pass the GPU ids to gpu_list (like --gpu_list=0,1,2,3).
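As a rough illustration of how these two flags interact, the sketch below parses a gpu_list value and computes the total number of images per training step. The helper names are hypothetical. The idea that the total batch is batch_size_per_gpu times the number of listed GPUs is an assumption inferred from the flag name, not something confirmed from train.py.

```python
def parse_gpu_list(gpu_list):
    """Turn a string such as "0,1,2,3" into a list of integer GPU ids."""
    return [int(tok) for tok in gpu_list.split(",") if tok.strip()]


def total_batch_size(gpu_list, batch_size_per_gpu):
    """Assumed total batch: one per-GPU batch for each listed GPU."""
    return len(parse_gpu_list(gpu_list)) * batch_size_per_gpu
```

Under that assumption, --gpu_list=0,1,2,3 with --batch_size_per_gpu=8 would process 32 images per step, which matters when choosing a learning rate.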

Note:

Right now, only the ICDAR 2017 data format is supported as input, like (116,1179,206,1179,206,1207,116,1207,"###"), but you can modify data_provider.py to support polygon format input.
Polygon shrinking is already supported through the pyclipper module.
This re-implementation is just for fun, but I'll continue to improve the code.
The PSE algorithm is re-implemented in C++ (with Python 2, just run it; with Python 3, replace python-config with python3-config in the makefile).
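The first two notes can be illustrated with a small sketch. It parses one annotation line in the format shown above and computes a shrink offset using the formula from the PSENet paper, d = Area x (1 - r^2) / Perimeter. The function names are hypothetical, and whether data_provider.py does exactly this is an assumption. Treating "###" as a "do not care" label follows the usual ICDAR convention.

```python
import csv
import math


def parse_gt_line(line):
    """Parse a line such as: 116,1179,206,1179,206,1207,116,1207,"###"."""
    # Drop surrounding whitespace and a possible UTF-8 byte order mark.
    cleaned = line.strip().lstrip("\ufeff")
    # csv handles the quoted label, including commas inside the quotes.
    fields = next(csv.reader([cleaned]))
    if len(fields) < 9:
        raise ValueError("expected 8 coordinates and a label: %r" % line)
    coords = [int(v) for v in fields[:8]]
    polygon = list(zip(coords[0::2], coords[1::2]))
    text = fields[-1]
    # "###" conventionally marks a region to ignore during training.
    return polygon, text, text == "###"


def shrink_offset(polygon, ratio):
    """Offset d = Area * (1 - ratio**2) / Perimeter, as in the PSENet paper."""
    area2 = 0.0
    perimeter = 0.0
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        area2 += x1 * y2 - x2 * y1  # shoelace term (twice the signed area)
        perimeter += math.hypot(x2 - x1, y2 - y1)
    if perimeter == 0:
        raise ValueError("degenerate polygon")
    return abs(area2) / 2.0 * (1.0 - ratio ** 2) / perimeter
```

A polygon offsetting library such as pyclipper would then shrink the polygon inward by this distance. A ratio of 1.0 gives an offset of 0, which leaves the polygon unchanged.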

Test

Run eval.py like:

python eval.py --test_data_path=./tmp/images/ --gpu_list=0 --checkpoint_path=./resnet_v1_50/ \
--output_dir=./tmp/

A text file and a result image will then be written to the output path.

Examples

About issues

If you encounter a problem, check the existing issues first, or open a new issue.

Reference

Acknowledge

@rkshuai found a bug about concat features in model.py.

If this repository helps you, please star it. Thanks.


Owner

Michael liu
