Repository for Scene Text Detection with Supervised Pyramid Context Network, implemented in TensorFlow.

Last update: Oct 15, 2021

Overview

Scene-Text-Detection-with-SPCNET

Unofficial repository for [Scene Text Detection with Supervised Pyramid Context Network](https://arxiv.org/abs/1811.08605), implemented in TensorFlow.

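Reference code

The network implementation borrows mainly from the Keras version of Mask-RCNN, and the training data interface follows argman/EAST. The paper's authors also introduce SPCNet in an article on Zhihu.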

Training

1. Preparing the training data

Training data goes under data/, and the data preparation code is in data/icdar.py:

data
    icdar2017
        Annotaions    // image_1.txt
        JPEGImages    // image_1.jpg
        train.txt     // stores the names of the training images, e.g. image_1
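As a rough illustration of how this layout can be consumed, the sketch below pairs each name in train.txt with its image and annotation file. The function name `list_training_pairs` is hypothetical, the one-name-per-line format of train.txt is an assumption, and the actual loader in data/icdar.py may work differently.

```python
import os


def list_training_pairs(data_root):
    """Return (image_path, annotation_path) pairs for the names in train.txt.

    Assumes the layout shown above. The ".jpg" and ".txt" extensions come
    from the example file names; one name per line in train.txt is a guess.
    """
    pairs = []
    with open(os.path.join(data_root, "train.txt")) as f:
        for line in f:
            name = line.strip()
            if not name:
                continue  # skip blank lines
            image_path = os.path.join(data_root, "JPEGImages", name + ".jpg")
            # "Annotaions" is spelled exactly as in the directory listing.
            annotation_path = os.path.join(data_root, "Annotaions", name + ".txt")
            pairs.append((image_path, annotation_path))
    return pairs
```

Calling `list_training_pairs("data/icdar2017")` would then give the file pairs to feed into a data generator.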

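2. Adjusting parameters

Edit the learning rate, batch size, model save path, and other parameters in ./train.py. To adjust the network parameters, edit nets/config.py.

3. Running training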

python train.py

Runtime environment: Python 2.7, tensorflow-gpu 1.13, a single 1080Ti.

Testing

Edit the model directory path and the test image path in demo.py, then run python demo.py

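Test results: there are still parts of the paper I am unsure about, so I have not yet evaluated on public datasets. Note that, following the training instructions in the paper, it is best to train on multiple GPUs, so please increase your batch size.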

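Points worth noting

1. Training global text segmentation (gts)

To compute the gts training loss, I take the gts produced at each level of the feature pyramid, compute a softmax loss for each against the global mask gt, and average the results as Loss_gts. I could not find a description of this step in the paper, so the authors may compute it differently, for example by preparing a different mask_gt for each level, or by fusing the gts predictions of several levels before computing the loss. If you are interested, ask the authors or try it yourself.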
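The averaging scheme described above can be sketched in NumPy as follows. This is an illustration rather than the code in this repository: the function names are hypothetical, each level is assumed to output two-channel (background/text) logits, and resizing the global mask to each level's resolution by nearest-neighbour sampling is an assumption.

```python
import numpy as np


def softmax_cross_entropy(logits, labels):
    """Mean per-pixel softmax cross-entropy.

    logits: float array (H, W, 2) of background/text scores.
    labels: int array (H, W) with values in {0, 1}.
    """
    shifted = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    h, w = labels.shape
    # Pick the log-probability of the true class at every pixel.
    picked = log_probs[np.arange(h)[:, None], np.arange(w)[None, :], labels]
    return -picked.mean()


def resize_nearest(mask, out_h, out_w):
    """Nearest-neighbour resize of a 2-D mask (an assumption of this sketch)."""
    rows = np.arange(out_h) * mask.shape[0] // out_h
    cols = np.arange(out_w) * mask.shape[1] // out_w
    return mask[rows[:, None], cols[None, :]]


def gts_loss(gts_logits_per_level, mask_gt):
    """Average the per-level losses, each computed against the same global mask."""
    losses = []
    for logits in gts_logits_per_level:
        h, w = logits.shape[:2]
        level_mask = resize_nearest(mask_gt, h, w)
        losses.append(softmax_cross_entropy(logits, level_mask))
    return float(np.mean(losses))
```

Swapping in one of the alternatives mentioned above would only change `gts_loss`: either pass a separate mask per level, or fuse the logits before a single call to `softmax_cross_entropy`.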

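2. Choosing the gts when implementing Rescore

I compute the pyramid level that each predicted box belongs to, then use the gts of that level for rescoring. Another option is to fuse the gts of P2, P3, P4, and P5 and then rescore the boxes.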
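A minimal sketch of the per-level selection is given below. It assumes the standard FPN level-assignment heuristic (level = floor(k0 + log2(sqrt(w*h) / 224)), clipped to P2..P5), boxes as (y1, x1, y2, x2) in input-image pixels, and a gts probability map per level with stride 2**level. The function names are hypothetical, the repository may assign levels differently, and the fusion of this value with the classification score is left out.

```python
import math

import numpy as np


def box_to_pyramid_level(box, k_min=2, k_max=5, k0=4, canonical_size=224.0):
    """Map a box (y1, x1, y2, x2) in pixels to a pyramid level (FPN heuristic)."""
    y1, x1, y2, x2 = box
    h = max(y2 - y1, 1e-6)
    w = max(x2 - x1, 1e-6)
    level = int(math.floor(k0 + math.log(math.sqrt(h * w) / canonical_size, 2)))
    return min(max(level, k_min), k_max)


def gts_score_for_box(box, gts_prob_per_level):
    """Mean text probability inside the box, read from the box's own level.

    gts_prob_per_level: dict {level: (H_l, W_l) array of text probabilities},
    where level l is assumed to have stride 2**l relative to the input image.
    """
    level = box_to_pyramid_level(box)
    stride = 2 ** level
    prob = gts_prob_per_level[level]
    y1, x1, y2, x2 = box
    # Project the box onto the level's grid, keeping at least one cell.
    r1 = min(max(int(y1 // stride), 0), prob.shape[0] - 1)
    c1 = min(max(int(x1 // stride), 0), prob.shape[1] - 1)
    r2 = max(int(math.ceil(y2 / float(stride))), r1 + 1)
    c2 = max(int(math.ceil(x2 / float(stride))), c1 + 1)
    return level, float(prob[r1:r2, c1:c2].mean())
```

The fused alternative would replace the dictionary lookup with a single map obtained by resizing and combining the P2 to P5 predictions.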

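3. Generating bounding boxes

Mask R-CNN first applies a score threshold and NMS to the output boxes, then sends the rois of the remaining regressed boxes to the mask branch to compute masks. This reduces computation and gives more accurate masks. To reduce FPs and FNs, SPCNet changes the inference flow: it first rescores the boxes and masks output by the model, applies a threshold filter, computes a bounding box for each remaining mask, uses Poly NMS to reduce overlap, and outputs what is left.

The current code (nets/models.py and utils.py) works differently: it first rescores the boxes and masks output by the model, applies a threshold filter and NMS, computes a bounding box for each remaining mask, and outputs those boxes directly.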
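The order of operations in the current code (threshold filter, then NMS, then a box from each surviving mask) can be sketched as below. This is an illustration and not the code in nets/models.py or utils.py: the function names and default thresholds are hypothetical, masks are assumed to be full-image binary arrays, greedy box NMS stands in for whatever NMS the repository uses, and an axis-aligned box is taken from each mask for simplicity.

```python
import numpy as np


def box_iou(box, boxes):
    """IoU between one box and an array of boxes, all as (y1, x1, y2, x2)."""
    y1 = np.maximum(box[0], boxes[:, 0])
    x1 = np.maximum(box[1], boxes[:, 1])
    y2 = np.minimum(box[2], boxes[:, 2])
    x2 = np.minimum(box[3], boxes[:, 3])
    inter = np.maximum(y2 - y1, 0) * np.maximum(x2 - x1, 0)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / np.maximum(area + areas - inter, 1e-9)


def nms(boxes, scores, iou_threshold):
    """Greedy box NMS; returns the kept indices, highest score first."""
    order = np.argsort(-scores)
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        if rest.size == 0:
            break
        order = rest[box_iou(boxes[best], boxes[rest]) <= iou_threshold]
    return keep


def mask_to_box(mask):
    """Axis-aligned (y1, x1, y2, x2) box around the nonzero pixels of a mask."""
    rows, cols = np.nonzero(mask)
    if rows.size == 0:
        return None
    return (int(rows.min()), int(cols.min()),
            int(rows.max()) + 1, int(cols.max()) + 1)


def postprocess(boxes, rescored_scores, masks,
                score_threshold=0.5, iou_threshold=0.3):
    """Threshold filter -> NMS -> bounding box from each surviving mask."""
    candidates = np.nonzero(rescored_scores >= score_threshold)[0]
    if candidates.size == 0:
        return []
    kept = nms(boxes[candidates], rescored_scores[candidates], iou_threshold)
    results = []
    for i in candidates[kept]:
        box = mask_to_box(masks[i])
        if box is not None:
            results.append((box, float(rescored_scores[i])))
    return results
```

Matching the paper's flow would mean moving the overlap suppression after `mask_to_box` and running a polygon-based NMS on the mask-derived shapes instead of box NMS on the regressed boxes.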
