SelfText Beyond Polygon: Unconstrained Text Detection with Box Supervision and Dynamic Self-Training

Introduction

This is a PyTorch implementation of "SelfText Beyond Polygon: Unconstrained Text Detection with Box Supervision and Dynamic Self-Training".

The paper proposes a novel text detection system termed SelfText Beyond Polygon (SBP) with Bounding Box Supervision (BBS) and Dynamic Self Training (DST), which trains a polygon-based text detector with only a limited set of upright bounding box annotations. As shown in the Figure, SBP achieves performance comparable to strong supervision while greatly reducing data annotation costs.

For more details, please refer to our arXiv paper.

Environments

python 3
torch = 1.1.0
torchvision
Pillow
numpy

ToDo List

Dataset

Supported:

Model Zoo

Supported text detection:

Bounding Box Supervision(BBS)

Train

The training strategy includes three steps: (1) training SASN with synthetic data; (2) generating pseudo labels on real data with SASN, based on the bounding box annotations; (3) training the detectors (EAST and PSENet) with the pseudo labels.

training SASN with SynthText or Curved SynthText

(TBD)

generating pseudo label on real data with SASN

(TBD)
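
The idea behind this step can be illustrated with a minimal sketch. It is not the repository's code: `box_to_pseudo_mask`, `segment_fn` and `toy_segmenter` are hypothetical names, and the toy segmenter only stands in for a trained SASN. The sketch assumes the network takes the crop inside an upright box and returns a per-pixel text probability, which is then thresholded and pasted back into a full-size mask.

```python
import numpy as np


def box_to_pseudo_mask(image, box, segment_fn, threshold=0.5):
    """Turn one upright box annotation into a pixel-level pseudo label.

    image: H x W x C array; box: (x_min, y_min, x_max, y_max) in pixels.
    segment_fn: hypothetical stand-in for a trained segmentation network;
    it maps a crop (h x w x C) to a text-probability map (h x w).
    """
    height, width = image.shape[:2]
    x0, y0, x1, y1 = box
    # Clip the box to the image so the crop is always valid.
    x0, x1 = max(0, x0), min(width, x1)
    y0, y1 = max(0, y0), min(height, y1)
    mask = np.zeros((height, width), dtype=np.uint8)
    if x1 <= x0 or y1 <= y0:
        return mask  # degenerate box: nothing to label
    prob = segment_fn(image[y0:y1, x0:x1])
    # Pixels outside the box stay background; inside, keep confident text pixels.
    mask[y0:y1, x0:x1] = (prob >= threshold).astype(np.uint8)
    return mask


def toy_segmenter(crop):
    # Assumption for the demo only: bright pixels count as text.
    return crop.mean(axis=2) / 255.0


# Synthetic 8x8 image with a bright 2x4 "text" region inside a looser box.
image = np.zeros((8, 8, 3), dtype=np.uint8)
image[3:5, 2:6] = 255
mask = box_to_pseudo_mask(image, (1, 2, 7, 6), toy_segmenter)
print(int(mask.sum()))  # number of pixels labelled as text
```

The resulting mask is tighter than the box it came from, which is the property that lets a polygon-based detector be trained from box annotations alone.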

training EAST or PSENet with the pseudo label

(TBD)

Eval

for example (batchsize=2)

(TBD)

Visualization

Dynamic Self Training

Train

(TBD)

Eval

for example (batchsize=2)

(TBD)

Visualization

Experiments

Bounding Box Supervision

The performance of EAST on ICDAR15

Method	Dataset	Pretrain	precision	recall	f-score
EAST_box	ICDAR15	-	65.8	63.8	64.8
EAST	ICDAR15	-	76.9	77.1	77.0
EAST_pseudo(SynthText)	ICDAR15	-	77.8	78.2	78.0
EAST_box	ICDAR15	SynthText	70.8	72.0	71.4
EAST	ICDAR15	SynthText	82.0	82.4	82.2
EAST_pseudo(SynthText)	ICDAR15	SynthText	81.3	82.2	81.8
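
The f-score column appears to be the standard harmonic mean of precision and recall. Assuming so, the first two rows of the table above can be reproduced with a few lines of Python (the helper name `f_score` is illustrative, not part of the repository):

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (both given in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall copied from the EAST on ICDAR15 table (no pretraining).
rows = {"EAST_box": (65.8, 63.8), "EAST": (76.9, 77.1)}
for name, (p, r) in rows.items():
    print(name, round(f_score(p, r), 1))
# EAST_box 64.8
# EAST 77.0
```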

The performance of EAST on MSRA-TD500

Method	Dataset	Pretrain	precision	recall	f-score
EAST_box	MSRA-TD500	-	40.49	31.05	35.15
EAST	MSRA-TD500	-	71.76	69.05	70.38
EAST_pseudo(SynthText)	MSRA-TD500	-	71.27	67.54	69.36
EAST_box	MSRA-TD500	SynthText	48.34	42.37	45.16
EAST	MSRA-TD500	SynthText	77.91	76.45	77.17
EAST_pseudo(SynthText)	MSRA-TD500	SynthText	77.42	73.85	75.59

The performance of PSENet on ICDAR15

Method	Dataset	Pretrain	precision	recall	f-score
PSENet_box	ICDAR15	-	70.17	69.09	69.63
PSENet	ICDAR15	-	81.6	79.5	80.5
PSENet_pseudo(SynthText)	ICDAR15	-	82.9	77.6	80.2
PSENet_box	ICDAR15	SynthText	72.65	74.29	73.46
PSENet	ICDAR15	SynthText	86.42	83.54	84.96
PSENet_pseudo(SynthText)	ICDAR15	SynthText	86.77	83.34	85.02

The performance of PSENet on MSRA-TD500

Method	Dataset	Pretrain	precision	recall	f-score
PSENet_box	MSRA-TD500	-	47.17	36.90	41.41
PSENet	MSRA-TD500	-	80.86	77.72	79.13
PSENet_pseudo(SynthText)	MSRA-TD500	-	80.32	77.26	78.86
PSENet_box	MSRA-TD500	SynthText	47.45	39.49	43.11
PSENet	MSRA-TD500	SynthText	84.11	84.97	84.54
PSENet_pseudo(SynthText)	MSRA-TD500	SynthText	84.03	84.03	84.03

The performance of PSENet on Total Text

Method	Dataset	Pretrain	precision	recall	f-score
PSENet_box	Total Text	-	46.5	43.6	45.0
PSENet	Total Text	-	80.4	76.5	78.4
PSENet_pseudo(SynthText)	Total Text	-	80.33	73.54	76.78
PSENet_pseudo(Curved SynthText)	Total Text	-	81.68	74.61	78.0
PSENet_box	Total Text	SynthText	51.94	47.45	49.59
PSENet	Total Text	SynthText	83.4	78.1	80.7
PSENet_pseudo(SynthText)	Total Text	SynthText	81.57	75.54	78.44
PSENet_pseudo(Curved SynthText)	Total Text	SynthText	82.51	77.57	80.0

The visualization of bounding-box annotation and the pseudo labels generated by BBS on Total-Text

Links

https://github.com/SakuraRiven/EAST

https://github.com/WenmuZhou/PSENet.pytorch

License

For academic use, this project is licensed under the Apache License - see the LICENSE file for details. For commercial use, please contact the authors.

Citations

Please consider citing our paper in your publications if the project helps your research.

Email: [email protected]


Owner

weijiawu
