An unofficial PyTorch implementation of PAN (PSENet2): Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network

Last update: Dec 26, 2022

Overview

Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network

Requirements

pytorch 1.1+
torchvision 0.3+
pyclipper
opencv3
gcc 4.9+

Download

PAN_resnet18_FPEM_FFM and PAN_resnet18_FPEM_FFM on ICDAR 2015:

The updated models (resnet18: 78.8, shufflenetv2: 72.4, lr: 1e-3) are not the best models.

google drive

Data Preparation

train: prepare a text file in the following format, one image per line, using '\t' as the separator

/path/to/img.jpg path/to/label.txt
...
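The training list described above can be generated with a short script. The following is a minimal sketch, not part of the repository: the helper names `pair_by_stem` and `write_train_list` are hypothetical, and it assumes each label file shares its image's file name stem (for example `img_1.jpg` and `img_1.txt`). Adjust the pairing rule if your dataset names its label files differently.

```python
from pathlib import Path


def pair_by_stem(img_dir, label_dir, img_ext=".jpg", label_ext=".txt"):
    """Pair each image with the label file that shares its stem.

    The shared-stem naming rule is an assumption, not something the
    repository prescribes. Images without a matching label are skipped.
    """
    pairs = []
    for img in sorted(Path(img_dir).glob(f"*{img_ext}")):
        label = Path(label_dir) / (img.stem + label_ext)
        if label.exists():
            pairs.append((str(img), str(label)))
    return pairs


def write_train_list(pairs, out_path):
    """Write one 'image<TAB>label' line per pair and return the line count."""
    lines = [f"{img}\t{label}" for img, label in pairs]
    Path(out_path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return len(lines)
```

Calling `write_train_list(pair_by_stem("/path/to/imgs", "/path/to/labels"), "train.txt")` (placeholder paths) produces a file in the tab-separated format shown above.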

val: use a folder with the following layout

img/ stores the images
gt/ stores the ground-truth files

Train

Set train_data_path and val_data_path in config.json,
then run the following script:

python3 train.py
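If you prefer to set the two paths programmatically rather than editing config.json by hand, something like the following may work. This is a sketch under assumptions: the helper names `set_keys` and `update_config` are hypothetical, and because the nesting of config.json is not described here, the code simply overwrites the named keys wherever they occur in the JSON tree. The value type each key expects (string, list, and so on) is also an assumption, so check it against the shipped config.json.

```python
import json


def set_keys(node, updates):
    """Recursively overwrite every key named in `updates`.

    Returns the number of keys that were overwritten. The search is
    depth-first and does not descend into a value it has just replaced.
    """
    count = 0
    if isinstance(node, dict):
        for key, value in node.items():
            if key in updates:
                node[key] = updates[key]
                count += 1
            else:
                count += set_keys(value, updates)
    elif isinstance(node, list):
        for item in node:
            count += set_keys(item, updates)
    return count


def update_config(path, updates):
    """Load a JSON config, apply `updates`, and write it back in place."""
    with open(path, encoding="utf-8") as f:
        config = json.load(f)
    changed = set_keys(config, updates)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(config, f, indent=4)
    return changed
```

For example, `update_config("config.json", {"train_data_path": "/path/to/train.txt", "val_data_path": "/path/to/val"})` (placeholder paths) returns the number of keys it changed; a return value of 0 means neither key was found.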

Test

eval.py evaluates a model on the test dataset.

Set model_path, img_path, gt_path, and save_path in eval.py,
then run the following script:

python3 eval.py

Predict

predict.py runs inference on a single image.

Set model_path and img_path in predict.py,
then run the following script:

python3 predict.py

The project is still under development.

Performance

ICDAR 2015

Trained only on the ICDAR 2015 dataset.

Method	Image size (short side)	Learning rate	Precision (%)	Recall (%)	F-measure (%)	FPS
paper (resnet18)	736	x	x	x	80.4	26.1
mine (ShuffleNetV2+FPEM_FFM+PSE expansion)	736	1e-3	81.72	66.73	73.47	24.71 (P100)
mine (resnet18+FPEM_FFM+PSE expansion)	736	1e-3	84.93	74.09	79.14	21.31 (P100)
mine (resnet50+FPEM_FFM+PSE expansion)	736	1e-3	84.23	76.12	79.96	14.22 (P100)
mine (ShuffleNetV2+FPEM_FFM+PSE expansion)	736	1e-4	75.14	57.34	65.04	24.71 (P100)
mine (resnet18+FPEM_FFM+PSE expansion)	736	1e-4	83.89	69.23	75.86	21.31 (P100)
mine (resnet50+FPEM_FFM+PSE expansion)	736	1e-4	85.29	75.1	79.87	14.22 (P100)
mine (resnet18+FPN+PSE expansion)	736	1e-3	76.50	74.70	75.59	14.47 (P100)
mine (resnet50+FPN+PSE expansion)	736	1e-3	71.82	75.73	73.72	10.67 (P100)
mine (resnet18+FPN+PSE expansion)	736	1e-4	74.19	72.34	73.25	14.47 (P100)
mine (resnet50+FPN+PSE expansion)	736	1e-4	78.96	76.27	77.59	10.67 (P100)
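
The F-measure column appears to be the usual harmonic mean of precision and recall. The sketch below (the function name `f_measure` is only illustrative) reproduces the value in the resnet18+FPEM_FFM row trained with a learning rate of 1e-3, which suggests, but does not prove, that the evaluation script uses this definition.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall, both given in percent."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# resnet18+FPEM_FFM, lr 1e-3: precision 84.93, recall 74.09
print(round(f_measure(84.93, 74.09), 2))  # 79.14
```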

Owner

zhoujun