Official implementations of PSENet, PAN and PAN++.

Overview

News

  • (2021/11/03) Paddle implementation of PAN, see Paddle-PANet. Thanks @simplify23.
  • (2021/04/08) PSENet and PAN are included in MMOCR.

Introduction

This repository contains the official implementations of PSENet, PAN, PAN++, and FAST [coming soon].

Text Detection
Text Spotting

Installation

First, clone the repository locally:

git clone https://github.com/whai362/pan_pp.pytorch.git

Then, install PyTorch 1.1.0+, torchvision 0.3.0+, and other requirements:

conda install pytorch torchvision -c pytorch
pip install -r requirement.txt

Finally, compile codes of post-processing:

# build pse and pa algorithms
sh ./compile.sh

Dataset

Please refer to dataset/README.md for dataset preparation.

Training

CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py ${CONFIG_FILE}

For example:

CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py config/pan/pan_r18_ic15.py

Testing

Evaluate the performance

python test.py ${CONFIG_FILE} ${CHECKPOINT_FILE}
cd eval/
./eval_{DATASET}.sh

For example:

python test.py config/pan/pan_r18_ic15.py checkpoints/pan_r18_ic15/checkpoint.pth.tar
cd eval/
./eval_ic15.sh

Evaluate the speed

python test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} --report_speed

For example:

python test.py config/pan/pan_r18_ic15.py checkpoints/pan_r18_ic15/checkpoint.pth.tar --report_speed

Citation

Please cite the related works in your publications if it helps your research:

PSENet

@inproceedings{wang2019shape,
  title={Shape Robust Text Detection with Progressive Scale Expansion Network},
  author={Wang, Wenhai and Xie, Enze and Li, Xiang and Hou, Wenbo and Lu, Tong and Yu, Gang and Shao, Shuai},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  pages={9336--9345},
  year={2019}
}

PAN

@inproceedings{wang2019efficient,
  title={Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network},
  author={Wang, Wenhai and Xie, Enze and Song, Xiaoge and Zang, Yuhang and Wang, Wenjia and Lu, Tong and Yu, Gang and Shen, Chunhua},
  booktitle={Proceedings of the IEEE International Conference on Computer Vision},
  pages={8440--8449},
  year={2019}
}

PAN++

@article{wang2021pan++,
  title={PAN++: Towards Efficient and Accurate End-to-End Spotting of Arbitrarily-Shaped Text},
  author={Wang, Wenhai and Xie, Enze and Li, Xiang and Liu, Xuebo and Liang, Ding and Zhibo, Yang and Lu, Tong and Shen, Chunhua},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  year={2021},
  publisher={IEEE}
}

FAST

@misc{chen2021fast,
  title={FAST: Searching for a Faster Arbitrarily-Shaped Text Detector with Minimalist Kernel Representation}, 
  author={Zhe Chen and Wenhai Wang and Enze Xie and ZhiBo Yang and Tong Lu and Ping Luo},
  year={2021},
  eprint={2111.02394},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}

License

This project is developed and maintained by IMAGINE [email protected] Key Laboratory for Novel Software Technology, Nanjing University.

IMAGINE Lab

This project is released under the Apache 2.0 license.

Comments
  • Evaluation of the performance result

    Evaluation of the performance result

    Hello Author, First of all, I would like to appreciate your work and effort. I have tried your repo. The evaluation code gives me an error of the "The sample 199 not present in GT," but the label text is there. When I tried to see the result via visualizing it on the images, it seems good. Let me know if there is any solution from your side.

    opened by dikubab 9
  • _pickle.PicklingError: Can't pickle <class 'cPolygon.Error'>: import of module 'cPolygon' failed

    _pickle.PicklingError: Can't pickle : import of module 'cPolygon' failed

    more complete log as belows: Epoch: [1 | 600] /data/tools/anaconda3/envs/zyl_torch16/lib/python3.7/site-packages/torch/nn/functional.py:2941: UserWarning: nn.functional.upsample is deprecated. Use nn.functional.interpolate instead. warnings.warn("nn.functional.upsample is deprecated. Use nn.functional.interpolate instead.") /data/tools/anaconda3/envs/zyl_torch16/lib/python3.7/site-packages/torch/nn/functional.py:3121: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details. "See the documentation of nn.Upsample for details.".format(mode)) (1/374) LR: 0.001000 | Batch: 2.668s | Total: 0min | ETA: 17min | Loss: 1.619 | Loss(text/kernel/emb/rec): 0.680/0.193/0.746/0.000 | IoU(text/kernel): 0.324/0.335 | Acc rec: 0.000 Traceback (most recent call last): File "/data/tools/anaconda3/envs/zyl_torch16/lib/python3.7/multiprocessing/queues.py", line 236, in _feed obj = _ForkingPickler.dumps(obj) File "/data/tools/anaconda3/envs/zyl_torch16/lib/python3.7/multiprocessing/reduction.py", line 51, in dumps cls(buf, protocol).dump(obj) _pickle.PicklingError: Can't pickle <class 'cPolygon.Error'>: import of module 'cPolygon' failed

    the code runs normally when using the CTW1500 datasets. but encounter errors when using my own datasets.

    it seems fine in the first run (1/374), what is wrong ? I have no idea.

    opened by Zhang-O 5
  • 关于训练的问题

    关于训练的问题

    您好!我现在在自己的数据上进行训练,训练过程是这样的 image Epoch: [212 | 600] (1/198) LR: 0.000677 | Batch: 3.934s | Total: 0min | ETA: 13min | Loss: 0.752 | Loss(text/kernel/emb/rec): 0.493/0.199/0.059/0.000 | IoU(text/kernel): 0.055/0.553 | Acc rec: 0.000 (21/198) LR: 0.000677 | Batch: 1.089s | Total: 0min | ETA: 3min | Loss: 0.731 | Loss(text/kernel/emb/rec): 0.478/0.199/0.054/0.000 | IoU(text/kernel): 0.048/0.482 | Acc rec: 0.000 (41/198) LR: 0.000677 | Batch: 1.022s | Total: 1min | ETA: 3min | Loss: 0.732 | Loss(text/kernel/emb/rec): 0.478/0.198/0.056/0.000 | IoU(text/kernel): 0.049/0.476 | Acc rec: 0.000 这个Acc rec一直是0,我终止训练后,在测试数据上进行测试时,output输出的是空的,请问是怎么回事呢,感谢啦!

    opened by mayidu 3
  • 关于后处理的疑问

    关于后处理的疑问

    1. 后处理的代码中当kernel中两个连通域的面积比大于max_rate时,将这两个连通域的flag赋值为1,在扩充时,必须同时满足当前扩充的点所属的连通域的flag值为1且与kernal的similar vector距离大于3时才不扩充该点。请问设flag这步操作的作用是什么,直接判断与Kernel的similar vector的距离可以吗?
    2. 论文中扩充的点与kernel相似向量的欧式距离thresh值为6,代码中为3,请问实际应用中这个值跟什么有关系,是数据集的某些特点吗?
    opened by jewelc92 3
  • Regarding pa.pyx

    Regarding pa.pyx

    Hi,

    I try to run your code and figure out that in your last line in pa.pyx

    return _pa(kernels[:-1], emb, label, cc, kernel_num, label_num, min_area)

    Looks like this should be

    return _pa(kernels, emb, label, cc, kernel_num, label_num, min_area)

    So that we can scan over all kernels (you skip the last kernel) and there is no crash in this function. Am I correct?

    Thanks.

    opened by liuch37 3
  • AttributeError: 'Namespace' object has no attribute 'resume'

    AttributeError: 'Namespace' object has no attribute 'resume'

    PAN++ic15,An error appears when trying to test the model:

    reading type: pil. Traceback (most recent call last): File "test.py", line 155, in main(args) File "test.py", line 138, in main print("No checkpoint found at '{}'".format(args.resume)) AttributeError: 'Namespace' object has no attribute 'resume'

    opened by lrjj 2
  • 训练Total Text时遇到的问题

    训练Total Text时遇到的问题

    运行 python train.py config/pan/pan_r18_tt.py 后,出现如下情况: p1 Traceback (most recent call last): File "/home/dell2/anaconda3/envs/pannet/lib/python3.6/multiprocessing/queues.py", line 234, in _feed obj = _ForkingPickler.dumps(obj) File "/home/dell2/anaconda3/envs/pannet/lib/python3.6/multiprocessing/reduction.py", line 51, in dumps cls(buf, protocol).dump(obj) _pickle.PicklingError: Can't pickle <class 'cPolygon.Error'>: import of module 'cPolygon' failed 似乎是迭代过程中出现的问题且只出现在训练TT数据集的时候 请问出现这种情况该怎样解决呢?谢谢您

    opened by mashumli 2
  • 执行test.py提示TypeError: 'module' object is not callable

    执行test.py提示TypeError: 'module' object is not callable

    将模型路径和config文件路径配置好了之后,执行python test.py,提示如下: Traceback (most recent call last): File "test.py", line 117, in main(args) File "test.py", line 107, in main test(test_loader, model, cfg) File "test.py", line 56, in test outputs = model(**data) File "/home/ethony/anaconda3/envs/ocr/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in call result = self.forward(*input, **kwargs) File "/media/ethony/C14D581BDA18EBFA/lyg_datas_and_code/OCR_work/pan_pp.pytorch-master/models/pan.py", line 104, in forward det_res = self.det_head.get_results(det_out, img_metas, cfg) File "/media/ethony/C14D581BDA18EBFA/lyg_datas_and_code/OCR_work/pan_pp.pytorch-master/models/head/pa_head.py", line 65, in get_results label = pa(kernels, emb) TypeError: 'module' object is not callable 看提示应该是model/post_processing下的pa没有正确导入,导入为模块了,这应该怎么解决呢

    opened by ethanlighter 2
  • problems in train.py

    problems in train.py

    Hi. When I run 'python train.py config/pan/pan_r18_ic15.py' , the errors are as followings: Do you know how to solve the problem? Thank you very much. Traceback (most recent call last): File "train.py", line 234, in main(args) File "train.py", line 216, in main train(train_loader, model, optimizer, epoch, start_iter, cfg) File "train.py", line 41, in train for iter, data in enumerate(train_loader): File "D:\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 435, in next data = self._next_data() File "D:\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 1085, in _next_data return self._process_data(data) File "D:\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 1111, in _process_data data.reraise() File "D:\Anaconda3\lib\site-packages\torch_utils.py", line 428, in reraise raise self.exc_type(msg) TypeError: function takes exactly 5 arguments (1 given)

    opened by YUDASHUAI916 2
  • not sure about run compile.sh

    not sure about run compile.sh

    (zyl_torch16) [email protected]:/data/zhangyl/pan_pp.pytorch-master$ sh ./compile.sh Compiling pa.pyx because it depends on /data/tools/anaconda3/envs/zyl_torch16/lib/python3.7/site-packages/numpy/init.pxd. [1/1] Cythonizing pa.pyx /data/tools/anaconda3/envs/zyl_torch16/lib/python3.7/site-packages/Cython/Compiler/Main.py:369: FutureWarning: Cython directive 'language_level' not set, using 2 for now (Py2). This will change in a later release! File: /data/zhangyl/pan_pp.pytorch-master/models/post_processing/pa/pa.pyx tree = Parsing.p_module(s, pxd, full_module_name) running build_ext building 'pa' extension creating build creating build/temp.linux-x86_64-3.7 gcc -pthread -B /data/tools/anaconda3/envs/zyl_torch16/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/data/tools/anaconda3/envs/zyl_torch16/lib/python3.7/site-packages/numpy/core/include -I/data/tools/anaconda3/envs/zyl_torch16/include/python3.7m -c pa.cpp -o build/temp.linux-x86_64-3.7/pa.o -O3 cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++ In file included from /data/tools/anaconda3/envs/zyl_torch16/lib/python3.7/site-packages/numpy/core/include/numpy/ndarraytypes.h:1822:0, from /data/tools/anaconda3/envs/zyl_torch16/lib/python3.7/site-packages/numpy/core/include/numpy/ndarrayobject.h:12, from /data/tools/anaconda3/envs/zyl_torch16/lib/python3.7/site-packages/numpy/core/include/numpy/arrayobject.h:4, from pa.cpp:647: /data/tools/anaconda3/envs/zyl_torch16/lib/python3.7/site-packages/numpy/core/include/numpy/npy_1_7_deprecated_api.h:17:2: warning: #warning "Using deprecated NumPy API, disable it with " "#define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-Wcpp] #warning "Using deprecated NumPy API, disable it with "
    ^~~~~~~ g++ -pthread -shared -B /data/tools/anaconda3/envs/zyl_torch16/compiler_compat -L/data/tools/anaconda3/envs/zyl_torch16/lib -Wl,-rpath=/data/tools/anaconda3/envs/zyl_torch16/lib -Wl,--no-as-needed -Wl,--sysroot=/ build/temp.linux-x86_64-3.7/pa.o -o /data/zhangyl/pan_pp.pytorch-master/models/post_processing/pa/pa.cpython-37m-x86_64-linux-gnu.so (zyl_torch16) [email protected]:/data/zhangyl/pan_pp.pytorch-master$

    this is the compile history, I am not sure whether is successully build or not.

    opened by Zhang-O 2
  • morphology operations from kornia

    morphology operations from kornia

    Hi,

    Your FAST paper is really amazing! While you already have an implementation of erosion/dilation, let me offer using our set of morphology, implemented in pyre pytorch: https://kornia.readthedocs.io/en/latest/morphology.html

    https://kornia-tutorials.readthedocs.io/en/master/morphology_101.html

    Best, Dmytro.

    opened by ducha-aiki 1
  • The sample 199 not present in GT

    The sample 199 not present in GT

    Hello Author, First of all, I would like to appreciate your work and effort. I have tried your repo. The evaluation code gives me an error of the "The sample 199 not present in GT," but the label text is there. When I tried to see the result via visualizing it on the images, it seems good. Let me know if there is any solution from your side.

    opened by zeng-cy 1
  • How  to predict a new image using the training weight?it doesn't work below.

    How to predict a new image using the training weight?it doesn't work below.

    How to predict a new image using the training weight?it doesn't work below.

    python test.py config/pan/pan_r18_ic15.py checkpoints/pan_r18_ic15/checkpoint.pth.tar cd eval/ ./eval_ic15.sh

    please inform me with [email protected] or wechat SanQian-2012,thanks you so much.

    Originally posted by @Devin521314 in https://github.com/whai362/pan_pp.pytorch/issues/91#issuecomment-1233810612

    opened by Devin521314 0
  • Why rec encoder use EOS? not SOS

    Why rec encoder use EOS? not SOS

    hi: I find there is no 'SOS' in code, I understand SOS should be embedding at the beginning. Please tell me ,thanks! ---------------code----------------------------------------------- class Encoder(nn.Module): def init(self, hidden_dim, voc, char2id, id2char): super(Encoder, self).init() self.hidden_dim = hidden_dim self.vocab_size = len(voc) self.START_TOKEN = char2id['EOS'] self.emb = nn.Embedding(self.vocab_size, self.hidden_dim) self.att = MultiHeadAttentionLayer(self.hidden_dim, 8)

    def forward(self, x):
        batch_size, feature_dim, H, W = x.size()
        x_flatten = x.view(batch_size, feature_dim, H * W).permute(0, 2, 1)
        st = x.new_full((batch_size,), self.START_TOKEN, dtype=torch.long)
        emb_st = self.emb(st)
        holistic_feature, _ = self.att(emb_st, x_flatten, x_flatten)
        return 
    
    opened by Patickk 0
Releases(v1)
AAI supports interdisciplinary research to help better understand human, animal, and artificial cognition.

AnimalAI 3 AAI supports interdisciplinary research to help better understand human, animal, and artificial cognition. It aims to support AI research t

Matthew Crosby 58 Dec 12, 2022
TorchMD-Net provides state-of-the-art graph neural networks and equivariant transformer neural networks potentials for learning molecular potentials

TorchMD-net TorchMD-Net provides state-of-the-art graph neural networks and equivariant transformer neural networks potentials for learning molecular

TorchMD 104 Jan 03, 2023
Code for CVPR 2021 oral paper "Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts"

Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts The rapid progress in 3D scene understanding has come with growing dem

Facebook Research 182 Dec 30, 2022
Cross-view Transformers for real-time Map-view Semantic Segmentation (CVPR 2022 Oral)

Cross View Transformers This repository contains the source code and data for our paper: Cross-view Transformers for real-time Map-view Semantic Segme

Brady Zhou 363 Dec 25, 2022
Display, filter and search log messages in your terminal

Textualog Display, filter and search logging messages in the terminal. This project is powered by rich and textual. Some of the ideas and code in this

Rik Huygen 24 Dec 10, 2022
Official PyTorch Implementation of "Self-supervised Auxiliary Learning with Meta-paths for Heterogeneous Graphs". NeurIPS 2020.

Self-supervised Auxiliary Learning with Meta-paths for Heterogeneous Graphs This repository is the implementation of SELAR. Dasol Hwang* , Jinyoung Pa

MLV Lab (Machine Learning and Vision Lab at Korea University) 48 Nov 09, 2022
EigenGAN Tensorflow, EigenGAN: Layer-Wise Eigen-Learning for GANs

Gender Bangs Body Side Pose (Yaw) Lighting Smile Face Shape Lipstick Color Painting Style Pose (Yaw) Pose (Pitch) Zoom & Rotate Flush & Eye Color Mout

Zhenliang He 321 Dec 01, 2022
Instant-nerf-pytorch - NeRF trained SUPER FAST in pytorch

instant-nerf-pytorch This is WORK IN PROGRESS, please feel free to contribute vi

94 Nov 22, 2022
Learning Pixel-level Semantic Affinity with Image-level Supervision for Weakly Supervised Semantic Segmentation, CVPR 2018

Learning Pixel-level Semantic Affinity with Image-level Supervision This code is deprecated. Please see https://github.com/jiwoon-ahn/irn instead. Int

Jiwoon Ahn 337 Dec 15, 2022
Capture all information throughout your model's development in a reproducible way and tie results directly to the model code!

Rubicon Purpose Rubicon is a data science tool that captures and stores model training and execution information, like parameters and outcomes, in a r

Capital One 97 Jan 03, 2023
Receptive Field Block Net for Accurate and Fast Object Detection, ECCV 2018

Receptive Field Block Net for Accurate and Fast Object Detection By Songtao Liu, Di Huang, Yunhong Wang Updatas (2021/07/23): YOLOX is here!, stronger

Liu Songtao 1.4k Dec 21, 2022
Python code for loading the Aschaffenburg Pose Dataset.

Aschaffenburg Pose Dataset (APD) This repository contains Python code for loading and filtering the Aschaffenburg Pose Dataset. The dataset itself and

1 Nov 26, 2021
Neural Network to colorize grayscale images

#colornet Neural Network to colorize grayscale images Results Grayscale Prediction Ground Truth Eiji K used colornet for anime colorization Sources Au

Pavel Hanchar 3.6k Dec 24, 2022
A pytorch implementation of Detectron. Both training from scratch and inferring directly from pretrained Detectron weights are available.

Use this instead: https://github.com/facebookresearch/maskrcnn-benchmark A Pytorch Implementation of Detectron Example output of e2e_mask_rcnn-R-101-F

Roy 2.8k Dec 29, 2022
Official implementation of the paper DeFlow: Learning Complex Image Degradations from Unpaired Data with Conditional Flows

DeFlow: Learning Complex Image Degradations from Unpaired Data with Conditional Flows Official implementation of the paper DeFlow: Learning Complex Im

Valentin Wolf 86 Nov 16, 2022
Torch-based tool for quantizing high-dimensional vectors using additive codebooks

Trainable multi-codebook quantization This repository implements a utility for use with PyTorch, and ideally GPUs, for training an efficient quantizer

Daniel Povey 41 Jan 07, 2023
External Attention Network

Beyond Self-attention: External Attention using Two Linear Layers for Visual Tasks paper : https://arxiv.org/abs/2105.02358 EAMLP will come soon Jitto

MenghaoGuo 357 Dec 11, 2022
PyTorch implementation code for the paper MixCo: Mix-up Contrastive Learning for Visual Representation

How to Reproduce our Results This repository contains PyTorch implementation code for the paper MixCo: Mix-up Contrastive Learning for Visual Represen

opcrisis 46 Dec 15, 2022
Image Matching Evaluation

Image Matching Evaluation (IME) IME provides to test any feature matching algorithm on datasets containing ground-truth homographies. Also, one can re

32 Nov 17, 2022
This repository is based on Ultralytics/yolov5, with adjustments to enable rotate prediction boxes.

Rotate-Yolov5 This repository is based on Ultralytics/yolov5, with adjustments to enable rotate prediction boxes. Section I. Description The codes are

xinzelee 90 Dec 13, 2022