Omnidirectional Scene Text Detection with Sequential-free Box Discretization (IJCAI 2019). Including competition model, online demo, etc.

Overview

Box_Discretization_Network

This repository is built on the PyTorch [maskrcnn_benchmark]. The method is the foundation of our ReCTS-competition method [link], which won the championship.

PPT link [Google Drive][Baidu Cloud]

Generate your own JSON: [Google Drive][Baidu Cloud]

Brief introduction (in Chinese): [Google Drive][Baidu Cloud]

Competition related

Competition model and config files (they require a large amount of GPU memory):

  • Paper [Link] (Exploring the Capacity of Sequential-free Box Discretization Network for Omnidirectional Scene Text Detection)

  • Config file [BaiduYun Link]. All models below use this config file, except for the directory paths. The results below are multi-scale ensemble results. Full details are described in our updated paper.

  • MLT 2017 Model [BaiduYun Link].

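| MLT 2017 | Recall | Precision | Hmean |
| --- | --- | --- | --- |
| new | 76.44 | 82.75 | 79.47 |

| ReCTS Detection | Recall | Precision | Hmean |
| --- | --- | --- | --- |
| new | 93.97 | 92.76 | 93.36 |

| HRSC_2016 | Recall | Precision | Hmean | TIoU-Hmean | AP |
| --- | --- | --- | --- | --- | --- |
| IJCAI version | 94.8 | 46.0 | 61.96 | 51.1 | 93.7 |
| new | 94.1 | 83.8 | 88.65 | 73.3 | 89.22 |
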
  • The online demo is being updated (the old demo version used a wrong configuration). The demo uses the MLT model provided above. It can detect multilingual text but can only recognize English, Chinese, and most symbols.
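
The Hmean columns in the result tables above are the harmonic mean of recall and precision. As a quick sanity check, the short sketch below recomputes Hmean from the recall and precision reported for the new models. The function name `hmean` is only illustrative and is not part of this repository.

```python
def hmean(recall, precision):
    """Harmonic mean of recall and precision (both given in percent)."""
    if recall + precision == 0:
        return 0.0
    return 2 * recall * precision / (recall + precision)


# (recall, precision) pairs copied from the "new" rows of the tables above.
reported = {
    "MLT 2017": (76.44, 82.75),
    "ReCTS Detection": (93.97, 92.76),
    "HRSC_2016": (94.1, 83.8),
}

for name, (r, p) in reported.items():
    print(f"{name}: Hmean = {hmean(r, p):.2f}")
```

Rounded to two decimals, the recomputed values agree with the Hmean entries listed for these three rows.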

Description

Please see our paper at [link].

The advantages:

  • BDN directly produces compact quadrilateral detection boxes. (Segmentation-based methods need additional steps to group pixels, and such steps are usually sensitive to outliers.)
  • BDN avoids label confusion. (Non-segmentation-based methods are mostly sensitive to the label sequence, which can significantly undermine the detection result.) The comparison below, on the ICDAR 2015 dataset, shows how well different methods resist the label confusion issue when rotated pseudo samples are added. Textboxes++, East, and CTD are all sensitive-to-label-sequence methods.
|  | Textboxes++ [code] | East [code] | CTD [code] | Ours |
| --- | --- | --- | --- | --- |
| Variances (Hmean) | ↓ 9.7% | ↓ 13.7% | ↓ 24.6% | ↑ 0.3% |
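
To make the label confusion issue concrete, here is a minimal sketch (not the repository's implementation). It assumes a simplified reading of the paper in which a quadrilateral is described by its sorted x and y coordinates instead of an ordered point sequence; both function names are hypothetical. The same quadrilateral annotated from a different starting corner yields a different sequence-based target, while the order-free target is unchanged.

```python
def sequence_target(quad):
    """Regression target that depends on the order of the four points."""
    return [coord for point in quad for coord in point]


def order_free_target(quad):
    """Sorted x values followed by sorted y values; point order is irrelevant."""
    xs = sorted(x for x, _ in quad)
    ys = sorted(y for _, y in quad)
    return xs + ys


# One quadrilateral, annotated twice with a different starting corner.
quad_a = [(10, 20), (60, 10), (70, 40), (20, 50)]
quad_b = quad_a[1:] + quad_a[:1]

print(sequence_target(quad_a) == sequence_target(quad_b))      # False
print(order_free_target(quad_a) == order_free_target(quad_b))  # True
```

Sorted coordinates alone do not say which x belongs with which y, so recovering the box needs an additional matching step, which this sketch leaves out.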

Getting Started

A basic example for training and testing. This mini example offers a pure baseline that takes less than 4 hours (with four 1080 Ti GPUs) to finish training with only the official training data.

Install anaconda

Link: https://pan.baidu.com/s/1TGy6O3LBHGQFzC20yJo8tg (password: vggx)

Step-by-step install

conda create --name mb
conda activate mb
conda install ipython
pip install ninja yacs cython matplotlib tqdm scipy shapely
conda install pytorch=1.0 torchvision=0.2 cudatoolkit=9.0 -c pytorch
conda install -c menpo opencv
export INSTALL_DIR=$PWD
cd $INSTALL_DIR
git clone https://github.com/cocodataset/cocoapi.git
cd cocoapi/PythonAPI
python setup.py build_ext install
cd $INSTALL_DIR
git clone https://github.com/Yuliang-Liu/Box_Discretization_Network.git
cd Box_Discretization_Network
python setup.py build develop
  • MUST USE torchvision=0.2
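
After installation, a quick check like the one below can confirm that the pinned versions are the ones actually imported. This is only a convenience sketch: the helper `check_versions` is hypothetical and not part of the repository, and it compares just the major and minor version numbers.

```python
import importlib


def check_versions(required):
    """Return {module: (installed_version_or_None, major_minor_matches)}."""
    report = {}
    for name, wanted in required.items():
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = (None, False)
            continue
        version = getattr(module, "__version__", None)
        # Compare only the first two components, so "0.2.1" matches "0.2"
        # but "0.20.1" does not.
        matches = version is not None and str(version).split(".")[:2] == wanted.split(".")
        report[name] = (version, matches)
    return report


# Version numbers taken from the conda install line above.
for name, (version, ok) in check_versions({"torch": "1.0", "torchvision": "0.2"}).items():
    print(name, version, "OK" if ok else "MISMATCH or not installed")
```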

Pretrained model:

[Link]. Unzip it under project_root.

(This is ONLY an ImageNet model with a few iterations on ic15 training data for a stable initialization.)

ic15 data

Prepare the data following the COCO format. [Link]. Unzip it under datasets/.
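
For orientation, the following sketch builds a minimal COCO-style detection record for one quadrilateral text box. It uses only the generic COCO fields (images, annotations, categories); the file name, image size, and category name are made-up placeholders. The JSON files used by this repository may carry additional fields, so treat the "Generate your own JSON" script linked in the overview as the reference.

```python
import json


def quad_to_coco_annotation(quad, image_id, ann_id, category_id=1):
    """Convert four (x, y) points into a generic COCO annotation dict."""
    xs = [x for x, _ in quad]
    ys = [y for _, y in quad]
    # Shoelace formula for the polygon area.
    twice_area = sum(
        x1 * y2 - x2 * y1
        for (x1, y1), (x2, y2) in zip(quad, quad[1:] + quad[:1])
    )
    return {
        "id": ann_id,
        "image_id": image_id,
        "category_id": category_id,
        # Axis-aligned bounding box as [x, y, width, height].
        "bbox": [min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)],
        "segmentation": [[c for point in quad for c in point]],
        "area": abs(twice_area) / 2,
        "iscrowd": 0,
    }


# Placeholder image entry and quadrilateral, for illustration only.
dataset = {
    "images": [{"id": 1, "file_name": "img_1.jpg", "width": 1280, "height": 720}],
    "categories": [{"id": 1, "name": "text"}],
    "annotations": [
        quad_to_coco_annotation([(10, 20), (60, 10), (70, 40), (20, 50)], image_id=1, ann_id=1)
    ],
}
print(json.dumps(dataset["annotations"][0]))
```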

Train

After downloading data and pretrained model, run

bash quick_train_guide.sh

Test with [TIoU]

Run

bash my_test.sh

Put kes.json into ic15_TIoU_metric/.

Then, inside ic15_TIoU_metric/, run the evaluation script. It uses Python 2, so first run conda deactivate and pip install Polygon2:

python2 to_eval.py

Example results:

  • mask branch: 79.4 (test segm.json by changing to_eval.py, line 10: mode=0);
  • kes branch: 80.4;
  • in the .yaml config, set RESCORING=True -> 80.8;
  • set RESCORING=True and RESCORING_GAMA=0.8 -> 81.0;
  • Many other tricks in the project, such as CROP_PROB_TRAIN, ROTATE_PROB_TRAIN, USE_DEFORMABLE, DEFORMABLE_PSROIPOOLING, PNMS, MSR, and PAN, were all tested and found effective in improving the results. To achieve state-of-the-art performance, extra data (syntext, MLT, etc.) and proper training strategies are necessary.

Visualization

Run

bash single_image_demo.sh

Citation

If you find our method useful for your research, please cite

@article{liu2019omnidirectional,
  title={Omnidirectional Scene Text Detection with Sequential-free Box Discretization},
  author={Liu, Yuliang and Zhang, Sheng and Jin, Lianwen and Xie, Lele and Wu, Yaqiang and Wang, Zhepeng},
  journal={IJCAI},
  year={2019}
}
@article{liu2019exploring,
  title={Exploring the Capacity of Sequential-free Box Discretization Network for Omnidirectional Scene Text Detection},
  author={Liu, Yuliang and He, Tong and Chen, Hao and Wang, Xinyu and Luo, Canjie and Zhang, Shuaitao and Shen, Chunhua and Jin, Lianwen},
  journal={arXiv preprint arXiv:1912.09629},
  year={2019}
}

Feedback

Suggestions and discussions are greatly welcome. Please contact the authors by sending email to [email protected] or [email protected]. For commercial usage, please contact Prof. Lianwen Jin via [email protected].

Owner
Yuliang Liu
MMLab; South China University of Technology; University of Adelaide