A demo for end-to-end English and Chinese text spotting using ABCNet.

Last update: Oct 04, 2022

Overview

ABCNet_Chinese

A demo for end-to-end English and Chinese text spotting using ABCNet. This is an older model, trained a long time ago, that serves as a baseline for others who want to train their own models on Chinese or other languages. Official ABCNet_v2 models will be released in AdelaiDet.

Installation

Install the provided version of detectron2 (it supports visualizing Chinese text):

python -m pip install -e d2

Install this repo:

python setup.py build develop

If the steps above succeed, you can now run the demo using the provided model.

Model

This is our model that can be used for evaluation or pretraining.

wget "https://drive.google.com/file/d/1iWX2n_BmyltVwQmfj8_oM9z7cJlq1P0m/view?usp=sharing" -O model_chn.pth

Simply put the model in the root directory of the repo.
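Google Drive share links of the `/file/d/<id>/view` form normally serve an HTML viewer page, so a plain `wget` may save a web page under the name `model_chn.pth` instead of the checkpoint. The sketch below is one way to sanity-check the download before running the demo. The helper names `drive_file_id` and `looks_like_html` are illustrative and not part of this repo, and the HTML check is a simple heuristic rather than a full validation of the checkpoint.

```python
import re
from typing import Optional


def drive_file_id(url: str) -> Optional[str]:
    """Extract the file ID from a Google Drive '/file/d/<id>/...' share link."""
    match = re.search(r"/file/d/([A-Za-z0-9_-]+)", url)
    return match.group(1) if match else None


def looks_like_html(path: str, probe_bytes: int = 512) -> bool:
    """Heuristic: True if the file starts like an HTML page, not binary data."""
    with open(path, "rb") as f:
        head = f.read(probe_bytes).lstrip().lower()
    return head.startswith(b"<!doctype html") or head.startswith(b"<html")


if __name__ == "__main__":
    url = ("https://drive.google.com/file/d/"
           "1iWX2n_BmyltVwQmfj8_oM9z7cJlq1P0m/view?usp=sharing")
    print("Drive file ID:", drive_file_id(url))
    # Assumes model_chn.pth was saved in the current directory.
    if looks_like_html("model_chn.pth"):
        print("model_chn.pth looks like an HTML page; download it again.")
    else:
        print("model_chn.pth does not look like HTML.")
```

If the check reports an HTML page, downloading the file through a browser or with a Drive-aware tool such as gdown, using the file ID printed above, is a common workaround.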

Demo

bash demo.sh

Example results

If you successfully run the demo, you will get the output below:

Other results (same project, but not using the provided model):

Document-like ancient scripts, e.g., “彝文” (Yi script):

Cite

If you find this repo useful, please cite:

@article{liu2021abcnet,
  title={ABCNet v2: Adaptive Bezier-Curve Network for Real-time End-to-end Text Spotting},
  author={Liu, Yuliang and Shen, Chunhua and Jin, Lianwen and He, Tong and Chen, Peng and Liu, Chongyu and Chen, Hao},
  journal={arXiv preprint arXiv:2105.03620},
  year={2021}
}

Data

We provide the converted JSON files for ArT, LSVT, and ReCTS that we used to train ABCNet_Chinese.

ReCTS [images&label](1.7G) [Origin_of_dataset]
LSVT [images&label](8.2G) [Origin_of_dataset]
ArT [images&label](1.5G) [Origin_of_dataset]
SynChinese130k [images&label](25G) [Origin_of_dataset]
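Before pointing a training config at one of the converted JSON files, it can help to look at what the file actually contains. The sketch below makes no assumption about the annotation schema: it only reports each top-level key with its type and size. The function name `summarize_json` and the file name `train.json` are placeholders for illustration, not names taken from this repo.

```python
import json


def describe(value) -> str:
    """Short description of a JSON value: type name, plus length for containers."""
    if isinstance(value, (list, dict)):
        return f"{type(value).__name__}[{len(value)}]"
    return type(value).__name__


def summarize_json(path: str) -> dict:
    """Map each top-level key of a JSON file to a short description of its value."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    if not isinstance(data, dict):
        # The root is a list or a scalar, so there are no keys to report.
        return {"<root>": describe(data)}
    return {key: describe(value) for key, value in data.items()}


if __name__ == "__main__":
    # 'train.json' is a placeholder; substitute the path of a downloaded label file.
    for key, description in summarize_json("train.json").items():
        print(f"{key}: {description}")
```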

License

For academic use, this project is licensed under the 2-clause BSD License - see the LICENSE file for details. For commercial use, please contact Chunhua Shen.

Owner

Yuliang Liu
