Towards Boosting the Accuracy of Non-Latin Scene Text Recognition

Last update: Aug 07, 2022

Overview

Convolutional Recurrent Neural Network + CTCLoss | STAR-Net

Code for the paper "Towards Boosting the Accuracy of Non-Latin Scene Text Recognition"

Dependencies

Python 3.6.5
torch==1.2.0
torchvision==0.4.0
tensorboard==2.3.0

How to run the code?

Prepare data

Follow the instructions in meijieru/crnn.pytorch to create the LMDB datasets. Use the same procedure for both the training and the validation data.
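
As a rough sketch of what that procedure produces, the snippet below builds the key/value records that the dataset-creation script in meijieru/crnn.pytorch is understood to write (image-%09d, label-%09d, and num-samples). The key layout is an assumption based on that tool and should be checked against it. build_cache and write_lmdb are hypothetical helper names, and write_lmdb needs the third-party lmdb package.

```python
from typing import Dict, Iterable, Tuple


def build_cache(samples: Iterable[Tuple[bytes, str]]) -> Dict[bytes, bytes]:
    """Map (encoded image bytes, label) pairs to LMDB records.

    Assumed key layout: image-000000001 / label-000000001 / num-samples,
    with 1-based, zero-padded indices.
    """
    cache: Dict[bytes, bytes] = {}
    count = 0
    for image_bytes, label in samples:
        if not image_bytes or not label:
            continue  # skip samples with an empty image or an empty label
        count += 1
        cache[b"image-%09d" % count] = image_bytes
        cache[b"label-%09d" % count] = label.encode("utf-8")
    cache[b"num-samples"] = str(count).encode("utf-8")
    return cache


def write_lmdb(output_path: str, cache: Dict[bytes, bytes],
               map_size: int = 1 << 30) -> None:
    """Write the records to an LMDB environment (requires `pip install lmdb`)."""
    import lmdb  # imported lazily so build_cache works without the package

    env = lmdb.open(output_path, map_size=map_size)
    with env.begin(write=True) as txn:
        for key, value in cache.items():
            txn.put(key, value)
    env.close()
```

Under these assumptions, you would call build_cache and write_lmdb once with the training pairs and once with the validation pairs, writing each to its own output directory.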

Change parameters and alphabets

Update the parameters and the alphabet to match your data.

Change parameters in the mytrain.py file
Change alphabets

Put every character that appears in your labels in a file and pass that file to mytrain.py as --charlist; otherwise the program will throw an error during training.
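
One way to build such a file is to collect the unique characters from your label strings. The sketch below assumes a one-character-per-line file format, which may not be what mytrain.py expects, so check how --charlist is parsed before relying on it. collect_charset, missing_chars, and write_charlist are hypothetical helpers, not part of this repository.

```python
from typing import Iterable, List, Set


def collect_charset(labels: Iterable[str]) -> List[str]:
    """Return the sorted list of unique characters across all labels."""
    chars: Set[str] = set()
    for label in labels:
        chars.update(label)  # adds each Unicode code point of the label
    return sorted(chars)


def missing_chars(labels: Iterable[str], charlist: Iterable[str]) -> List[str]:
    """Return characters used in the labels but absent from charlist."""
    return sorted(set(collect_charset(labels)) - set(charlist))


def write_charlist(path: str, charlist: Iterable[str]) -> None:
    """Write one character per line (assumed format), encoded as UTF-8."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(charlist))
```

Python iterates strings by Unicode code point, so combining marks such as Devanagari vowel signs become separate entries. Running missing_chars on the validation labels against the training character list is a quick way to catch unseen characters before training starts.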

Train

Run mytrain.py:

python3 mytrain.py --trainRoot /ssd_scratch/cvit/sanjana/hindi-train-lmdb \
--valRoot /ssd_scratch/cvit/sanjana/hindi-test-lmdb \
--arch crnn --lan hindi --charlist /ssd_scratch/cvit/sanjana/crnn_new/lexicon.txt \
--batchSize 32 --nepoch 15 --cuda --expr_dir /ssd_scratch/cvit/sanjana \
--displayInterval 10 --valInterval 100 --adadelta \
--manualSeed 1234 --random_sample --deal_with_lossnan

Reference

meijieru/crnn.pytorch
Sierkinhane/crnn_chinese_characters_rec

If you use the dataset or code from this work, please add the following citation:

@inproceedings{gunnaNonLatin2021,
  title={Towards {B}oosting the {A}ccuracy of {N}on-{L}atin {S}cene {T}ext {R}ecognition},
  author={Sanjana Gunna and Rohit Saluja and C V Jawahar},
  booktitle={2021 International Conference on Document Analysis and Recognition Workshops (ICDARW)},
  year={2021},
  organization={IEEE}
}

Owner

Sanjana Gunna
