Code for "Primitive Representation Learning for Scene Text Recognition" (CVPR 2021)

Last update: Jan 02, 2023


Overview

Primitive Representation Learning Network (PREN)

This repository contains the code for our paper, accepted at CVPR 2021:

Primitive Representation Learning for Scene Text Recognition

Ruijie Yan, Liangrui Peng, Shanyu Xiao, Gang Yao

For now we only provide code for PREN.

Requirements

Python 3.7.9, PyTorch 1.4.0, and torchvision 0.5.0.
Other libraries can be installed with

pip install -r requirements.txt
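
As a quick sanity check after installation, the sketch below compares the installed PyTorch and torchvision versions with the ones listed above. It is not part of the repository; the helper names same_release and check_versions are hypothetical, and it assumes each module exposes a __version__ attribute.

```python
import importlib

# Versions listed in the Requirements section above.
EXPECTED = {"torch": "1.4.0", "torchvision": "0.5.0"}


def same_release(installed, wanted):
    """Compare versions, ignoring local build tags such as '+cu92'."""
    return installed.split("+")[0] == wanted


def check_versions(expected=EXPECTED):
    """Return {module: (wanted, installed)} for each missing or mismatched module.

    `installed` is None when the module cannot be imported or has no __version__.
    """
    mismatches = {}
    for name, wanted in expected.items():
        try:
            module = importlib.import_module(name)
            installed = getattr(module, "__version__", None)
        except ImportError:
            installed = None
        if installed is None or not same_release(installed, wanted):
            mismatches[name] = (wanted, installed)
    return mismatches
```

Calling check_versions() with no arguments returns an empty dict when both packages match the versions above; newer versions may also work, but that is untested here.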

Recognition with pretrained model

We provide code for using our pretrained model to recognize text images.

The pretrained model can be downloaded from Baidu Netdisk: download_link (key: 2txt).
After downloading the pretrained model (pren.pth), put it in the "models" folder.
To recognize the three samples in the "samples" folder, run

python recog.py

The expected output is

[Info] Load model from ./models/pren.pth
samples/001.jpg: ronaldo
samples/002.png: leaves
samples/003.jpg: salmon

Training

Two simple steps to train your own model:

1. Modify the training configuration in Configs/trainConf.py
2. Run python train.py

Before running the training code, set image_dir and train_list to point to your own training data.

image_dir is the root directory of the training data.

train_list is the path of a text file containing image paths (relative to image_dir) and corresponding labels.

For example, image_dir could be './samples', and train_list could be a text file with the following content

001.jpg RONALDO
002.png LEAVES
003.jpg SALMON
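
To make the file format concrete, here is a minimal sketch of reading such a list into (image path, label) pairs. It is not the repository's own loader; read_train_list is a hypothetical helper, and it assumes that each line holds a relative path, then whitespace, then the label, and that paths contain no spaces.

```python
import os


def read_train_list(train_list, image_dir):
    """Parse a train_list file into a list of (full_image_path, label) pairs."""
    samples = []
    with open(train_list, encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # skip blank lines
            # Split on the first run of whitespace: path first, label second.
            parts = line.split(maxsplit=1)
            if len(parts) != 2:
                raise ValueError(f"{train_list}:{line_no}: expected '<path> <label>'")
            rel_path, label = parts
            samples.append((os.path.join(image_dir, rel_path), label))
    return samples
```

With the example file above and image_dir set to './samples', the first pair would be the path './samples/001.jpg' (joined with the platform's separator) and the label 'RONALDO'.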

Evaluation

As with training, modify Configs/testConf.py and run python test.py to evaluate a model.
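
For readers who want to score predictions themselves, the sketch below computes word-level accuracy, a common metric for scene text recognition. It is an illustration only: the metric actually reported by test.py is not described here, word_accuracy is a hypothetical helper, and the case-insensitive comparison is an assumption motivated by the samples above (lowercase predictions, uppercase labels).

```python
def word_accuracy(predictions, labels, ignore_case=True):
    """Fraction of predictions that exactly match their label string."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        return 0.0
    correct = 0
    for pred, label in zip(predictions, labels):
        if ignore_case:
            pred, label = pred.lower(), label.lower()
        if pred == label:
            correct += 1
    return correct / len(labels)
```

For the three sample images, word_accuracy(["ronaldo", "leaves", "salmon"], ["RONALDO", "LEAVES", "SALMON"]) gives 1.0, while passing ignore_case=False gives 0.0.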

Acknowledgement

The EfficientNet code is adapted from EfficientNet-PyTorch; we modified it to output multi-scale feature maps.

Citation

If you find this project helpful for your research, please cite our paper

@inproceedings{yan2021primitive,
  author    = {Yan, Ruijie and
               Peng, Liangrui and
               Xiao, Shanyu and
               Yao, Gang},
  title     = {Primitive Representation Learning for Scene Text Recognition},
  booktitle = {CVPR},
  year      = {2021}
}

