MASTER-PyTorch

PyTorch reimplementation of "MASTER: Multi-Aspect Non-local Network for Scene Text Recognition" (Pattern Recognition 2021). This project differs from our original implementation, which builds on FastOCR, the company's private codebase. A TensorFlow reimplementation is also available at the MASTER-TF repository, and its performance is almost identical. (PS. Logo inspired by Master Oogway in Kung Fu Panda.)

Honors based on MASTER

1st place (2020/10) solution to ICDAR 2019 Robust Reading Challenge on Reading Chinese Text on Signboard (task2)
2nd and 5th places (2020/10) in The 5th China Innovation Challenge on Handwritten Mathematical Expression Recognition
4th place (2019/08) of ICDAR 2017 Robust Reading Challenge on COCO-Text (task2)
More will be released

Introduction
Requirements
Usage
Customization
- Checkpoints
- Tensorboard Visualization
TODO
Citations
License
Acknowledgements

Introduction

MASTER is a self-attention based scene text recognizer that (1) not only encodes input-output attention but also learns self-attention, which encodes feature-feature and target-target relationships inside the encoder and decoder, (2) learns an intermediate representation that is more powerful and more robust to spatial distortion, and (3) offers better training and evaluation efficiency. The overall architecture is shown below.

Requirements

python==3.6
torchvision==0.6.1
pandas==1.0.5
torch==1.5.1
numpy==1.16.4
tqdm==4.47.0
Distance==0.1.3
Pillow==7.2.0

pip install -r requirements.txt

Usage

Prepare Datasets

Prepare the files in the format provided in the data folder.
- See data/README.md for instructions on preparing the data in the format MASTER requires.
- Synthetic image datasets: SynthText (Synth800k), MJSynth (Synth90k), SynthAdd (password:627x)
- Real image datasets: IIIT5K, SVT, IC03, IC13, IC15, COCO-Text, SVTP, CUTE80_Cropped
- An example of cropping SynthText can be found at data_utils/crop_synthtext.py
Modify the train_dataset and val_dataset args in the config.json file, including txt_file, img_root, img_w, and img_h.
Modify utils/keys.txt if needed, according to the vocabulary of your dataset.
Modify STRING_MAX_LEN in utils/label_util.py if needed, according to the text length of your dataset.
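
Before editing utils/keys.txt and STRING_MAX_LEN, it can help to check which characters and label lengths your dataset actually contains. The following is a minimal sketch, not part of this project: check_labels is a hypothetical helper, and it works on in-memory strings because the exact layouts of txt_file and keys.txt are not assumed here.

```python
from typing import Iterable, Set, Tuple


def check_labels(labels: Iterable[str], vocabulary: Set[str]) -> Tuple[Set[str], int]:
    """Return (characters missing from the vocabulary, length of the longest label)."""
    missing: Set[str] = set()
    longest = 0
    for text in labels:
        # Collect every character that the vocabulary cannot represent.
        missing.update(ch for ch in text if ch not in vocabulary)
        longest = max(longest, len(text))
    return missing, longest


if __name__ == "__main__":
    # Toy data for illustration only; load your own labels and characters
    # from your dataset files in whatever format they use.
    vocabulary = set("abcdefghijklmnopqrstuvwxyz0123456789")
    labels = ["hello", "world2021", "caf\u00e9"]
    missing, longest = check_labels(labels, vocabulary)
    print("missing characters:", sorted(missing))
    print("longest label:", longest)
```

Any character reported as missing would need to be added to the vocabulary, and the longest label length gives a lower bound to compare against STRING_MAX_LEN.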

Distributed training with config files

Modify the configurations in configs/config.json and dist_train.sh files, then run:

bash dist_train.sh

The application will be launched via launch.py on a 4-GPU node with one process per GPU (recommended).

This is equivalent to

python -m torch.distributed.launch --nnodes=1 --node_rank=0 --nproc_per_node=4 \
--master_addr=127.0.0.1 --master_port=5555 \
train.py -c configs/config.json -d 1,2,3,4 --local_world_size 4

It is also equivalent to specifying the indices of the available GPUs with CUDA_VISIBLE_DEVICES instead of the -d argument:

CUDA_VISIBLE_DEVICES=1,2,3,4 python -m torch.distributed.launch --nnodes=1 --node_rank=0 --nproc_per_node=4 \
--master_addr=127.0.0.1 --master_port=5555 \
train.py -c configs/config.json --local_world_size 4

Similarly, it can be launched with a single process that spans all 4 GPUs (if the node has 4 available GPUs), although this is not recommended:

CUDA_VISIBLE_DEVICES=1,2,3,4 python -m torch.distributed.launch --nnodes=1 --node_rank=0 --nproc_per_node=1 \
--master_addr=127.0.0.1 --master_port=5555 \
train.py -c configs/config.json --local_world_size 1

Using Multiple Nodes

You can enable multi-node, multi-GPU training by setting the nnodes and node_rank args of the command line on every node. For example, 2 nodes with 4 GPUs each run as follows:

Node 1 (ip: 192.168.0.10): run the following on node 1:

CUDA_VISIBLE_DEVICES=1,2,3,4 python -m torch.distributed.launch --nnodes=2 --node_rank=0 --nproc_per_node=4 \
--master_addr=192.168.0.10 --master_port=5555 \
train.py -c configs/config.json --local_world_size 4

Node 2 (ip: 192.168.0.15): run the following on node 2:

CUDA_VISIBLE_DEVICES=2,4,6,7 python -m torch.distributed.launch --nnodes=2 --node_rank=1 --nproc_per_node=4 \
--master_addr=192.168.0.10 --master_port=5555 \
train.py -c configs/config.json --local_world_size 4
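
To see how the nnodes, node_rank, and nproc_per_node values in the commands above fit together, the sketch below computes the world size and the global rank that torch.distributed.launch assigns to each process (global rank = node_rank * nproc_per_node + local rank). The function names are hypothetical helpers for illustration and are not part of this project.

```python
def world_size(nnodes: int, nproc_per_node: int) -> int:
    """Total number of processes across all nodes."""
    return nnodes * nproc_per_node


def global_rank(node_rank: int, nproc_per_node: int, local_rank: int) -> int:
    """Global rank of the process with the given local rank on the given node."""
    if not 0 <= local_rank < nproc_per_node:
        raise ValueError("local_rank must be in [0, nproc_per_node)")
    return node_rank * nproc_per_node + local_rank


if __name__ == "__main__":
    # The 2-node, 4-processes-per-node layout used in the commands above.
    nnodes, nproc_per_node = 2, 4
    print("world size:", world_size(nnodes, nproc_per_node))
    for node_rank in range(nnodes):
        ranks = [global_rank(node_rank, nproc_per_node, i) for i in range(nproc_per_node)]
        print(f"node_rank={node_rank} -> global ranks {ranks}")
```

The node started with node_rank=0 hosts the process with global rank 0, which is why master_addr points at that node on every machine.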

Debug mode on one GPU/CPU training with config files

This training mode lets you debug the code without distributed training. -dist must be set to false to turn off distributed mode, and -d specifies which single GPU will be used.

python train.py -c configs/config.json -d 1 -dist false

Resuming from checkpoints

You can resume from a previously saved checkpoint by:

python -m torch.distributed.launch --nnodes=1 --node_rank=0 --nproc_per_node=4 \
--master_addr=127.0.0.1 --master_port=5555 \
train.py -d 1,2,3,4 --local_world_size 4 --resume path/to/checkpoint

Finetune from checkpoints

You can finetune from a previously saved checkpoint by:

python -m torch.distributed.launch --nnodes=1 --node_rank=0 --nproc_per_node=4 \
--master_addr=127.0.0.1 --master_port=5555 \
train.py -d 1,2,3,4 --local_world_size 4 --resume path/to/checkpoint --finetune true

Testing from checkpoints

You can predict from a previously saved checkpoint by:

python test.py --checkpoint path/to/checkpoint --img_folder path/to/img_folder \
               --width 160 --height 48 \
               --output_folder path/to/output_folder \
               --gpu 0 --batch_size 64

Note: width and height must be the same as the settings used during training.

Evaluation

Evaluate sequence accuracy and edit distance accuracy:

python utils/calculate_metrics.py --predict-path predict_result.json --label-path label.txt

Note: label.txt is a multi-line file in which every line contains {ImageFile:<ImageFile>, Label:<TextLabel>}.
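
As a rough illustration of the two metrics, the sketch below computes them on in-memory lists of strings. It is not the code in utils/calculate_metrics.py: the function names are hypothetical, and the normalization used for edit distance accuracy (one minus the edit distance divided by the longer string's length, averaged over samples) is an assumption that may differ from the script.

```python
from typing import Sequence


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling dynamic-programming row."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def sequence_accuracy(predictions: Sequence[str], labels: Sequence[str]) -> float:
    """Fraction of predictions that match their label exactly."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        return 0.0
    return sum(p == t for p, t in zip(predictions, labels)) / len(labels)


def edit_distance_accuracy(predictions: Sequence[str], labels: Sequence[str]) -> float:
    """Mean of 1 - distance / max(len(pred), len(label)); assumed normalization."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        return 0.0
    scores = []
    for p, t in zip(predictions, labels):
        longest = max(len(p), len(t))
        scores.append(1.0 if longest == 0 else 1.0 - edit_distance(p, t) / longest)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    preds = ["hello", "wor1d"]
    truth = ["hello", "world"]
    print("sequence accuracy:", sequence_accuracy(preds, truth))
    print("edit distance accuracy:", edit_distance_accuracy(preds, truth))
```

Sequence accuracy only rewards exact matches, while the edit distance score gives partial credit to predictions that are off by a few characters.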

Customization

Checkpoints

You can specify the name of the training session in config.json files:

"name": "MASTER_Default",
"run_id": "example"

The checkpoints will be saved in save_dir/name/run_id_timestamp/checkpoint_epoch_n, with timestamp in mmdd_HHMMSS format.

A copy of config.json file will be saved in the same folder.
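
The directory layout described above can be sketched as follows. This is only an illustration of the naming scheme: checkpoint_path is a hypothetical helper, "saved" is an assumed value for save_dir, and the project's own code may build the path differently.

```python
from datetime import datetime
from pathlib import Path


def checkpoint_path(save_dir: str, name: str, run_id: str, epoch: int, now: datetime) -> Path:
    """Build save_dir/name/run_id_timestamp/checkpoint_epoch_n."""
    timestamp = now.strftime("%m%d_%H%M%S")  # mmdd_HHMMSS
    return Path(save_dir) / name / f"{run_id}_{timestamp}" / f"checkpoint_epoch_{epoch}"


if __name__ == "__main__":
    # "saved" is an assumed save_dir; name and run_id come from config.json.
    path = checkpoint_path("saved", "MASTER_Default", "example", 3, datetime.now())
    print(path.as_posix())
```

Because the timestamp is part of the run folder, two runs with the same name and run_id do not overwrite each other's checkpoints.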

Note: checkpoints contain:

{
  'arch': arch,
  'epoch': epoch,
  'model_state_dict': self.model.state_dict(),
  'optimizer': self.optimizer.state_dict(),
  'monitor_best': self.monitor_best,
  'config': self.config
}

Tensorboard Visualization

This project supports Tensorboard visualization by using either torch.utils.tensorboard or TensorboardX.

Install

If you are using PyTorch 1.1 or higher, install tensorboard with pip install "tensorboard>=1.14.0" (the quotes keep the shell from treating > as a redirection).

Otherwise, install tensorboardX by following the TensorboardX installation guide.

Run training

Make sure that tensorboard option in the config file is turned on.
```
 "tensorboard" : true
```
Open Tensorboard server

Run tensorboard --logdir saved/log/ at the project root; the server will then be available at http://localhost:6006

By default, the loss values are logged. If you need more visualizations, use add_scalar('tag', data), add_image('tag', image), etc. in the trainer._train_epoch method. The add_something() methods in this project are basically wrappers around those of tensorboardX.SummaryWriter and torch.utils.tensorboard.SummaryWriter.

Note: You don't have to specify the current step, since the WriterTensorboard class defined in logger/visualization.py tracks it for you.
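
The step-tracking idea can be illustrated with a small wrapper. This is a minimal sketch of the pattern, not the project's WriterTensorboard class: StepTrackingWriter and RecordingWriter are hypothetical names, and the only thing assumed of the wrapped object is an add_scalar(tag, scalar_value, global_step) method like the one SummaryWriter provides.

```python
class StepTrackingWriter:
    """Hypothetical wrapper that remembers the current step for a SummaryWriter-like object."""

    def __init__(self, writer):
        self.writer = writer
        self.step = 0

    def set_step(self, step: int) -> None:
        """Record the step once, e.g. at the start of each training iteration."""
        self.step = step

    def add_scalar(self, tag: str, value: float) -> None:
        """Log a scalar at the remembered step so callers never pass it."""
        self.writer.add_scalar(tag, value, self.step)


class RecordingWriter:
    """Stand-in backend for the demo; a real run would use a SummaryWriter."""

    def __init__(self):
        self.records = []

    def add_scalar(self, tag, scalar_value, global_step=None):
        self.records.append((tag, scalar_value, global_step))


if __name__ == "__main__":
    backend = RecordingWriter()
    writer = StepTrackingWriter(backend)
    for step, loss in enumerate([0.9, 0.7, 0.5]):
        writer.set_step(step)
        writer.add_scalar("loss", loss)
    print(backend.records)
```

Keeping the step in one place means extra add_scalar calls inside the training loop stay aligned on the same x-axis without repeating the step argument.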

TODO

Memory-Cache based Inference

Citations

If you find MASTER useful, please cite our paper:

@article{Lu2021MASTER,
  title={{MASTER}: Multi-Aspect Non-local Network for Scene Text Recognition},
  author={Ning Lu and Wenwen Yu and Xianbiao Qi and Yihao Chen and Ping Gong and Rong Xiao and Xiang Bai},
  journal={Pattern Recognition},
  year={2021}
}

License

This project is licensed under the MIT License. See LICENSE for more details.
