PyTorch implementation of PSEnet with Pyramid Attention Network as feature extractor

Last update: Oct 10, 2022

Overview

Scene Text-Spotting based on PSEnet+CRNN

PyTorch implementation of an end-to-end text spotter that pairs a PSEnet text detector with a CRNN text recognizer. We plan to grow this repository into an open research platform for multilingual text detection and recognition in natural scene images, targeted at low-resource languages.

Requirements

Python 3.6.5
PyTorch 1.2
pyclipper
Polygon 3.0.8
OpenCV 3.4.1
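
A quick way to confirm the environment before running the demo is to check that each requirement can be imported. The sketch below is not part of the repository. The import names (torch, pyclipper, Polygon, cv2) are assumptions about how the listed packages are exposed in Python, and it only checks that each module is present, not its version.

```python
import importlib.util
import sys

# Assumed mapping from the requirement names above to Python import names.
REQUIRED_MODULES = {
    "PyTorch": "torch",
    "pyclipper": "pyclipper",
    "Polygon": "Polygon",
    "OpenCV": "cv2",
}


def find_missing(modules=REQUIRED_MODULES):
    """Return the requirement names whose module cannot be located."""
    return [
        name
        for name, module_name in modules.items()
        if importlib.util.find_spec(module_name) is None
    ]


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    missing = find_missing()
    print("Missing:", ", ".join(missing) if missing else "none")
```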

Demo

Download the trained CRNN and PSEnet models from the links provided below.
Copy the paths of the downloaded models and paste them into params.py.
Run end-end.py:

python end-end.py --img [path to image] --e2e_config_name [end to end config name]
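
For readers who want to wrap the demo in their own script, the two flags in the command above can be parsed as in the sketch below. This is an illustration only, not the actual argument handling in end-end.py. The function names, the help strings, and the choice to make both flags required are assumptions.

```python
import argparse


def build_parser():
    """Build a parser that accepts the two flags shown in the demo command."""
    parser = argparse.ArgumentParser(
        description="End-to-end text spotting demo (illustrative sketch)"
    )
    # Marking both flags as required is an assumption of this sketch.
    parser.add_argument("--img", required=True, help="path to the input image")
    parser.add_argument(
        "--e2e_config_name",
        required=True,
        help="name of the end-to-end configuration to use",
    )
    return parser


def parse_demo_args(argv=None):
    """Parse argv (defaults to sys.argv[1:]) into a namespace."""
    return build_parser().parse_args(argv)
```

Calling parse_demo_args(["--img", "sample.jpg", "--e2e_config_name", "my_config"]) returns a namespace with the attributes img and e2e_config_name; both values here are placeholders.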

Pre-trained Models

Both PSEnet and CRNN pre-trained models can be found here: gdrive

The PSEnet model is a multilingual text detector trained on MLT 2019. It works quite well.
The CRNN model recognizes Hindi, Bangla, Malayalam, Kannada, Tamil, Telugu, Odia, Sanskrit, and Marathi.

Download the models into the models/ directory and update params.py if required.
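
Before running the demo it can help to verify that the model files are where params.py expects them. The sketch below is a hypothetical helper, not code from this repository. The file names psenet.pth and crnn.pth and the dictionary MODEL_PATHS are placeholders, so substitute the paths you actually set in params.py.

```python
from pathlib import Path

MODELS_DIR = Path("models")

# Hypothetical file names; replace with the paths configured in params.py.
MODEL_PATHS = {
    "psenet": MODELS_DIR / "psenet.pth",
    "crnn": MODELS_DIR / "crnn.pth",
}


def missing_models(paths=MODEL_PATHS):
    """Return the names of models whose file does not exist on disk."""
    return [name for name, path in paths.items() if not Path(path).is_file()]


if __name__ == "__main__":
    missing = missing_models()
    if missing:
        print("Missing model files:", ", ".join(missing))
    else:
        print("All model files found.")
```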

Training instructions

To train your own detection model, refer to this file.
To train your own recognition model, refer to this file.

Samples

Contributors

Azhar Shaikh, PES University LinkedIn
Nishant Sinha, OffNote Labs

Work done as part of an internship with OffNote Labs.

References

If this repository helps you, please star it. Thank you!
