Awesome Scene Text

A curated list of papers and resources for scene text detection and recognition

Papers are listed under the year they were first published, including ArXiv preprints. As a result, a paper accepted to, for example, CVPR 2019 may be listed under 2018 because it first appeared on ArXiv in 2018.

Table of contents
1. Scene Text Detection
2. Weakly supervised Scene Text Detection & Recognition
3. Scene Text Recognition
4. Other scene text related papers
5. Scene text survey

Scene Text Detection (including methods for end-to-end detection and recognition)

2010

Detecting text in natural scenes with stroke width transform [CVPR 2010] [paper]
A Method for Text Localization and Recognition in Real-World Images [ACCV 2010] [paper]

2011

2012

Real-time scene text localization and recognition [CVPR 2012] [paper]

2013

2014

Robust Scene Text Detection with Convolution Neural Network Induced MSER Trees [ECCV 2014] [paper]

2015

Symmetry-based text line detection in natural scenes [CVPR 2015] [paper]
Object proposals for text extraction in the wild [ICDAR 2015] [paper]
Text-Attentional Convolutional Neural Network for Scene Text Detection [TIP 2016] [paper]
Text Flow: A Unified Text Detection System in Natural Scene Images [ICCV 2015] [paper]

2016

Accurate Text Localization in Natural Image with Cascaded Convolutional Text Network [ArXiv] [paper]
Multi-Oriented Text Detection With Fully Convolutional Networks [CVPR 2016] [paper]
Scene Text Detection Via Holistic, Multi-Channel Prediction [ArXiv] [paper]
Detecting Text in Natural Image with Connectionist Text Proposal Network [ECCV 2016] [paper]
TextBoxes: A Fast Text Detector with a Single Deep Neural Network [AAAI 2017] [paper]
- https://github.com/MhLiao/TextBoxes [Caffe]
- https://github.com/shinjayne/shinTB [TF]

2017

Multi-scale FCN with Cascaded Instance Aware Segmentation for Arbitrary Oriented Word Spotting In The Wild [CVPR 2017] [paper]
Deep TextSpotter: An End-To-End Trainable Scene Text Localization and Recognition Framework [ICCV 2017] [paper]
Arbitrary-Oriented Scene Text Detection via Rotation Proposals [TMM 2018] [paper]
- https://github.com/mjq11302010044/RRPN [Caffe]
Deep Matching Prior Network: Toward Tighter Multi-oriented Text Detection [CVPR 2017] [paper]
Detecting Oriented Text in Natural Images by Linking Segments [CVPR 2017] [paper]
- https://github.com/bgshih/seglink [TF]
- https://github.com/dengdan/seglink [TF]
Deep Direct Regression for Multi-Oriented Scene Text Detection [ICCV 2017] [paper]
Cascaded Segmentation-Detection Networks for Word-Level Text Spotting [ArXiv] [paper]
EAST: An Efficient and Accurate Scene Text Detector [CVPR 2017] [paper]
- https://github.com/argman/EAST [TF]
- https://github.com/kurapan/EAST [Keras]
WordFence: Text Detection in Natural Images with Border Awareness [ICIP 2017] [paper]
R2CNN: Rotational Region CNN for Orientation Robust Scene Text Detection [ArXiv] [paper]
- https://github.com/DetectionTeamUCAS/R2CNN_Faster-RCNN_Tensorflow [TF]
- https://github.com/beacandler/R2CNN [Caffe]
WordSup: Exploiting Word Annotations for Character based Text Detection [ICCV 2017] [paper]
Single Shot Text Detector With Regional Attention [ICCV 2017] [paper]
- https://github.com/BestSonny/SSTD [Caffe]
- https://github.com/HotaekHan/SSTDNet [PyTorch]
Fused Text Segmentation Networks for Multi-oriented Scene Text Detection [ArXiv] [paper]
Deep Residual Text Detection Network for Scene Text [ICDAR 2017] [paper]
Feature Enhancement Network: A Refined Scene Text Detector [AAAI 2018] [paper]
ArbiText: Arbitrary-Oriented Text Detection in Unconstrained Scene [ArXiv] [paper]
Self-organized Text Detection with Minimal Post-processing via Border Learning [ICCV 2017] [paper]
- https://gitlab.com/rex-yue-wu/ISI-PPT-Text-Detector [Keras]

2018

PixelLink: Detecting Scene Text via Instance Segmentation [AAAI 2018] [paper]
- https://github.com/ZJULearning/pixel_link [TF]
- https://github.com/BowieHsu/tensorflow_ocr [TF]
FOTS: Fast Oriented Text Spotting With a Unified Network [CVPR 2018] [paper]
TextBoxes++: A Single-Shot Oriented Scene Text Detector [TIP 2018] [paper]
- https://github.com/MhLiao/TextBoxes_plusplus [Caffe]
Multi-oriented Scene Text Detection via Corner Localization and Region Segmentation [CVPR 2018] [paper]
An end-to-end TextSpotter with Explicit Alignment and Attention [CVPR 2018] [paper]
- https://github.com/tonghe90/textspotter [Caffe]
Rotation-Sensitive Regression for Oriented Scene Text Detection [CVPR 2018] [paper]
- https://github.com/MhLiao/RRD [Caffe]
Detecting multi-oriented text with corner-based region proposals [Neurocomputing 2019] [paper]
- https://github.com/xhzdeng/crpn [Caffe]
An Anchor-Free Region Proposal Network for Faster R-CNN based Text Detection Approaches [ArXiv] [paper]
IncepText: A New Inception-Text Module with Deformable PSROI Pooling for Multi-Oriented Scene Text Detection [IJCAI 2018] [paper]
- https://github.com/xieyufei1993/InceptText-Tensorflow [TF]
Shape Robust Text Detection with Progressive Scale Expansion Network [CVPR 2019] [paper] [paper v2]
- https://github.com/whai362/PSENet [PyTorch]
- https://github.com/liuheng92/tensorflow_PSENet [TF]
TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes [ECCV 2018] [paper]
- https://github.com/princewang1994/TextSnake.pytorch [PyTorch]
Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes [ECCV 2018] [paper]
- https://github.com/lvpengyuan/masktextspotter.caffe2 [Caffe2]
Accurate Scene Text Detection through Border Semantics Awareness and Bootstrapping [ECCV 2018] [paper]
A New Anchor-Labeling Method For Oriented Text Detection Using Dense Detection Framework [SPL 2018] [paper]
An Efficient System for Hazy Scene Text Detection using a Deep CNN and Patch-NMS [ICPR 2018] [paper]
Scene Text Detection with Supervised Pyramid Context Network [AAAI 2019] [paper]
Pixel-Anchor: A Fast Oriented Scene Text Detector with Combined Networks [ArXiv] [paper]
Mask R-CNN with Pyramid Attention Network for Scene Text Detection [WACV 2019] [paper]
TextMountain: Accurate Scene Text Detection via Instance Segmentation [ArXiv] [paper]
TextField: Learning A Deep Direction Field for Irregular Scene Text Detection [ArXiv] [paper]
TextNet: Irregular Text Reading from Images with an End-to-End Trainable Network [ACCV 2018] [paper]

2019

MSR: Multi-Scale Shape Regression for Scene Text Detection [IJCAI 2019] [paper]
Scene Text Detection with Inception Text Proposal Generation Module [ICMLC 2019] [paper]
Towards Robust Curve Text Detection with Conditional Spatial Expansion [CVPR 2019] [paper]
Curve Text Detection with Local Segmentation Network and Curve Connection [ArXiv] [paper]
Pyramid Mask Text Detector [ArXiv] [paper]
Tightness-aware Evaluation Protocol for Scene Text Detection [CVPR 2019] [paper]
Character Region Awareness for Text Detection [CVPR 2019] [paper]
Look More Than Once: An Accurate Detector for Text of Arbitrary Shapes [CVPR 2019] [paper]
TextCohesion: Detecting Text for Arbitrary Shapes [ArXiv] [paper]
Arbitrary Shape Scene Text Detection With Adaptive Text Region Representation [CVPR 2019] [paper]
Learning Shape-Aware Embedding for Scene Text Detection [CVPR 2019] [paper]
A Single-Shot Arbitrarily-Shaped Text Detector based on Context Attended Multi-Task Learning [ACMMM 2019] [paper]
Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network [ICCV 2019] [paper]
Towards Unconstrained End-to-End Text Spotting [ICCV 2019] [paper]
TextDragon: An End-to-End Framework for Arbitrary Shaped Text Spotting [ICCV 2019] [paper]
Convolutional Character Networks [ICCV 2019] [paper]

Weakly supervised Scene Text Detection & Recognition

2017

Attention-Based Extraction of Structured Information from Street View Imagery [ICDAR 2017] [paper]
WeText: Scene Text Detection under Weak Supervision [ICCV 2017] [paper]
SEE: Towards Semi-Supervised End-to-End Scene Text Recognition [AAAI 2018] [paper]
- https://github.com/Bartzi/see [Chainer]

Scene Text Recognition

2014

Deep Structured Output Learning for Unconstrained Text Recognition [ICLR 2015] [paper]
- https://github.com/AlexandreSev/Structured_Data [TF]
Reading text in the wild with convolutional neural networks [IJCV 2016] [paper]
- https://github.com/mathDR/reading-text-in-the-wild [Keras]

2015

Reading Scene Text in Deep Convolutional Sequences [AAAI 2016] [paper]
An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition [TPAMI 2017] [paper]
- https://github.com/bgshih/crnn [Torch]
- https://github.com/weinman/cnn_lstm_ctc_ocr [TF]
- https://github.com/watsonyanghx/CNN_LSTM_CTC_Tensorflow [TF]
- https://github.com/MaybeShewill-CV/CRNN_Tensorflow [TF]
- https://github.com/meijieru/crnn.pytorch [PyTorch]
- https://github.com/kurapan/CRNN [Keras]

2016

Recursive Recurrent Nets with Attention Modeling for OCR in the Wild [CVPR 2016] [paper]
Robust scene text recognition with automatic rectification [CVPR 2016] [paper]
- https://github.com/WarBean/tps_stn_pytorch [PyTorch]
- https://github.com/marvis/ocr_attention [PyTorch]
CNN-N-Gram for Handwriting Word Recognition [CVPR 2016] [paper]
STAR-Net: A SpaTial Attention Residue Network for Scene Text Recognition [BMVC 2016] [paper]

2017

STN-OCR: A single Neural Network for Text Detection and Text Recognition [ArXiv] [paper]
- https://github.com/Bartzi/stn-ocr [MXNet]
Learning to Read Irregular Text with Attention Mechanisms [IJCAI 2017] [paper]
Scene Text Recognition with Sliding Convolutional Character Models [ArXiv] [paper]
Focusing Attention: Towards Accurate Text Recognition in Natural Images [ICCV 2017] [paper]
AON: Towards Arbitrarily-Oriented Text Recognition [CVPR 2018] [paper]
- https://github.com/huizhang0110/AON [TF]
Gated Recurrent Convolution Neural Network for OCR [NIPS 2017] [paper]
- https://github.com/Jianfeng1991/GRCNN-for-OCR [Torch]

2018

Char-Net: A Character-Aware Neural Network for Distorted Scene Text Recognition [AAAI 2018] [paper]
SqueezedText: A Real-time Scene Text Recognition by Binary Convolutional Encoder-decoder Network [AAAI 2018] [paper]
Edit Probability for Scene Text Recognition [CVPR 2018] [paper]
ASTER: An Attentional Scene Text Recognizer with Flexible Rectification [TPAMI 2018] [paper]
- https://github.com/bgshih/aster [TF]
Synthetically Supervised Feature Learning for Scene Text Recognition [ECCV 2018] [paper]
Scene Text Recognition from Two-Dimensional Perspective [AAAI 2019] [paper]
ESIR: End-to-end Scene Text Recognition via Iterative Image Rectification [CVPR 2019] [paper]

2019

A Multi-Object Rectified Attention Network for Scene Text Recognition [Pattern Recognition] [paper]
- https://github.com/Canjie-Luo/MORAN_v2 [PyTorch]
A Simple and Robust Convolutional-Attention Network for Irregular Text Recognition [paper]
Aggregation Cross-Entropy for Sequence Recognition [CVPR 2019] [paper]
- https://github.com/summerlvsong/Aggregation-Cross-Entropy [PyTorch]
Sequence-to-Sequence Domain Adaptation Network for Robust Text Image Recognition [CVPR 2019] [paper]
2D Attentional Irregular Scene Text Recognizer [ArXiv] [paper]
Deep Neural Network for Semantic-based Text Recognition in Images [ArXiv] [paper]
Symmetry-constrained Rectification Network for Scene Text Recognition [ICCV 2019] [paper]
Rethinking Irregular Scene Text Recognition (ICDAR 2019-ArT) [paper]
- https://github.com/Jyouhou/ICDAR2019-ArT-Recognition-Alchemy [PyTorch]
Focus-Enhanced Scene Text Recognition with Deformable Convolutions [ArXiv] [paper]
- https://github.com/Alpaca07/dtr [PyTorch]
Adaptive Embedding Gate for Attention-Based Scene Text Recognition [ArXiv] [paper]

Script Identification

Other scene text related papers

2016

Synthetic Data for Text Localisation in Natural Images [CVPR 2016] [paper]
- https://github.com/ankush-me/SynthText

2019

Scene Text Synthesis for Efficient and Effective Deep Network Training [ArXiv] [paper]

Scene text survey

2018

Scene Text Detection and Recognition: The Deep Learning Era [ArXiv] [paper]

2019

Scene text detection and recognition with advances in deep learning: a survey [IJDAR 2019] [paper]

Owner

Jan Zdenek
