Scene-Text-Understanding

Survey

[2015-PAMI] Text Detection and Recognition in Imagery: A Survey paper
[2014-Front.Comput.Sci] Scene Text Detection and Recognition: Recent Advances and Future Trends paper
[2020-arXiv] Text Recognition in the Wild: A Survey paper

Scene Text Detection

[2019-CVPR] Arbitrary Shape Scene Text Detection with Adaptive Text Region Representation [paper]
[2019-CVPR] A Multitask Network for Localization and Recognition of Text in Images (end-to-end) [paper]
[2019-CVPR] AFDM: Handwriting Recognition in Low-resource Scripts using Adversarial Learning (data augmentation) [paper] [code]
[2019-CVPR] CRAFT: Character Region Awareness for Text Detection [paper] [code]
[2019-CVPR] Data Extraction from Charts via Single Deep Neural Network(*) [paper]
[2019-CVPR] E2E-MLT - an Unconstrained End-to-End Method for Multi-Language Scene Text [paper]
[2019-arXiv] FACLSTM: ConvLSTM with Focused Attention for Scene Text Recognition [paper]
[2019-CVPR] Look More Than Once: An Accurate Detector for Text of Arbitrary Shapes [paper]
[2019-CVPR] PSENet: Shape Robust Text Detection with Progressive Scale Expansion Network [paper] [tensorflow] [Pytorch]
[2019-CVPR] PMTD: Pyramid Mask Text Detector [paper] [code]
[2019-CVPR] Spatial Fusion GAN for Image Synthesis (word synthesis) [paper](https://arxiv.org/abs/1812.05840) [code]
[2019-CVPR] Scene Text Detection with Supervised Pyramid Context Network [paper][keras]
[2019-arXiv] TextField: Learning A Deep Direction Field for Irregular Scene Text Detection [paper] [code]
[2019-CVPR] Typography with Decor: Intelligent Text Style Transfer [paper] [code]
[2019-CVPR] TIOU: Tightness-aware Evaluation Protocol for Scene Text Detection (new evaluation tool) [paper] [code]
[2019-arXiv] MORAN: A Multi-Object Rectified Attention Network for Scene Text Recognition [paper] [code]
[2019-CVPR] Scene Text Magnifier [paper]
[2018-CVPR] Pixel-Anchor: A Fast Oriented Scene Text Detector with Combined Networks [paper]
[2018-ECCV] Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes [paper] [code]
[2018-AAAI] PixelLink: Detecting Scene Text via Instance Segmentation [paper] [code]
[2018-CVPR] RRPN: Arbitrary-Oriented Scene Text Detection via Rotation Proposals [paper] [code]
[2018-CVPR] Multi-Oriented Scene Text Detection via Corner Localization and Region Segmentation [Paper]
[2018-arxiv] PixelLink: Detecting Scene Text via Instance Segmentation [Paper]
[2018-AAAI] SEE: Towards Semi-Supervised End-to-End Scene Text Recognition [Paper]
[2018-arxiv] TextBoxes++: A Single-Shot Oriented Scene Text Detector [Paper]
[2017-arxiv] Attention-based Extraction of Structured [Paper]
[2017-ICCV] Single Shot Text Detector with Regional Attention [Paper]
[2017-ICCV] WordSup: Exploiting Word Annotations for Character based Text Detection [Paper]
[2017-arXiv] R2CNN: Rotational Region CNN for Orientation Robust Scene Text Detection [Paper]
[2017-CVPR] EAST: An Efficient and Accurate Scene Text Detector [Paper] [Code]
[2017-arXiv] Cascaded Segmentation-Detection Networks for Word-Level Text Spotting [Paper]
[2017-arXiv] Deep Direct Regression for Multi-Oriented Scene Text Detection [Paper]
[2017-CVPR] Detecting oriented text in natural images by linking segments [Paper]
[2017-CVPR] Deep Matching Prior Network: Toward Tighter Multi-oriented Text Detection [Paper]
[2017-arXiv] Arbitrary-Oriented Scene Text Detection via Rotation Proposals [Paper]
[2017-AAAI] TextBoxes: A Fast Text Detector with a Single Deep Neural Network [Paper] [Code]
[2016-arXiv] Accurate Text Localization in Natural Image with Cascaded Convolutional Text Network [Paper]
[2016-arXiv] DeepText: A Unified Framework for Text Proposal Generation and Text Detection in Natural Images [Paper] [Data]
[2017-PR] TextProposals: a Text-specific Selective Search Algorithm for Word Spotting in the Wild [paper] [code]
[2016-arXiv] Scene Text Detection via Holistic, Multi-Channel Prediction [Paper]
[2016-CVPR] CannyText Detector: Fast and Robust Scene Text Localization Algorithm [Paper]
[2016-CVPR] Synthetic Data for Text Localisation in Natural Images [Paper] [Data] [Code]
[2016-ECCV] Detecting Text in Natural Image with Connectionist Text Proposal Network [Paper] [Demo] [Code]
[2016-TIP] Text-Attentional Convolutional Neural Networks for Scene Text Detection [Paper]
[2016-IJDAR] TextCatcher: a method to detect curved and challenging text in natural scenes [Paper]
[2016-CVPR] Multi-oriented text detection with fully convolutional networks [Paper]
[2015-TPAMI] Real-time Lexicon-free Scene Text Localization and Recognition
[2015-CVPR] Symmetry-Based Text Line Detection in Natural Scenes
[2015-ICCV] FASText: Efficient unconstrained scene text detector [Paper] https://github.com/MichalBusta/FASText
[2015-D.PhilThesis] Deep Learning for Text Spotting [Paper]
[2015-ICDAR] Object Proposals for Text Extraction in the Wild [Paper] https://github.com/lluisgomez/TextProposals
[2014-ECCV] Deep Features for Text Spotting [Paper] https://bitbucket.org/jaderberg/eccv2014_textspotting http://gitxiv.com/posts/uB4y7QdD5XquEJ69c/deep-features-for-text-spotting
[2014-TPAMI] Word Spotting and Recognition with Embedded Attributes [Paper] http://www.cvc.uab.es/~almazan/index/projects/words-att/index.html https://github.com/almazan/watts
[2014-TPAMI] Robust Text Detection in Natural Scene Images
[2014-ECCV] Robust Scene Text Detection with Convolution Neural Network Induced MSER Trees [Paper]
[2013-ICCV] Photo OCR: Reading Text in Uncontrolled Conditions [Paper]
[2012-CVPR] Real-time scene text localization and recognition [Paper]
[2010-CVPR] Detecting Text in Natural Scenes with Stroke Width Transform [Paper]

Scene Text Recognition

[2019-CVPR] ESIR: End-to-end Scene Text Recognition via Iterative Image Rectification [paper] [code] [code]
[2019-CVPR] E2E-MLT: an Unconstrained End-to-End Method for Multi-Language Scene Text [paper]
[2018-CVPR] FOTS: Fast [paper]
[2017-ICCV] WeText: Scene Text Detection under Weak Supervision [Paper]
[2017-ICCV] Single Shot Text Detector with Regional Attention [Paper] [Code]
[2017-ICCV] Self-organized Text Detection with Minimal Post-processing via Border Learning [Paper]
[2017-ICCV] Focusing Attention: Towards Accurate Text Recognition in Natural Images [Paper]
[2017-ICCV] Towards End-to-end Text Spotting with Convolutional Recurrent Neural Networks [Paper]
[2017-CVPR] Unambiguous Text Localization and Retrieval for Cluttered Scenes [Paper]
[2017-ICCV] WordSup: Exploiting Word Annotations for Character based Text Detection [Paper]
[2017-ICCV] Deep TextSpotter: An End-to-End Trainable Scene Text Localization and Recognition Framework [Paper] [Code]
[2017-arXiv] Cascaded Segmentation-Detection Networks for Word-Level Text Spotting [Paper]
[2017-AAAI] Detection and Recognition of Text Embedding in Online Images via Neural Context Models [Paper] [Code]
[2017-arXiv] Improving Text Proposal for Scene Images with Fully Convolutional Networks [Paper]
[2017-AAAI] TextBoxes: A Fast Text Detector with a Single Deep Neural Network [Paper] [Code] GitHub code
[2017-CVPR] Detecting Oriented Text in Natural Images by Linking Segments [Paper]
[2017-arXiv] Arbitrary-Oriented Scene Text Detection via Rotation Proposals [Paper]
[2017-CVPR] Deep Matching Prior Network: Toward Tighter Multi-oriented Text Detection [Paper]
[2016-arXiv] DeepText: A Unified Framework for Text Proposal Generation and Text Detection in Natural Images [Paper]
[2017-arXiv] Full-Page Text Recognition: Learning Where to Start and When to Stop https://arxiv.org/pdf/1704.08628.pdf
[2016-AAAI] Reading Scene Text in Deep Convolutional Sequences [Paper]
[2016-IJCV] Reading Text in the Wild with Convolutional Neural Networks [Paper] http://zeus.robots.ox.ac.uk/textsearch/#/search/ http://www.robots.ox.ac.uk/~vgg/research/text
[2016-CVPR] Recursive Recurrent Nets with Attention Modeling for OCR in the Wild [Paper]
[2016-CVPR] Robust Scene Text Recognition with Automatic Rectification [Paper]
[2016-NIPS] Generative Shape Models: Joint Text Recognition and Segmentation with Very Little Training Data [Paper]
[2015-CoRR] An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition [Paper] https://github.com/bgshih/crnn
[2015-ICDAR] Automatic Script Identification in the Wild [Paper]
[2015-ICLR] Deep structured output learning for unconstrained text recognition [Paper]
[2014-NIPS] Synthetic Data and Artificial Neural Networks for Natural Scene Text Recognition [Paper] http://www.robots.ox.ac.uk/~vgg/publications/2014/Jaderberg14c/ http://www.robots.ox.ac.uk/~vgg/research/text/model_release.tar.gz
[2014-TIP] A Unified Framework for Multi-Oriented Text Detection and Recognition
[2012-ICPR] End-to-End Text Recognition with Convolutional Neural Networks [Paper] http://cs.stanford.edu/people/twangcat/ICPR2012_code/SceneTextCNN_demo.tar http://ufldl.stanford.edu/housenumbers/

PhD Thesis

[2016-PhD Thesis] Context Modeling for Semantic Text Matching and Scene Text Detection [Paper]
[2015-PhD Thesis] Deep Learning for Text Spotting [Paper]
[2012-PhD thesis] End-to-End Text Recognition with Convolutional Neural Networks [Paper]

Text Detection

[2018-arxiv] TextBoxes++: A Single-Shot Oriented Scene Text Detector [Paper]

Dataset

PowerPoint Text Detection and Recognition Dataset 2017

COCO-Text (Computer Vision Group, Cornell) 2016

63,686 images, 173,589 text instances, 3 fine-grained text attributes.
Task: text localization and recognition

COCO-Text API

Synthetic Data for Text Localisation in Natural Images (VGG) 2016

800 thousand images
8 million synthetic word instances
download

Synthetic Word Dataset (Oxford, VGG) 2014

9 million images covering 90k English words
Task: text recognition, segmentation
download

IIIT 5K-Words 2012

5000 images from scene texts and born-digital sources (2k training and 3k testing images)
Each image is a cropped word image of scene text with case-insensitive labels
Task: text recognition
download
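
Because the IIIT 5K-Words labels are case-insensitive, one reasonable way to score a recognizer on it is to fold case before comparing strings. The sketch below illustrates that idea only; `word_accuracy` and the sample strings are hypothetical and do not come from any dataset toolkit.

```python
def word_accuracy(predictions, labels, case_sensitive=False):
    """Return the fraction of predictions that exactly match their labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        return 0.0
    correct = 0
    for pred, label in zip(predictions, labels):
        if not case_sensitive:
            # Fold case so "Hello" and "HELLO" count as the same word.
            pred, label = pred.lower(), label.lower()
        correct += pred == label
    return correct / len(labels)


if __name__ == "__main__":
    # Made-up predictions and labels, for illustration only.
    preds = ["Hello", "world", "OCR"]
    truth = ["HELLO", "word", "ocr"]
    print(word_accuracy(preds, truth))
    print(word_accuracy(preds, truth, case_sensitive=True))
```

Passing `case_sensitive=True` gives the stricter score for datasets whose labels do preserve case.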

StanfordSynth (Stanford, AI Group) 2012

Small single-character images of 62 characters (0-9, a-z, A-Z)
Task: text recognition
download
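
The 62-character set above (0-9, a-z, A-Z) maps directly onto class indices for a single-character classifier. A minimal sketch, assuming the classes are ordered digits, then lowercase, then uppercase; that ordering and the helper names `encode` and `decode` are illustrative choices, not something defined by the dataset.

```python
import string

# Assumed class order: digits, lowercase letters, uppercase letters (62 symbols).
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
CHAR_TO_INDEX = {ch: i for i, ch in enumerate(ALPHABET)}


def encode(text):
    """Map each character to its class index; reject unsupported characters."""
    try:
        return [CHAR_TO_INDEX[ch] for ch in text]
    except KeyError as err:
        raise ValueError(f"unsupported character: {err.args[0]!r}") from None


def decode(indices):
    """Map class indices back to a string."""
    return "".join(ALPHABET[i] for i in indices)


if __name__ == "__main__":
    print(len(ALPHABET))
    print(encode("0aA"))
    print(decode(encode("Text2012")))
```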

MSRA Text Detection 500 Database (MSRA-TD500) 2012

500 natural images (resolutions of the images vary from 1296x864 to 1920x1280)
Chinese, English or mixture of both
Task: text detection

Street View Text (SVT) 2010

350 high resolution images (average size 1260 × 860) (100 images for training and 250 images for testing)
Only word level bounding boxes are provided with case-insensitive labels
Task: text localization

KAIST Scene_Text Database 2010

3000 images of indoor and outdoor scenes containing text
Korean, English (Number), and Mixed (Korean + English + Number)
Task: text localization, segmentation and recognition

Chars74k 2009

Over 74K images from natural images, as well as a set of synthetically generated characters
Small single-character images of 62 characters (0-9, a-z, A-Z)
Task: text recognition

ICDAR Benchmark Datasets

Dataset	Description	Competition Paper
ICDAR 2017	42618 training images and 9837 testing images	`paper`
ICDAR 2015	1000 training images and 500 testing images	`paper`
ICDAR 2013	229 training images and 233 testing images	`paper`
ICDAR 2011	229 training images and 255 testing images	`paper`
ICDAR 2005	1001 training images and 489 testing images	`paper`
ICDAR 2003	181 training images and 251 testing images(word level and character level)	`paper`

Blogs

Online Service

Name	Description
Online OCR	API, free
Free OCR	API, free
New OCR	API, free
ABBYY FineReader Online	non-API, free

Open Resources Code

Chinese natural-scene text detection and recognition based on YOLOv3 and CRNN [code]
Ultra-lightweight Chinese OCR with vertical text recognition and ncnn inference support; psenet (8.5M) + crnn (6.3M) + anglenet (1.5M), total model size only 17M [code]
Tesseract: C++-based tools for document analysis and OCR [code]
Ocropy: Python-based tools for document analysis and OCR https://github.com/tmbdev/ocropy
CLSTM: A small implementation of LSTM networks, focused on OCR https://github.com/tmbdev/clstm
CRNN: Convolutional Recurrent Neural Network (Torch7) https://github.com/bgshih/crnn
Attention-OCR: Visual attention-based OCR https://github.com/da03/Attention-OCR
Umaru: An OCR-system based on torch using the technique of LSTM/GRU-RNN, CTC and referred to the works of rnnlib and clstm https://github.com/edward-zhu/umaru
AKSHAYUBHAT/DeepVideoAnalytics (CTPN+CRNN) code
ankush-me/SynthText code
JarveeLee/SynthText_Chinese_version code

Handwriting Recognition

[2016-arXiv] Drawing and Recognizing Chinese Characters with Recurrent Neural Network https://arxiv.org/abs/1606.06539
Learning Spatial-Semantic Context with Fully Convolutional Recurrent Network for Online Handwritten Chinese Text Recognition https://arxiv.org/abs/1610.02616
Stroke Sequence-Dependent Deep Convolutional Neural Network for Online Handwritten Chinese Character Recognition https://arxiv.org/abs/1610.04057
High Performance Offline Handwritten Chinese Character Recognition Using GoogLeNet and Directional Feature Maps http://arxiv.org/abs/1505.04925
DeepHCCR: Offline Handwritten Chinese Character Recognition based on GoogLeNet and AlexNet (With CaffeModel) https://github.com/chongyangtao/DeepHCCR
Scan, Attend and Read: End-to-End Handwritten Paragraph Recognition with MDLSTM Attention http://arxiv.org/abs/1604.03286
MLPaint: the Real-Time Handwritten Digit Recognizer http://blog.mldb.ai/blog/posts/2016/09/mlpaint/
caffe-ocr: OCR with caffe deep learning framework https://github.com/pannous/caffe-ocr

Licence Tag Recognition

Reading Car License Plates Using Deep Convolutional Neural Networks and LSTMs
Number plate recognition with Tensorflow http://matthewearl.github.io/2016/05/06/cnn-anpr/
end-to-end-for-plate-recognition https://github.com/szad670401/end-to-end-for-chinese-plate-recognition
http://rnd.azoft.com/applying-ocr-technology-receipt-recognition/

Owner

Alan Tang
