Tracking the latest progress in Scene Text Detection and Recognition: Must-read papers well organized

Overview

SceneTextPapers

Information about this repository

This repo serves as a complement to our IJCV paper, "Scene text detection and recognition: The deep learning era" (full citation below).

Citing this work

If you find this paper helpful in understanding the recent history of scene text detection and recognition algorithms, or in designing new ones, you are highly encouraged (though not required) to cite our paper:

@article{long2020scene,
  title={Scene text detection and recognition: The deep learning era},
  author={Long, Shangbang and He, Xin and Yao, Cong},
  journal={International Journal of Computer Vision},
  pages={1--24},
  year={2020},
  publisher={Springer}
}

Papers

I. Other Survey Papers:

  1. Scene text detection and recognition: Recent advances and future trends. Zhu, Yingying and Yao, Cong and Bai, Xiang. Frontiers of Computer Science, 2016 [paper]
  2. Text detection, tracking and recognition in video: A comprehensive survey. Yin, Xu-Cheng and Zuo, Ze-Yu and Tian, Shu and Liu, Cheng-Lin. TIP, 2016 [paper]
  3. Text detection and recognition in imagery: A survey. Ye, Qixiang and Doermann, David. TPAMI, 2015 [paper]
  4. Text localization and recognition in images and video. Uchida, Seiichi. 2014 [paper]

II. Main: Scene Text Detection and Recognition

2.1 Detection

2.1.1 Pipeline Simplification
Anchor-based methods
  1. Single Shot Text Detector With Regional Attention. He, Pan and Huang, Weilin and He, Tong and Zhu, Qile and Qiao, Yu and Li, Xiaolin. ICCV, 2017 [paper] [code]
  2. TextBoxes: A Fast Text Detector with a Single Deep Neural Network. Liao, Minghui and Shi, Baoguang and Bai, Xiang and Wang, Xinggang and Liu, Wenyu. AAAI, 2017 [paper] [code]
  3. Deep Matching Prior Network: Toward Tighter Multi-oriented Text Detection. Liu, Yuliang and Jin, Lianwen. CVPR, 2017 [paper]
  4. Detecting Oriented Text in Natural Images by Linking Segments. Shi, Baoguang and Bai, Xiang and Belongie, Serge. CVPR, 2017 [paper] [code]
  5. EAST: An Efficient and Accurate Scene Text Detector. Zhou, Xinyu and Yao, Cong and Wen, He and Wang, Yuzhi and Zhou, Shuchang and He, Weiran and Liang, Jiajun. CVPR, 2017 [paper] [code]
Region proposal methods
  1. Detecting Curve Text in the Wild: New Dataset and New Solution. Yuliang, Liu and Lianwen, Jin and Shuaitao, Zhang and Sheng, Zhang. 2017 [paper] [code]
  2. R2CNN: rotational region CNN for orientation robust scene text detection. Jiang, Yingying and Zhu, Xiangyu and Wang, Xiaobing and Yang, Shuli and Li, Wei and Wang, Hua and Fu, Pei and Luo, Zhenbo. 2017 [paper]
  3. Arbitrary-Oriented Scene Text Detection via Rotation Proposals. Ma, Jianqi and Shao, Weiyuan and Ye, Hao and Wang, Li and Wang, Hong and Zheng, Yingbin and Xue, Xiangyang. T MULTIMEDIA, 2017 [paper] [code]
  4. Weakly supervised text attention network for generating text proposals in scene images. Rong, Li and MengYi, En and JianQiang, Li and HaiBin, Zhang. ICDAR, 2017 [paper]
  5. Rotation-Sensitive Regression for Oriented Scene Text Detection. Liao, Minghui and Zhu, Zhen and Shi, Baoguang and Xia, Gui-song and Bai, Xiang. CVPR, 2018 [paper] [code]
  6. Feature Enhancement Network: A Refined Scene Text Detector. Sheng, Zhang and Yuliang, Liu and Lianwen, Jin and Canjie, Luo. AAAI, 2017 [paper]
2.1.2 Different Prediction Units
Text instance level
  1. Detecting Curve Text in the Wild: New Dataset and New Solution. Yuliang, Liu and Lianwen, Jin and Shuaitao, Zhang and Sheng, Zhang. 2017 [paper] [code]
  2. TextBoxes: A Fast Text Detector with a Single Deep Neural Network. Liao, Minghui and Shi, Baoguang and Bai, Xiang and Wang, Xinggang and Liu, Wenyu. AAAI, 2017 [paper] [code]
  3. EAST: An Efficient and Accurate Scene Text Detector. Zhou, Xinyu and Yao, Cong and Wen, He and Wang, Yuzhi and Zhou, Shuchang and He, Weiran and Liang, Jiajun. CVPR, 2017 [paper] [code]
  4. R2CNN: rotational region CNN for orientation robust scene text detection. Jiang, Yingying and Zhu, Xiangyu and Wang, Xiaobing and Yang, Shuli and Li, Wei and Wang, Hua and Fu, Pei and Luo, Zhenbo. 2017 [paper]
  5. Arbitrary-Oriented Scene Text Detection via Rotation Proposals. Ma, Jianqi and Shao, Weiyuan and Ye, Hao and Wang, Li and Wang, Hong and Zheng, Yingbin and Xue, Xiangyang. T MULTIMEDIA, 2017 [paper] [code]
  6. Deep Matching Prior Network: Toward Tighter Multi-oriented Text Detection. Liu, Yuliang and Jin, Lianwen. CVPR, 2017 [paper]
  7. Deep Direct Regression for Multi-Oriented Scene Text Detection. He, Wenhao and Zhang, Xu-Yao and Yin, Fei and Liu, Cheng-Lin. ICCV, 2017 [paper]
  8. Fused Text Segmentation Networks for Multi-oriented Scene Text Detection. Dai, Yuchen and Huang, Zheng and Gao, Yuting and Chen, Kai. 2017 [paper]
  9. Feature Enhancement Network: A Refined Scene Text Detector. Sheng, Zhang and Yuliang, Liu and Lianwen, Jin and Canjie, Luo. AAAI, 2017 [paper]
  10. Rotation-Sensitive Regression for Oriented Scene Text Detection. Liao, Minghui and Zhu, Zhen and Shi, Baoguang and Xia, Gui-song and Bai, Xiang. CVPR, 2018 [paper] [code]
Bottom-up (Pixel)
  1. Scene text detection via holistic, multi-channel prediction. Yao, Cong and Bai, Xiang and Sang, Nong and Zhou, Xinyu and Zhou, Shuchang and Cao, Zhimin. 2016 [paper]
  2. Multi-oriented text detection with fully convolutional networks. Zhang, Zheng and Zhang, Chengquan and Shen, Wei and Yao, Cong and Liu, Wenyu and Bai, Xiang. CVPR, 2016 [paper] [code]
  3. Self-organized Text Detection with Minimal Post-processing via Border Learning. Wu, Yue and Natarajan, Prem. CVPR, 2017 [paper]
  4. Multi-scale FCN with Cascaded Instance Aware Segmentation for Arbitrary Oriented Word Spotting in the Wild. He, Dafang and Yang, Xiao and Liang, Chen and Zhou, Zihan and Ororbia, Alexander G and Kifer, Daniel and Giles, C Lee. CVPR, 2017 [paper]
  5. Single Shot Text Detector With Regional Attention. He, Pan and Huang, Weilin and He, Tong and Zhu, Qile and Qiao, Yu and Li, Xiaolin. ICCV, 2017 [paper] [code]
  6. PixelLink: Detecting Scene Text via Instance Segmentation. Dan, Deng and Haifeng, Liu and Xuelong, Li and Deng, Cai. AAAI, 2018 [paper] [code]
Bottom-up (Components)
  1. Detecting text in natural image with connectionist text proposal network. Tian, Zhi and Huang, Weilin and He, Tong and He, Pan and Qiao, Yu. ECCV, 2016 [paper] [code]
  2. Aggregating local context for accurate scene text detection. He, Dafang and Yang, Xiao and Huang, Wenyi and Zhou, Zihan and Kifer, Daniel and Giles, C Lee. ACCV, 2016 [paper]
  3. Detecting Oriented Text in Natural Images by Linking Segments. Shi, Baoguang and Bai, Xiang and Belongie, Serge. CVPR, 2017 [paper] [code]
  4. Scene Text Detection with Novel Superpixel Based Character Candidate Extraction. Wang, Cong and Yin, Fei and Liu, Cheng-Lin. 2017 [paper]
  5. Deep Residual Text Detection Network for Scene Text. Zhu, Xiangyu and Jiang, Yingying and Yang, Shuli and Wang, Xiaobing and Li, Wei and Fu, Pei and Wang, Hua and Luo, Zhenbo. ICDAR, 2017 [paper]
  6. Multi-Oriented Scene Text Detection via Corner Localization and Region Segmentation. Lyu, Pengyuan and Yao, Cong and Wu, Wenhao and Yan, Shuicheng and Bai, Xiang. CVPR, 2018 [paper]
2.1.3 Specific Targets
Long text
  1. Detecting Oriented Text in Natural Images by Linking Segments. Shi, Baoguang and Bai, Xiang and Belongie, Serge. CVPR, 2017 [paper] [code]
  2. R2CNN: rotational region CNN for orientation robust scene text detection. Jiang, Yingying and Zhu, Xiangyu and Wang, Xiaobing and Yang, Shuli and Li, Wei and Wang, Hua and Fu, Pei and Luo, Zhenbo. 2017 [paper]
  3. Multi-Oriented Scene Text Detection via Corner Localization and Region Segmentation. Lyu, Pengyuan and Yao, Cong and Wu, Wenhao and Yan, Shuicheng and Bai, Xiang. CVPR, 2018 [paper]
Multi-oriented text
  1. R2CNN: rotational region CNN for orientation robust scene text detection. Jiang, Yingying and Zhu, Xiangyu and Wang, Xiaobing and Yang, Shuli and Li, Wei and Wang, Hua and Fu, Pei and Luo, Zhenbo. 2017 [paper]
  2. TextBoxes: A Fast Text Detector with a Single Deep Neural Network. Liao, Minghui and Shi, Baoguang and Bai, Xiang and Wang, Xinggang and Liu, Wenyu. AAAI, 2017 [paper] [code]
  3. Deep Matching Prior Network: Toward Tighter Multi-oriented Text Detection. Liu, Yuliang and Jin, Lianwen. CVPR, 2017 [paper]
  4. Arbitrary-Oriented Scene Text Detection via Rotation Proposals. Ma, Jianqi and Shao, Weiyuan and Ye, Hao and Wang, Li and Wang, Hong and Zheng, Yingbin and Xue, Xiangyang. T MULTIMEDIA, 2017 [paper] [code]
  5. Detecting Oriented Text in Natural Images by Linking Segments. Shi, Baoguang and Bai, Xiang and Belongie, Serge. CVPR, 2017 [paper] [code]
  6. EAST: An Efficient and Accurate Scene Text Detector. Zhou, Xinyu and Yao, Cong and Wen, He and Wang, Yuzhi and Zhou, Shuchang and He, Weiran and Liang, Jiajun. CVPR, 2017 [paper] [code]
  7. Rotation-Sensitive Regression for Oriented Scene Text Detection. Liao, Minghui and Zhu, Zhen and Shi, Baoguang and Xia, Gui-song and Bai, Xiang. CVPR, 2018 [paper] [code]
  8. Geometry-Aware Scene Text Detection With Instance Transformation Network. Wang, Fangfang and Zhao, Liming and Li, Xi and Wang, Xinchao and Tao, Dacheng. CVPR, 2018 [paper] [code]
Irregular text
  1. Detecting Curve Text in the Wild: New Dataset and New Solution. Yuliang, Liu and Lianwen, Jin and Shuaitao, Zhang and Sheng, Zhang. 2017 [paper] [code]
  2. Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes. Lyu, Pengyuan and Liao, Minghui and Yao, Cong and Wu, Wenhao and Bai, Xiang. ECCV, 2018 [paper]
  3. TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes. Long, Shangbang and Ruan, Jiaqiang and Zhang, Wenjie and He, Xin and Wu, Wenhao and Yao, Cong. ECCV, 2018 [paper]
  4. Scene Text Detection with Supervised Pyramid Context Network. Enze Xie, Yuhang Zang, Shuai Shao, Gang Yu, Cong Yao, Guangyao Li. AAAI, 2019 [paper]
  5. Learning Shape-Aware Embedding for Scene Text Detection. Zhuotao Tian, Michelle Shu, Pengyuan Lyu, Ruiyu Li, Chao Zhou, Xiaoyong Shen, Jiaya Jia. CVPR, 2019 [paper]
  6. Arbitrary Shape Scene Text Detection With Adaptive Text Region Representation. Xiaobing Wang, Yingying Jiang, Zhenbo Luo, Cheng-Lin Liu, Hyunsoo Choi, Sungjin Kim. CVPR, 2019 [paper]
  7. Towards Robust Curve Text Detection With Conditional Spatial Expansion. Zichuan Liu, Guosheng Lin, Sheng Yang, Fayao Liu, Weisi Lin, Wang Ling Goh. CVPR, 2019 [paper]
  8. Shape Robust Text Detection With Progressive Scale Expansion Network. Xiang Li, Wenhai Wang, Wenbo Hou, Ruo-Ze Liu, Tong Lu, Jian Yang. CVPR, 2019 [paper]
  9. Character Region Awareness for Text Detection. Youngmin Baek, Bado Lee, Dongyoon Han, Sangdoo Yun, Hwalsuk Lee. CVPR, 2019 [paper]
  10. Look More Than Once: An Accurate Detector for Text of Arbitrary Shapes. Chengquan Zhang, Borong Liang, Zuming Huang, Mengyi En, Junyu Han, Errui Ding, Xinghao Ding. CVPR, 2019 [paper]
  11. Efficient and Accurate Arbitrary-Shaped Text Detection With Pixel Aggregation Network. Wang, Wenhai and Xie, Enze and Song, Xiaoge and Zang, Yuhang and Wang, Wenjia and Lu, Tong and Yu, Gang and Shen, Chunhua. ICCV, 2019 [paper]
Speed up
  1. EAST: An Efficient and Accurate Scene Text Detector. Zhou, Xinyu and Yao, Cong and Wen, He and Wang, Yuzhi and Zhou, Shuchang and He, Weiran and Liang, Jiajun. CVPR, 2017 [paper] [code]
Easy instance segmentation
  1. Multi-scale FCN with Cascaded Instance Aware Segmentation for Arbitrary Oriented Word Spotting in the Wild. He, Dafang and Yang, Xiao and Liang, Chen and Zhou, Zihan and Ororbia, Alexander G and Kifer, Daniel and Giles, C Lee. CVPR, 2017 [paper]
  2. Self-organized Text Detection with Minimal Post-processing via Border Learning. Wu, Yue and Natarajan, Prem. CVPR, 2017 [paper]
  3. WordFence: Text Detection in Natural Images with Border Awareness. Polzounov, Andrei and Ablavatski, Artsiom and Escalera, Sergio and Lu, Shijian and Cai, Jianfei. ICIP, 2017 [paper]
  4. PixelLink: Detecting Scene Text via Instance Segmentation. Dan, Deng and Haifeng, Liu and Xuelong, Li and Deng, Cai. AAAI, 2018 [paper] [code]
Retrieving designated text
  1. Unambiguous text localization and retrieval for cluttered scenes. Rong, Xuejian and Yi, Chucai and Tian, Yingli. CVPR, 2017 [paper]
Against complex background
  1. Single Shot Text Detector With Regional Attention. He, Pan and Huang, Weilin and He, Tong and Zhu, Qile and Qiao, Yu and Li, Xiaolin. ICCV, 2017 [paper] [code]

2.2 Recognition

2.2.1 CTC based methods
  1. Unconstrained on-line handwriting recognition with recurrent neural networks. Graves, Alex and Liwicki, Marcus and Bunke, Horst and Schmidhuber, Jurgen and Fernandez, Santiago. NIPS, 2008 [paper]
  2. Accurate scene text recognition based on recurrent neural network. Su, Bolan and Lu, Shijian. ACCV, 2014 [paper]
  3. STAR-Net: A SpaTial Attention Residue Network for Scene Text Recognition. Liu, Wei and Chen, Chaofeng and Wong, Kwan-Yee K and Su, Zhizhong and Han, Junyu. BMVC, 2016 [paper]
  4. An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition. Shi, Baoguang and Bai, Xiang and Yao, Cong. TPAMI, 2017 [paper] [code]
  5. Reading Scene Text with Attention Convolutional Sequence Modeling. Gao, Yunze and Chen, Yingying and Wang, Jinqiao and Lu, Hanqing. 2017 [paper]
  6. Scene Text Recognition with Sliding Convolutional Character Models. Yin, Fei and Wu, Yi-Chao and Zhang, Xu-Yao and Liu, Cheng-Lin. 2017 [paper]
2.2.2 Attention based methods
  1. Robust scene text recognition with automatic rectification. Shi, Baoguang and Wang, Xinggang and Lyu, Pengyuan and Yao, Cong and Bai, Xiang. CVPR, 2016 [paper]
  2. Recursive recurrent nets with attention modeling for ocr in the wild. Lee, Chen-Yu and Osindero, Simon. CVPR, 2016 [paper]
  3. Visual attention models for scene text recognition. Ghosh, Suman K and Valveny, Ernest and Bagdanov, Andrew D. ICDAR, 2017 [paper]
  4. Focusing Attention: Towards Accurate Text Recognition in Natural Images. Cheng, Zhanzhan and Bai, Fan and Xu, Yunlu and Zheng, Gang and Pu, Shiliang and Zhou, Shuigeng. ICCV, 2017 [paper]
  5. Learning to Read Irregular Text with Attention Mechanisms. Yang, Xiao and He, Dafang and Zhou, Zihan and Kifer, Daniel and Giles, C Lee. IJCAI, 2017 [paper]
  6. Arbitrarily-Oriented Text Recognition. Cheng, Zhanzhan and Liu, Xuyang and Bai, Fan and Niu, Yi and Pu, Shiliang and Zhou, Shuigeng. CVPR, 2017 [paper]
  7. Edit Probability for Scene Text Recognition. Bai, Fan and Cheng, Zhanzhan and Niu, Yi and Pu, Shiliang and Zhou, Shuigeng. CVPR, 2018 [paper]
  8. SqueezedText: A Real-time Scene Text Recognition by Binary Convolutional Encoder-decoder Network. Liu, Zichuan and Li, Yixing and Ren, Fengbo and Yu, Hao and Goh, Wangling. AAAI, 2018 [paper]
  9. Show, attend and read: a simple and strong baseline for recognising irregular text. Hui Li, Peng Wang, Chunhua Shen, Guyu Zhang. AAAI, 2019 [paper]
  10. Scene Text Recognition from Two-Dimensional Perspective. Minghui Liao, Jian Zhang, Zhaoyi Wan, Fengming Xie, Jiajun Liang, Pengyuan Lyu, Cong Yao, Xiang Bai. AAAI, 2019 [paper]
  11. ESIR: End-To-End Scene Text Recognition via Iterative Image Rectification. Fangneng Zhan, Shijian Lu. CVPR, 2019 [paper]
  12. What Is Wrong With Scene Text Recognition Model Comparisons? Dataset and Model Analysis. Baek, Jeonghun and Kim, Geewook and Lee, Junyeop and Park, Sungrae and Han, Dongyoon and Yun, Sangdoo and Oh, Seong Joon and Lee, Hwalsuk. ICCV, 2019 [paper]
  13. Symmetry-Constrained Rectification Network for Scene Text Recognition. Yang, Mingkun and Guan, Yushuo and Liao, Minghui and He, Xin and Bian, Kaigui and Bai, Song and Yao, Cong and Bai, Xiang. ICCV, 2019 [paper]

2.3 End-to-End Text Spotting

2.3.1 Separately Trained Two-Stage Methods
  1. Reading text in the wild with convolutional neural networks. Jaderberg, Max and Simonyan, Karen and Vedaldi, Andrea and Zisserman, Andrew. IJCV, 2016 [paper]
  2. Synthetic data for text localisation in natural images. Gupta, Ankush and Vedaldi, Andrea and Zisserman, Andrew. CVPR, 2016 [paper] [code]
  3. TextBoxes: A Fast Text Detector with a Single Deep Neural Network. Liao, Minghui and Shi, Baoguang and Bai, Xiang and Wang, Xinggang and Liu, Wenyu. AAAI, 2017 [paper] [code]
2.3.2 Jointly Trained Two-Stage Methods
  1. SEE: Towards Semi-Supervised End-to-End Scene Text Recognition. Bartz, Christian and Yang, Haojin and Meinel, Christoph. 2017 [paper] [code]
  2. Deep TextSpotter: An End-To-End Trainable Scene Text Localization and Recognition Framework. Busta, Michal and Neumann, Lukas and Matas, Jiri. ICCV, 2017 [paper] [code]
  3. Towards End-To-End Text Spotting With Convolutional Recurrent Neural Networks. Li, Hui and Wang, Peng and Shen, Chunhua. ICCV, 2017 [paper]
  4. An End-to-End TextSpotter With Explicit Alignment and Attention. He, Tong and Tian, Zhi and Huang, Weilin and Shen, Chunhua and Qiao, Yu and Sun, Changming. CVPR, 2018 [paper]
  5. FOTS: Fast Oriented Text Spotting with a Unified Network. Liu, Xuebo and Liang, Ding and Yan, Shi and Chen, Dagui and Qiao, Yu and Yan, Junjie. CVPR, 2018 [paper]
  6. Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes. Lyu, Pengyuan and Liao, Minghui and Yao, Cong and Wu, Wenhao and Bai, Xiang. ECCV, 2018 [paper]
  7. Towards Unconstrained End-to-End Text Spotting. Qin, Siyang and Bissacco, Alessandro and Raptis, Michalis and Fujii, Yasuhisa and Xiao, Ying. ICCV, 2019 [paper]
  8. TextDragon: An End-to-End Framework for Arbitrary Shaped Text Spotting. Feng, Wei and He, Wenhao and Yin, Fei and Zhang, Xu-Yao and Liu, Cheng-Lin. ICCV, 2019 [paper]
  9. Convolutional character networks. Xing, Linjie and Tian, Zhi and Huang, Weilin and Scott, Matthew R. ICCV, 2019 [paper]

2.4 Auxiliary Techniques

2.4.1 Synthetic Data
  1. Synthetic data and artificial neural networks for natural scene text recognition. Jaderberg, Max and Simonyan, Karen and Vedaldi, Andrea and Zisserman, Andrew. NIPS, 2014 [paper]
  2. Synthetic data for text localisation in natural images. Gupta, Ankush and Vedaldi, Andrea and Zisserman, Andrew. CVPR, 2016 [paper] [code]
  3. Verisimilar Image Synthesis for Accurate Detection and Recognition of Texts in Scenes. Zhan, Fangneng and Lu, Shijian and Xue, Chuhui. ECCV, 2018 [paper] [code]
  4. UnrealText: Synthesizing Realistic Scene Text Images from the Unreal World. Long, Shangbang and Yao, Cong. CVPR, 2020 [paper] [code]
2.4.2 Weak/Semi-Supervision
  1. Wetext: Scene text detection under weak supervision. Tian, Shangxuan and Lu, Shijian and Li, Chongshou. ICCV, 2017 [paper]
  2. Weakly supervised text attention network for generating text proposals in scene images. Rong, Li and MengYi, En and JianQiang, Li and HaiBin, Zhang. ICDAR, 2017 [paper]
  3. Wordsup: Exploiting word annotations for character based text detection. Hu, Han and Zhang, Chengquan and Luo, Yuxuan and Wang, Yuzhuo and Han, Junyu and Ding, Errui. ICCV, 2017 [paper]
  4. Chinese Street View Text: Large-Scale Chinese Text Reading With Partially Supervised Learning. Sun, Yipeng and Liu, Jiaming and Liu, Wei and Han, Junyu and Ding, Errui and Liu, Jingtuo. ICCV, 2019 [paper]
2.4.3 Deblurring
  1. Convolutional neural networks for direct text deblurring. Hradis, Michal and Kotera, Jan and Zemcik, Pavel and Sroubek, Filip. BMVC, 2015 [paper] [code]
  2. A blind deconvolution model for scene text detection and recognition in video. Khare, Vijeta and Shivakumara, Palaiahnakote and Raveendran, Paramesran and Blumenstein, Michael. PR, 2016 [paper]
2.4.4 Context Information
  1. Could scene context be beneficial for scene text detection? Zhu, Anna and Gao, Renwu and Uchida, Seiichi. PR, 2016 [paper]
2.4.5 Adversarial Attack
  1. Adaptive Adversarial Attack on Scene Text Recognition. Yuan, Xiaoyong and He, Pan and Li, Xiaolin Andy. 2018 [paper]
2.4.6 Evaluation
  1. Tightness-Aware Evaluation Protocol for Scene Text Detection. Yuliang Liu, Lianwen Jin, Zecheng Xie, Canjie Luo, Shuaitao Zhang, Lele Xie. CVPR, 2019 [paper]

III. Datasets

| Dataset (Year) | Image Num (train/test) | Text Num (train/test) | Orientation | Language | Characteristics | Detec/Recog Task |
|---|---|---|---|---|---|---|
| End2End | ==== | ==== | ==== | ==== | ==== | ==== |
| ICDAR03 (2003) | 509 (258/251) | 2276 (1110/1156) | Horizontal | En | - | ✓/✓ |
| ICDAR13 Scene Text (2013) | 462 (229/233) | - (848/1095) | Horizontal | En | - | ✓/✓ |
| ICDAR15 Incidental Text (2015) | 1500 (1000/500) | - (-/-) | Multi-Oriented | En | Blur, Small, Defocused | ✓/✓ |
| ICDAR17 / RCTW (2017) | 12263 (8034/4229) | - (-/-) | Multi-Oriented | Cn | - | ✓/✓ |
| Total-Text (2017) | 1555 (1255/300) | - (-/-) | Multi-Oriented, Curved | En, Cn | Irregular polygon label | ✓/✓ |
| SVT (2010) | 350 (100/250) | 904 (257/647) | Horizontal | En | - | ✓/✓ |
| KAIST (2010) | 3000 (-/-) | 5000 (-/-) | Horizontal | En, Ko | Distorted | ✓/✓ |
| NEOCR (2011) | 659 (-/-) | 5238 (-/-) | Multi-Oriented | 8 langs | - | ✓/✓ |
| CUTE (2014) | 80 (-/80) | - (-/-) | Curved | En | - | ✓/✓ |
| CTW (2017) | 32K (25K/6K) | 1M (812K/205K) | Multi-Oriented | Cn | Fine-grained annotation | ✓/✓ |
| CASIA-10K (2018) | 10K (7K/3K) | - (-/-) | Multi-Oriented | Cn | | ✓/✓ |
| Detection Only | ==== | ==== | ==== | ==== | ==== | ==== |
| OSTD (2011) | 89 (-/-) | 218 (-/-) | Multi-Oriented | En | - | ✓/- |
| MSRA-TD500 (2012) | 500 (300/200) | 1719 (1068/651) | Multi-Oriented | En, Cn | Long text | ✓/- |
| HUST-TR400 (2014) | 400 (400/-) | - (-/-) | Multi-Oriented | En, Cn | Long text | ✓/- |
| ICDAR17 / RRC-MLT (2017) | 18000 (9000/9000) | - (-/-) | Multi-Oriented | 9 langs | - | ✓/- |
| CTW1500 (2017) | 1500 (1000/500) | - (-/-) | Multi-Oriented, Curved | En | Bounding box with 14 vertexes | ✓/- |
| Recognition Only | ==== | ==== | ==== | ==== | ==== | ==== |
| Char74k (2009) | 74107 (-/-) | 74107 (-/-) | Horizontal | En, Kannada | Character label | -/✓ |
| IIIT 5K-Word (2012) | 5000 (-/-) | 5000 (2000/3000) | Horizontal | - | cropped | -/✓ |
| SVHN (2010) | - (-/-) | 600000 (-/-) | Horizontal | - | House number digits | -/✓ |
| SVTP (2013) | 639 (-/639) | - (-/-) | | En | Distorted | -/✓ |

Owner
Shangbang Long