This is a pytorch re-implementation of EAST: An Efficient and Accurate Scene Text Detector.

Overview

EAST: An Efficient and Accurate Scene Text Detector

Description:

This version will be updated soon, please pay attention to this work. The motivation of this version is to build a easy-training model. This version can automatically update best_model by comparing current hmean and the former. At the same time, we can see evaluation info about every sample easily.

  • 1.train
  • 2.predict
  • 3.compress
  • 4.compute Hmean(if Hmean is higher than before, update best_weight.pkl)
  • 5.visualization(blue, green, red)
  • 6.multi-scale test (update soon) multi-scale vis. (vis with score, scales)

Thanks

The version is ported from argman/EAST, from Tensorflow to Pytorch

Check On Website

If you have no confidence of the result of our program, you could use submit.zip to submit on website,then you can see result of every image.

Performance

  • right -- green || wrong -- red || miss -- blue visualization visualization

  • recall/precision/hmean for every test image hmean

Introduction

This is a pytorch re-implementation of EAST: An Efficient and Accurate Scene Text Detector. The features are summarized blow:

  • Only RBOX part is implemented.
  • A fast Locality-Aware NMS in C++ provided by the paper's author.(g++/gcc version 6.0 + will be ok)
  • Evalution see here for the detailed results.
  • Differences from original paper
    • Use ResNet-50 rather than PVANET
    • Use dice loss (optimize IoU of segmentation) rather than balanced cross entropy
    • Use linear learning rate decay rather than staged learning rate decay

Thanks for the author's (@zxytim) help! Please cite his paper if you find this useful.

Contents

  1. Installation
  2. Download
  3. Prepare dataset/pretrain
  4. Test
  5. Train
  6. Examples

Installation

  1. Any version of pytorch version > 0.4.0 should be ok.

Download

  1. Pretrained model is not provided temporarily. Web site is updating now, please continue to pay attention

Prepare dataset/pretrain weight

[1]. dataset(you need to prepare for dataset for train and test) suggestions: you could do a soft-link to root_to_this_program/dataset/train/img/*.jpg

  • -- train ./dataset/train/img/img_###.jpg ./dataset/train/gt/img_###.txt (you need to change name)
  • -- test ./data/test/img_###.jpg (img only)
  • -- gt.zip ./result/gt.zip(ICDAR15 gt.zip is avaliable on website

** Note: you can download dataset here

[2]. pretrained

  • In config.py set resume True and set checkpoint path/to/weight/file
  • I will provide pretrianed weight soon

[3]. check GPUs and CPUs you can use following to check aviliable gpu, this is for train

watch -n 0.1 nvidia-smi

then, you will see 2,3 is avaliable, modify config.py gpu_ids = [0,1], gpu = 2, and modify run.sh - CUDA_VISIBLE_DEVICES=2,3

Train

If you want to train the model, you should provide the dataset path in config.py and run

sh run.py

** Note: you should modify run.sh to specify your gpu id

If you have more than one gpu, you can pass gpu ids to gpu_list(like gpu_list=0,1,2,3) in config.py

** Note: you should change the gt text file of icdar2015's filename to img_*.txt instead of gt_img_*.txt(or you can change the code in icdar.py), and some extra characters should be removed from the file. See the examples in training_samples/**

Test

By default, we set train-eval process into integer. If you want to use eval independently, you can do it by yourself. Any question can contact me.

Examples

Here are some test examples on icdar2015, enjoy the beautiful text boxes! image_1 image_2 image_3 image_4 image_5

Owner
Dejia Song
Computer Vision & Machine Learning
Dejia Song
Code for CVPR2021 paper "Learning Salient Boundary Feature for Anchor-free Temporal Action Localization"

AFSD: Learning Salient Boundary Feature for Anchor-free Temporal Action Localization This is an official implementation in PyTorch of AFSD. Our paper

Tencent YouTu Research 146 Dec 24, 2022
Code release for Hu et al., Learning to Segment Every Thing. in CVPR, 2018.

Learning to Segment Every Thing This repository contains the code for the following paper: R. Hu, P. Dollár, K. He, T. Darrell, R. Girshick, Learning

Ronghang Hu 417 Oct 03, 2022
Scan the MRZ code of a passport and extract the firstname, lastname, passport number, nationality, date of birth, expiration date and personal numer.

PassportScanner Works with 2 and 3 line identity documents. What is this With PassportScanner you can use your camera to scan the MRZ code of a passpo

Edwin Vermeer 441 Dec 24, 2022
Deep LearningImage Captcha 2

滑动验证码深度学习识别 本项目使用深度学习 YOLOV3 模型来识别滑动验证码缺口,基于 https://github.com/eriklindernoren/PyTorch-YOLOv3 修改。 只需要几百张缺口标注图片即可训练出精度高的识别模型,识别效果样例: 克隆项目 运行命令: git cl

Python3WebSpider 117 Dec 28, 2022
2 telegram-bots: for image recognition and for text generation

💻 📱 Telegram_Bots 🔎 & 📖 2 telegram-bots: for image recognition and for text generation. About Image recognition bot: User sends a photo and bot de

Marina Polukoshko 1 Jan 27, 2022
📷 This repository is focused on having various feature implementation of OpenCV in Python.

📷 This repository is focused on having various feature implementation of OpenCV in Python. The aim is to have a minimal implementation of all OpenCV features together, under one roof.

Aditya Kumar Gupta 128 Dec 04, 2022
Code for AAAI 2021 paper: Sequential End-to-end Network for Efficient Person Search

This repository hosts the source code of our paper: [AAAI 2021]Sequential End-to-end Network for Efficient Person Search. SeqNet achieves the state-of

Zj Li 218 Dec 31, 2022
([email protected]) Boosting Co-teaching with Compression Regularization for Label Noise

Nested-Co-teaching ([email protected]) Pytorch implementation of paper "Boosting Co-tea

YINGYI CHEN 41 Jan 03, 2023
Lightning Fast Language Prediction 🚀

whatthelang Lightning Fast Language Prediction 🚀 Dependencies The dependencies can be installed using the requirements.txt file: $ pip install -r req

Indix 152 Oct 16, 2022
Self-supervised Equivariant Attention Mechanism for Weakly Supervised Semantic Segmentation, CVPR 2020 (Oral)

SEAM The implementation of Self-supervised Equivariant Attention Mechanism for Weakly Supervised Semantic Segmentaion. You can also download the repos

Hibercraft 459 Dec 26, 2022
Scale-aware Automatic Augmentation for Object Detection (CVPR 2021)

SA-AutoAug Scale-aware Automatic Augmentation for Object Detection Yukang Chen, Yanwei Li, Tao Kong, Lu Qi, Ruihang Chu, Lei Li, Jiaya Jia [Paper] [Bi

Jia Research Lab 182 Dec 29, 2022
ScanTailor Advanced is the version that merges the features of the ScanTailor Featured and ScanTailor Enhanced versions, brings new ones and fixes.

ScanTailor Advanced The ScanTailor version that merges the features of the ScanTailor Featured and ScanTailor Enhanced versions, brings new ones and f

952 Dec 31, 2022
Smart computer vision application

Smart-computer-vision-application Backend : opencv and python Library required:

2 Jan 31, 2022
A simple OCR API server, seriously easy to be deployed by Docker, on Heroku as well

ocrserver Simple OCR server, as a small working sample for gosseract. Try now here https://ocr-example.herokuapp.com/, and deploy your own now. Deploy

Hiromu OCHIAI 541 Dec 28, 2022
Contextual speed detection for python

Speed Prediction using Optical Flow and 2D CNN About the challenge: Comma.AI Speed Challenge This challenge was developed by Comma.AI to predict the s

Mahimana Bhatt 2 Dec 16, 2021
An interactive document scanner built in Python using OpenCV

The scanner takes a poorly scanned image, finds the corners of the document, applies the perspective transformation to get a top-down view of the document, sharpens the image, and applies an adaptive

Kushal Shingote 1 Feb 12, 2022
The project is an official implementation of our paper "3D Human Pose Estimation with Spatial and Temporal Transformers".

3D Human Pose Estimation with Spatial and Temporal Transformers This repo is the official implementation for 3D Human Pose Estimation with Spatial and

Ce Zheng 363 Dec 28, 2022
Train custom VR face tracking parameters

Pal Buddy Guy: The anipal's best friend This is a small script to improve upon the tracking capabilities of the Vive Pro Eye and facial tracker. You c

7 Dec 12, 2021
Character Segmentation using TensorFlow

Character Segmentation Segment characters and spaces in one text line,from this paper Chinese English mixed Character Segmentation as Semantic Segment

26 Aug 25, 2022
The Open Source Framework for Machine Vision

SimpleCV Quick Links: About Installation [Docker] (#docker) Ubuntu Virtual Environment Arch Linux Fedora MacOS Windows Raspberry Pi SimpleCV Shell Vid

Sight Machine 2.6k Dec 31, 2022