TextBoxes: A Fast Text Detector with a Single Deep Neural Network https://github.com/MhLiao/TextBoxes 基于SSD改进的文本检测算法,textBoxes_note记录了之前整理的笔记。

Overview

TextBoxes: A Fast Text Detector with a Single Deep Neural Network

Introduction

This paper presents an end-to-end trainable fast scene text detector, named TextBoxes, which detects scene text with both high accuracy and efficiency in a single network forward pass, involving no post-process except for a standard nonmaximum suppression. For more details, please refer to our paper.

Citing TextBoxes

Please cite TextBoxes in your publications if it helps your research:

@inproceedings{LiaoSBWL17,
  author    = {Minghui Liao and
               Baoguang Shi and
               Xiang Bai and
               Xinggang Wang and
               Wenyu Liu},
  title     = {TextBoxes: {A} Fast Text Detector with a Single Deep Neural Network},
  booktitle = {AAAI},
  year      = {2017}
}

Contents

  1. Installation
  2. Download
  3. Test
  4. Train
  5. Performance

Installation

  1. Get the code. We will call the directory that you cloned Caffe into $CAFFE_ROOT
git clone https://github.com/MhLiao/TextBoxes.git

cd TextBoxes

make -j8

make py

Download

  1. Models trained on ICDAR 2013: Dropbox link BaiduYun link
  2. Fully convolutional reduced (atrous) VGGNet: Dropbox link BaiduYun link
  3. Compiled mex file for evaluation(for multi-scale test evaluation: evaluation_nms.m): Dropbox link BaiduYun link

Test

  1. Download the ICDAR 2013 DataSet
  2. Download the Models trained on ICDAR 2013
  3. Modify the related paths in the "examples/TextBoxes/test_icdar13.py"
  4. run "python examples/test_icdar13.py"
  5. To multi-scale test, you should use "test_icdar13_multi_scale.py" and "evaluation_nms.m"

Train

  1. Train about 50k iterions on Synthetic data which refered in the paper.
  2. Train about 2k iterions on corresponding training data such as ICDAR 2013 and SVT.
  3. For more information, such as learning rate setting, please refer to the paper.

Performance

  1. Using the given test code, you can achieve an F-measure of about 80% on ICDAR 2013 with a single scale.
  2. Using the given multi-scale test code, you can achieve an F-measure of about 85% on ICDAR 2013 with a non-maximum suppression.
  3. More performance information, please refer to the paper and Task1 and Task4 of Challenge2 on the ICDAR 2015 website: http://rrc.cvc.uab.es/?ch=2&com=evaluation

Please let me know if you encounter any issues.

Owner
zhangjing1
zhangjing1
Super Mario Game With Python

Super_Mario Hello all this is a simple python program which tries to use our body as a controller for the super mario game Here I have used media pipe

Adarsh Badagala 219 Nov 25, 2022
Code related to "Have Your Text and Use It Too! End-to-End Neural Data-to-Text Generation with Semantic Fidelity" paper

DataTuner You have just found the DataTuner. This repository provides tools for fine-tuning language models for a task. See LICENSE.txt for license de

81 Jan 01, 2023
Detect textlines in document images

Textline Detection Detect textlines in document images Introduction This tool performs border, region and textline detection from document image data

QURATOR-SPK 70 Jun 30, 2022
Can We Find Neurons that Cause Unrealistic Images in Deep Generative Networks?

Can We Find Neurons that Cause Unrealistic Images in Deep Generative Networks? Artifact Detection/Correction - Offcial PyTorch Implementation This rep

CHOI HWAN IL 23 Dec 20, 2022
Neural search engine for AI papers

Papers search Neural search engine for ML papers. Demo Usage is simple: input an abstract, get the matching papers. The following demo also showcases

Giancarlo Fissore 44 Dec 24, 2022
A simple component to display annotated text in Streamlit apps.

Annotated Text Component for Streamlit A simple component to display annotated text in Streamlit apps. For example: Installation First install Streaml

Thiago Teixeira 312 Dec 30, 2022
Links to awesome OCR projects

Awesome OCR This list contains links to great software tools and libraries and literature related to Optical Character Recognition (OCR). Contribution

Konstantin Baierer 2.2k Jan 02, 2023
Script para controlar o movimento do mouse usando Python e openCV com câmera em tempo real que detecta pontos de referência da mão, rastreia padrões de gestos em vez de um mouse físico.

mouserController Script para controlar o movimento do mouse usando Python e openCV com câmera em tempo real que detecta pontos de referência da mão, r

Vinícius Azevedo 6 Jun 28, 2022
Automatically resolve RidderMaster based on TensorFlow & OpenCV

AutoRiddleMaster Automatically resolve RidderMaster based on TensorFlow & OpenCV 基于 TensorFlow 和 OpenCV 实现的全自动化解御迷士小马谜题 Demo How to use Deploy the ser

神龙章轩 5 Nov 19, 2021
When Age-Invariant Face Recognition Meets Face Age Synthesis: A Multi-Task Learning Framework (CVPR 2021 oral)

MTLFace This repository contains the PyTorch implementation and the dataset of the paper: When Age-Invariant Face Recognition Meets Face Age Synthesis

Hzzone 120 Jan 05, 2023
Motion Detection Squid Game with OpenCV Python

*Motion Detection Squid Game with OpenCV Python i am newbie in python. In this project I made a simple game to follow the trend about the red light gr

Nayan 17 Nov 22, 2022
Code release for Hu et al., Learning to Segment Every Thing. in CVPR, 2018.

Learning to Segment Every Thing This repository contains the code for the following paper: R. Hu, P. Dollár, K. He, T. Darrell, R. Girshick, Learning

Ronghang Hu 417 Oct 03, 2022
Amazing 3D explosion animation using Pygame module.

3D Explosion Animation 💣 💥 🔥 Amazing explosion animation with Pygame. 💣 Explosion physics An Explosion instance is made of a set of Particle objec

Dylan Tintenfich 12 Mar 11, 2022
Code for the head detector (HeadHunter) proposed in our CVPR 2021 paper Tracking Pedestrian Heads in Dense Crowd.

Head Detector Code for the head detector (HeadHunter) proposed in our CVPR 2021 paper Tracking Pedestrian Heads in Dense Crowd. The head_detection mod

Ramana Subramanyam 76 Dec 06, 2022
Ocular is a state-of-the-art historical OCR system.

Ocular Ocular is a state-of-the-art historical OCR system. Its primary features are: Unsupervised learning of unknown fonts: requires only document im

228 Dec 30, 2022
Perspective recovery of text using transformed ellipses

unproject_text Perspective recovery of text using transformed ellipses. See full writeup at https://mzucker.github.io/2016/10/11/unprojecting-text-wit

Matt Zucker 111 Nov 13, 2022
利用Paddle框架复现CRAFT

CRAFT-Paddle 利用Paddle框架复现CRAFT CRAFT 本项目基于paddlepaddle框架复现CRAFT,并参加百度第三届论文复现赛,将在2021年5月15日比赛完后提供AIStudio链接~敬请期待 参考项目: CRAFT: Character-Region Awarenes

QuanHao Guo 2 Mar 07, 2022
Convert scans of handwritten notes to beautiful, compact PDFs

Convert scans of handwritten notes to beautiful, compact PDFs

Matt Zucker 4.8k Jan 01, 2023
scantailor - Scan Tailor is an interactive post-processing tool for scanned pages.

Scan Tailor - scantailor.org This project is no longer maintained, and has not been maintained for a while. About Scan Tailor is an interactive post-p

1.5k Dec 28, 2022
Code for the ACL2021 paper "Combining Static Word Embedding and Contextual Representations for Bilingual Lexicon Induction"

CSCBLI Code for our ACL Findings 2021 paper, "Combining Static Word Embedding and Contextual Representations for Bilingual Lexicon Induction". Require

Jinpeng Zhang 12 Oct 08, 2022