Packaged, PyTorch-based, easy-to-use, cross-platform version of the CRAFT text detector

Last update: Dec 28, 2022

Overview

CRAFT: Character-Region Awareness For Text detection

PyTorch implementation of the CRAFT text detector, which effectively detects text areas by exploring each character region and the affinity between characters. Text bounding boxes are obtained by simply finding minimum bounding rectangles on the binary map after thresholding the character region and affinity scores.
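
To make the thresholding-and-boxing step concrete, here is a minimal pure-Python sketch. It is not the library's implementation: the function name boxes_from_score_maps, the parameter names and the threshold values are hypothetical, the score maps are plain nested lists, and it returns axis-aligned boxes instead of rotated minimum-area rectangles.

```python
from collections import deque


def boxes_from_score_maps(region, affinity, region_threshold=0.4, affinity_threshold=0.4):
    """Return axis-aligned boxes (x_min, y_min, x_max, y_max), one per connected text blob.

    Hypothetical helper for illustration only; names and defaults are assumptions.
    """
    height, width = len(region), len(region[0])
    # Binary map: a pixel counts as text if either score clears its threshold.
    binary = [
        [region[y][x] >= region_threshold or affinity[y][x] >= affinity_threshold
         for x in range(width)]
        for y in range(height)
    ]
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if not binary[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill over 4-connected neighbours.
            queue = deque([(x, y)])
            seen[y][x] = True
            x_min = x_max = x
            y_min = y_max = y
            while queue:
                cx, cy = queue.popleft()
                x_min, x_max = min(x_min, cx), max(x_max, cx)
                y_min, y_max = min(y_min, cy), max(y_max, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    inside = 0 <= nx < width and 0 <= ny < height
                    if inside and binary[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((x_min, y_min, x_max, y_max))
    return boxes


# Two characters at x=1 and x=3 on row 1, linked by a high affinity score at x=2,
# plus an isolated character at (6, 2).
region = [
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.9, 0.0, 0.9, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.8],
]
affinity = [
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.8, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
]
print(boxes_from_score_maps(region, affinity))  # [(1, 1, 3, 1), (6, 2, 6, 2)]
```

With the affinity map zeroed out, the two linked characters would come back as separate boxes, which illustrates why the affinity score is thresholded together with the region score.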

Getting started

Installation

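Install using conda for Linux, Mac and Windows (preferred; note that the 0.4.0 release notes list conda support as dropped, so this applies to earlier releases):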

conda install -c fcakyon craft-text-detector

Install using pip for Linux and Mac:

pip install craft-text-detector

Basic Usage

# import Craft class
from craft_text_detector import Craft

# set image path and export folder directory
image_path = 'figures/idcard.png'
output_dir = 'outputs/'

# create a craft instance
craft = Craft(output_dir=output_dir, crop_type="poly", cuda=False)

# apply craft text detection and export detected regions to output directory
prediction_result = craft.detect_text(image_path)

# unload models from ram/gpu
craft.unload_craftnet_model()
craft.unload_refinenet_model()
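
The Advanced Usage section reads the detected regions from prediction_result["boxes"]. As a rough sketch of post-processing them, the helpers below turn each region into an axis-aligned rectangle and sort the rectangles in reading order. The helper names are hypothetical, and the code assumes each box is a sequence of four (x, y) corner points, which may not match the library's exact output type.

```python
def to_axis_aligned(box):
    """Convert one quadrilateral (four (x, y) corners) to (left, top, right, bottom)."""
    xs = [float(point[0]) for point in box]
    ys = [float(point[1]) for point in box]
    return (min(xs), min(ys), max(xs), max(ys))


def sort_reading_order(boxes):
    """Order rectangles top-to-bottom, then left-to-right, by their top-left corner."""
    rects = [to_axis_aligned(box) for box in boxes]
    return sorted(rects, key=lambda rect: (rect[1], rect[0]))


# Stand-in data for prediction_result["boxes"] (assumed shape: four corners per box).
example_boxes = [
    [[120, 80], [200, 80], [200, 100], [120, 100]],
    [[10, 20], [90, 22], [89, 42], [9, 40]],
]
print(sort_reading_order(example_boxes))
# [(9.0, 20.0, 90.0, 42.0), (120.0, 80.0, 200.0, 100.0)]
```

Sorting by the top edge alone is a simplification; text lines with slightly different heights may need a tolerance when grouping boxes into rows.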

Advanced Usage

# import craft functions
from craft_text_detector import (
    read_image,
    load_craftnet_model,
    load_refinenet_model,
    get_prediction,
    export_detected_regions,
    export_extra_results,
    empty_cuda_cache
)

# set image path and export folder directory
image_path = 'figures/idcard.png'
output_dir = 'outputs/'

# read image
image = read_image(image_path)

# load models
refine_net = load_refinenet_model(cuda=True)
craft_net = load_craftnet_model(cuda=True)

# perform prediction
prediction_result = get_prediction(
    image=image,
    craft_net=craft_net,
    refine_net=refine_net,
    text_threshold=0.7,
    link_threshold=0.4,
    low_text=0.4,
    cuda=True,
    long_size=1280
)

# export detected text regions
exported_file_paths = export_detected_regions(
    image_path=image_path,
    image=image,
    regions=prediction_result["boxes"],
    output_dir=output_dir,
    rectify=True
)

# export heatmap, detection points, box visualization
export_extra_results(
    image_path=image_path,
    image=image,
    regions=prediction_result["boxes"],
    heatmaps=prediction_result["heatmaps"],
    output_dir=output_dir
)

# unload models from gpu
empty_cuda_cache()
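
The long_size argument above controls the input size handling. As an illustration only, the sketch below assumes that long_size sets the target length of the image's longer side while preserving the aspect ratio; the function name fit_long_side is hypothetical, and the library's actual resizing (rounding, padding, upscaling behaviour) may differ.

```python
def fit_long_side(width, height, long_size=1280):
    """Return (new_width, new_height) so that the longer side equals long_size.

    Assumption: the aspect ratio is preserved and sizes are rounded to whole pixels.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    scale = long_size / max(width, height)
    # Keep at least one pixel on the shorter side for extreme aspect ratios.
    return (max(1, round(width * scale)), max(1, round(height * scale)))


print(fit_long_side(2560, 1440))  # (1280, 720)
print(fit_long_side(640, 480))    # (1280, 960)
```

Under this assumption, a larger long_size trades speed and memory for more detail on small text.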

Comments

Add more options for detect_text method

Hi, sometimes I don't want to run detect_text on a file; I want to run it directly on an image in ndarray format, which saves I/O time. So I contributed this. Thanks for your work.

opened by ducviet00 2

Enable package to load model from local path

When using the PyPI package, it should be possible to load a model from a local path, because loading it from a remote location removes control over which model is currently used and might also result in pull limits being reached.

enhancement

opened by TanjaBayer 1

Fix #8 - Fixing cuda issues in basic usage text detection

Fixing issue #8

In this quick-fix I referenced craft_net as a global variable. If this is not an acceptable workaround, then consider reorganizing the structure of the code.

Have a nice day :)

opened by gaborpelesz 1

Accept customized weights path when loading models

The path to the weight file can be specified by:

load_craftnet_model(weight_path="path/to/weight")

load_refinenet_model(weight_path="path/to/weight")

opened by fcakyon 0

Releases (0.4.3)

0.4.3 (May 9, 2022)

What's Changed

Enable package to load model from local path by @TanjaBayer in https://github.com/fcakyon/craft-text-detector/pull/53

New Contributors

@TanjaBayer made their first contribution in https://github.com/fcakyon/craft-text-detector/pull/53

Full Changelog: https://github.com/fcakyon/craft-text-detector/compare/0.4.2...0.4.3

0.4.2 (Jan 6, 2022)

What's Changed

fix opencv version by @fcakyon in https://github.com/fcakyon/craft-text-detector/pull/48

Full Changelog: https://github.com/fcakyon/craft-text-detector/compare/0.4.1...0.4.2

0.4.1 (Dec 20, 2021)

What's Changed

fix crop export by @fcakyon in https://github.com/fcakyon/craft-text-detector/pull/45

Full Changelog: https://github.com/fcakyon/craft-text-detector/compare/0.4.0...0.4.1

0.4.0 (Jul 30, 2021)

enhancement

fix boxes outside image boundaries (#37)

breaking changes

drop conda support, update python version (#38)

0.3.5 (May 12, 2021)

Rebuild conda binaries.

0.3.4 (Apr 7, 2021)

Add support for PIL and numpy images in addition to file paths: https://github.com/fcakyon/craft-text-detector/pull/28

from PIL import Image
import numpy

# image can be a file path, a PIL image or a numpy array; keep whichever one you need
image = 'figures/idcard.png'
image = Image.open("figures/idcard.png")
image = numpy.array(Image.open("figures/idcard.png"))

# apply craft text detection (craft is a Craft instance, created as in Basic Usage)
prediction_result = craft.detect_text(image)

0.3.3 (Mar 2, 2021)

Relax requirements for OpenCV (#25)

0.3.2 (Mar 2, 2021)

The path to the weight file can be specified by:

load_craftnet_model(weight_path="path/to/weight")

load_refinenet_model(weight_path="path/to/weight")
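
When pointing the loaders at a local file, it can help to check the path before handing it over. The helper below is a small sketch using only the standard library; resolve_weight_path is a hypothetical name and not part of craft_text_detector, and the commented usage assumes only the weight_path keyword shown above.

```python
from pathlib import Path


def resolve_weight_path(candidate):
    """Return an absolute path string if the weight file exists, otherwise None."""
    if candidate is None:
        return None
    path = Path(candidate).expanduser()
    return str(path.resolve()) if path.is_file() else None


# Hypothetical usage with the loaders shown above:
# weight_path = resolve_weight_path("path/to/weight")
# if weight_path is not None:
#     craft_net = load_craftnet_model(weight_path=weight_path)
```

Returning None instead of raising keeps the decision with the caller, who can then fall back to the default model or stop with a clear error message.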

v0.3.1 (May 14, 2020)

fix empty_cuda_cache

v0.3.0 (May 14, 2020)

Updated basic usage for better device handling: a Craft instance must now be created before calling detect_text:

# import Craft class
from craft_text_detector import Craft

# set image path and export folder directory
image_path = 'figures/idcard.png'
output_dir = 'outputs/'

# create a craft instance
craft = Craft(output_dir=output_dir, crop_type="poly", cuda=False)

# apply craft text detection and export detected regions to output directory
prediction_result = craft.detect_text(image_path)

# unload models from ram/gpu
craft.unload_craftnet_model()
craft.unload_refinenet_model()

some internal naming and styling changes

v0.2.1 (May 10, 2020)

fix cuda device bug

fix visualization export bug

v0.2.0a (Apr 22, 2020)

v0.2.0 (Apr 22, 2020)

time profiling

better input size handling (with new long_size parameter)

bug fixes


Owner

Senior Machine Learning Engineer, METU & Bilkent alum.
