An OCR system for the Arabic language that converts images of typed text into machine-encoded text.

Last update: Jan 05, 2023

Overview

Arabic OCR

The system currently supports letters only (29 letters): ا-ى and لا.
It targets a simplified OCR problem: images that contain only Arabic characters (see the dataset link below for sample images).

Setup

Install Python, then run this command:

pip install -r requirements.txt

Run

Put the images in the src/test directory.
Go to the src directory and run the following command:
```
python OCR.py
```
An output folder will be created with:
- a text folder containing the text files that correspond to the images.
- a running_time file containing the time taken to process each image.

Pipeline

Dataset

Link to the dataset of images and the corresponding text: here.
We used 1000 images to generate the character dataset used for training.

Examples

Line Segmentation
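
Line segmentation splits a page image into one strip per text line. The sketch below shows one common way to do this, a horizontal projection profile: rows with no ink are treated as gaps between lines. It is an illustration under assumptions, not necessarily the method OCR.py uses. The function name `segment_lines` and the 0/1 list-of-rows input format are hypothetical, and plain Python lists stand in for image arrays so the sketch has no dependencies.

```python
def segment_lines(binary_image):
    """Return (start, end) row ranges of text lines; end is exclusive.

    binary_image is a list of rows, each a list of 0/1 ints where 1 marks ink.
    (Hypothetical input format, chosen for illustration.)
    """
    # Horizontal projection: amount of ink in every row.
    row_ink = [sum(row) for row in binary_image]

    lines = []
    start = None
    for y, ink in enumerate(row_ink):
        if ink > 0 and start is None:
            start = y                    # first inked row of a new line
        elif ink == 0 and start is not None:
            lines.append((start, y))     # a blank row closes the current line
            start = None
    if start is not None:                # line that touches the bottom edge
        lines.append((start, len(row_ink)))
    return lines


page = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 1],
]
print(segment_lines(page))  # [(1, 3), (4, 5)]
```

A real page would first need binarization, and noisy scans usually need a small ink threshold instead of the strict `ink == 0` test used here.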

Word Segmentation
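
Word segmentation splits a single line strip into words. The sketch below assumes a vertical projection profile with a gap threshold: runs of inked columns are merged unless the blank gap between them is at least `min_gap` columns wide, since some Arabic letters do not join the next letter and leave small gaps inside a word. The function name `segment_words`, the `min_gap` parameter, and its default value are assumptions for illustration, not taken from this project.

```python
def segment_words(line_image, min_gap=3):
    """Return (start, end) column ranges of words; end is exclusive.

    line_image is a list of rows of 0/1 ints where 1 marks ink.
    Ranges are returned left to right; reverse them for Arabic reading order.
    """
    # Vertical projection: amount of ink in every column.
    col_ink = [sum(col) for col in zip(*line_image)]

    # Collect runs of consecutive inked columns.
    runs = []
    start = None
    for x, ink in enumerate(col_ink):
        if ink > 0 and start is None:
            start = x
        elif ink == 0 and start is not None:
            runs.append((start, x))
            start = None
    if start is not None:
        runs.append((start, len(col_ink)))

    # Merge runs separated by gaps narrower than min_gap (gaps inside a word).
    words = []
    for run in runs:
        if words and run[0] - words[-1][1] < min_gap:
            words[-1] = (words[-1][0], run[1])
        else:
            words.append(run)
    return words


line = [[1, 1, 0, 1, 0, 0, 0, 1, 1]]
print(segment_words(line))  # [(0, 4), (7, 9)]
```

In practice the threshold would be tuned to the font size, for example from the distribution of gap widths on the line.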

Character Segmentation

Performance

Average accuracy: 95%.
Average time per image: 16 seconds.

NOTE

We achieved these results using only the flattened image as the feature.
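
As a rough illustration of what a flattened-image feature looks like, the sketch below forces each character image to a fixed size and then unrolls it row by row into one vector, so that every sample has the same length. The helper names `pad_or_crop` and `flatten_feature` are hypothetical, and zero-padding is an assumption used here to avoid dependencies; a real pipeline would more likely resize the image with an imaging library.

```python
def pad_or_crop(char_image, height, width):
    """Force a 2D pixel grid to height x width by cropping or zero-padding."""
    rows = []
    for row in char_image[:height]:
        kept = list(row[:width])
        rows.append(kept + [0] * (width - len(kept)))
    # Add blank rows if the image is shorter than the target height.
    rows += [[0] * width for _ in range(height - len(rows))]
    return rows


def flatten_feature(char_image):
    """Unroll a 2D pixel grid into a flat feature vector, row by row."""
    return [pixel for row in char_image for pixel in row]


glyph = [
    [0, 1, 0],
    [1, 1, 1],
    [0, 1, 0],
]
features = flatten_feature(pad_or_crop(glyph, 2, 4))
print(features)  # [0, 1, 0, 0, 1, 1, 1, 0]
```

Each such vector, paired with its character label, would form one training sample for the classifier.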


Owner

Hussein Youssef
