Form Segmentation

Let's explore how to extract text from any form or scanned page.

Objectives

The goal is to find an algorithm that extracts as much information as possible from a given page (JPG format), so that the result can be fed to another system (business logic, a neural network, a classifier, etc.). The overall process may not be perfect, but it should find enough information to identify the type of document and the identities involved.

Parse any form or scanned page and extract all text data (printed and handwritten), with no prior knowledge of the document's layout or structure.
Fully automatic extraction (no human interaction, so it can scale out).
Reasonably fast (or able to speed up with more machines or CPUs).

Challenges

There are many challenges to overcome, but the main problem is identifying which parts of the form contain text.
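One simple way to approach the "which parts contain text" problem is a horizontal projection profile: count the ink pixels in each row and treat runs of non-empty rows as candidate text bands. The sketch below is only an illustration, not this project's method. It assumes the page has already been binarized into a list of rows with 1 for ink and 0 for background, and the names `find_text_bands` and `min_ink` are hypothetical.

```python
def find_text_bands(binary, min_ink=1):
    """Return (start_row, end_row) pairs, end exclusive, for runs of rows with ink.

    `binary` is a list of rows; each pixel is 1 for ink and 0 for background.
    A row counts as "inked" when it holds at least `min_ink` ink pixels.
    """
    bands = []
    start = None
    for y, row in enumerate(binary):
        has_ink = sum(row) >= min_ink
        if has_ink and start is None:
            start = y                      # a new band begins
        elif not has_ink and start is not None:
            bands.append((start, y))       # the current band ends
            start = None
    if start is not None:                  # band runs to the bottom edge
        bands.append((start, len(binary)))
    return bands


# Tiny synthetic "page": two inked bands separated by blank rows.
page = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [0, 1, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
]
bands = find_text_bands(page)
print(bands)  # [(1, 3), (5, 6)]
```

Raising `min_ink` makes the split more tolerant of isolated noise pixels. Real forms with boxes, rulings, or skew would likely need the border removal and perspective correction steps first.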

Some other challenges:

Black border removal
ICR (Intelligent Character Recognition): recognize hand-drawn characters and convert them into text
Scanned page correction (detect edges and apply a perspective transform to obtain a top-down view of the document)
Noise removal (blur, Otsu thresholding, adaptive thresholding with OpenCV)
Shape detection and extraction
OCR (not a real issue, since Tesseract 4 works well for printed text)
Handwriting recognition
Minimizing errors
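To make the Otsu step in the list above concrete, here is a minimal dependency-free sketch of the idea: pick the gray level that maximizes the between-class variance of the "dark" and "light" pixel groups. In practice OpenCV provides this through `cv2.threshold` with the `cv2.THRESH_OTSU` flag. The function name `otsu_threshold` and the sample pixel values are assumptions made for illustration.

```python
def otsu_threshold(pixels):
    """Return a threshold t in 0..255 that maximizes between-class variance.

    `pixels` is an iterable of integer gray levels in 0..255.
    Pixels <= t form the dark class, pixels > t the light class.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels in the dark class so far
    sum0 = 0    # sum of gray levels in the dark class so far
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue            # dark class still empty
        w1 = total - w0
        if w1 == 0:
            break               # light class empty, nothing left to split
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between_var = w0 * w1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t


# Dark ink around 20-30, light paper around 205-220 (made-up values).
sample = [20, 25, 30, 22, 210, 220, 215, 205]
t = otsu_threshold(sample)
ink_mask = [1 if p <= t else 0 for p in sample]  # 1 = ink, 0 = background
print(t, ink_mask)
```

A global threshold like this tends to struggle with uneven lighting, which is where adaptive thresholding is usually the better fit.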

Owner

Philip Doxakis
