An interactive document scanner built in Python using OpenCV

Last update: Feb 12, 2022

Document Scanner

The scanner takes a poorly scanned image, finds the corners of the document, applies a perspective transformation to get a top-down view, sharpens the image, and applies an adaptive color threshold to clean up the result.
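
The geometry behind the perspective step can be sketched in plain Python. This is a minimal illustration, not the project's actual code: the function names (order_corners, output_size, destination_corners) are hypothetical, and the sum/difference ordering rule is a common heuristic that assumes the document is not rotated by much more than about 45 degrees. In a real pipeline the source and destination corners would typically be passed to OpenCV's cv2.getPerspectiveTransform and cv2.warpPerspective to produce the top-down image.

```python
import math


def order_corners(points):
    """Return four (x, y) corners as (top-left, top-right, bottom-right, bottom-left).

    Image coordinates are assumed: x grows to the right, y grows downward.
    """
    # Top-left has the smallest x + y, bottom-right the largest.
    tl = min(points, key=lambda p: p[0] + p[1])
    br = max(points, key=lambda p: p[0] + p[1])
    # Top-right has the smallest y - x, bottom-left the largest.
    tr = min(points, key=lambda p: p[1] - p[0])
    bl = max(points, key=lambda p: p[1] - p[0])
    return tl, tr, br, bl


def _distance(a, b):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def output_size(corners):
    """Pick the output width and height from the longer of each pair of opposite edges."""
    tl, tr, br, bl = corners
    width = max(_distance(tl, tr), _distance(bl, br))
    height = max(_distance(tl, bl), _distance(tr, br))
    return int(round(width)), int(round(height))


def destination_corners(width, height):
    """Corners of the flat output rectangle, in the same order as order_corners()."""
    return [(0, 0), (width - 1, 0), (width - 1, height - 1), (0, height - 1)]


# Hypothetical corner detections, given in arbitrary order.
detected = [(420, 80), (60, 100), (40, 520), (450, 560)]
ordered = order_corners(detected)
width, height = output_size(ordered)
print(ordered)
print(width, height)
print(destination_corners(width, height))
```

Taking the longer of each pair of opposite edges keeps the warped document from being downsampled along its more foreshortened side.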

On my test dataset of 280 images, the program correctly detected the corners of the document 92.8% of the time.

This project uses the transform and imutils modules from pyimagesearch. The UI code for the interactive mode is adapted from poly_editor.py.

In interactive mode, you can click and drag the corners of the document before the perspective transformation is applied.

The scanner can also process an entire directory of images automatically and save the results in an output directory.

Here are some examples of images before and after scan:

Usage

python scan.py (--images <directory> | --image <image path>) [-i]

The -i flag enables interactive mode, where you will be prompted to click and drag the corners of the document. For example, to scan a single image with interactive mode enabled:

python scan.py --image sample_images/desk.JPG -i

Alternatively, to scan all images in a directory without any input:

python scan.py --images sample_images
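
The usage line above has the shape of a required, mutually exclusive option group plus an optional flag. The following is a minimal sketch of how such a command line could be parsed with Python's standard argparse module. It is an assumption about the interface, not the actual contents of scan.py: the build_parser name, the help strings, the metavar labels, and the interactive attribute name are all hypothetical.

```python
import argparse


def build_parser():
    """Build a parser matching: scan.py (--images DIR | --image PATH) [-i]."""
    parser = argparse.ArgumentParser(prog="scan.py")

    # Exactly one of --images / --image must be given.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--images", metavar="DIR",
                        help="directory of images to scan")
    source.add_argument("--image", metavar="PATH",
                        help="path to a single image to scan")

    # -i switches on interactive corner adjustment; it defaults to off.
    parser.add_argument("-i", dest="interactive", action="store_true",
                        help="adjust the document corners by hand")
    return parser


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args(["--image", "sample_images/desk.JPG", "-i"])
print(args.image, args.images, args.interactive)
```

With this layout, passing both --images and --image, or neither, makes argparse exit with a usage error instead of running the scan.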

Owner

Kushal Shingote
