Machine Learning to Denoise Images for Better OCR Accuracy

This project is an adaptation of this tutorial and used only for learning purposes: https://www.pyimagesearch.com/2021/10/20/using-machine-learning-to-denoise-images-for-better-ocr-accuracy/#download-the-code

Setting Up the project 🚀

First and foremost clone the project with:

$ git clone https://github.com/AntonioBriPerez/Ocr-Denoiser

You don't need to extract the zip files in order to train the model.

Once you have cloned the repository you will need to extract the features from the noisy images. This script will extract 5 x 5 - 25-d feature vectors and the it will extract the target (or cleaned) pixel value from the correspondiente ground truth standard image. And then, this features will be saved in a csv file (~200MB). To extract this features you will have to execute:

$ python3 build_features.py

It will generate the following output:

Once you have done that we will have to load those features in a proper split to train our Random Forest Regressor. That code is implemented in the file train_denoiser.py. To train the model you will have to run the command:

$ python train_denoiser.py

And it will generate:

To check that the model performs good you can execute:

$ python3 denoise_document.py --testing denoising-dirty-documents/test

And some images will be written in disk so you can check the original image and the image obtained by the model we just have trained.

Any doubts or suggestions please open an issue.

Machine Leaning applied to denoise images to improve OCR Accuracy

Related tags

Overview

Machine Learning to Denoise Images for Better OCR Accuracy

Setting Up the project 🚀

Owner

Antonio Bri Pérez

BNF Globalization Code (CVPR 2016)

Official code for ROCA: Robust CAD Model Retrieval and Alignment from a Single Image (CVPR 2022)

ocroseg - This is a deep learning model for page layout analysis / segmentation.

Tensorflow-based CNN+LSTM trained with CTC-loss for OCR

Fine tuning keras-ocr python package with custom synthetic dataset from scratch

An expandable and scalable OCR pipeline

Extract tables from scanned image PDFs using Optical Character Recognition.

Code for the paper "Controllable Video Captioning with an Exemplar Sentence"

Generic framework for historical document processing

Simple app for visual editing of Page XML files

Deep learning based page layout analysis

ERQA - Edge Restoration Quality Assessment

This is a Computer vision package that makes its easy to run Image processing and AI functions. At the core it uses OpenCV and Mediapipe libraries.

Hiiii this is the Spanish for Linux and win 10 and in the near future the english version of PortScan my new tool on which you can see what ports are Open only with the IP adress.

An Agnostic Computer Vision Framework - Pluggable to any Training Library: Fastai, Pytorch-Lightning with more to come

Pixel art search engine for opengameart

Table Extraction Tool

WACV 2022 Paper - Is An Image Worth Five Sentences? A New Look into Semantics for Image-Text Matching

Convert scans of handwritten notes to beautiful, compact PDFs

A list of hyperspectral image super-solution resources collected by Junjun Jiang