Digitalizing-Prescription-Image - PIRDS (Prescription Image Recognition and Digitalizing System) is an OCR system built with TensorFlow

Last update: May 11, 2022

Overview

Digitalizing-Prescription-Image

PIRDS (Prescription Image Recognition and Digitalizing System) is an OCR system built with TensorFlow that digitizes images of prescriptions handwritten by doctors.

Abstract

PIRDS digitizes handwritten prescription text using advanced image processing techniques and deep learning methods. The image processing steps produce images that are less noisy and easier for neural networks to interpret.

Once images with the required configuration are obtained, they are fed to a neural network model for training. The model consists of a convolutional neural network for feature extraction and recurrent neural networks for modeling the sequence of characters. We train it by minimizing the connectionist temporal classification (CTC) loss, which is what yields good recognition of words from images.
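
To make the role of CTC more concrete, here is a minimal sketch of the alignment rule it relies on: the network emits one class distribution per time step, repeated labels are merged, and a special blank label is dropped. This is plain Python and not the project's TensorFlow code; the function name, the toy alphabet, and the use of the best-path probability as a confidence value are all assumptions made for illustration.

```python
def ctc_greedy_decode(frame_probs, alphabet, blank=0):
    """Best-path CTC decoding: argmax per frame, merge repeats, drop blanks.

    frame_probs: list of per-frame probability lists, one value per class.
    alphabet: maps class index to character; index `blank` is the CTC blank.
    Returns (text, confidence), where confidence is the product of the
    per-frame maximum probabilities (the probability of the best path).
    """
    chars = []
    confidence = 1.0
    previous = blank
    for probs in frame_probs:
        best = max(range(len(probs)), key=probs.__getitem__)
        confidence *= probs[best]
        # Emit a character only when it is not blank and not a repeat.
        if best != blank and best != previous:
            chars.append(alphabet[best])
        previous = best
    return "".join(chars), confidence


# Hypothetical toy alphabet: index 0 is the blank label.
ALPHABET = ["-", "a", "b"]
frames = [
    [0.10, 0.80, 0.10],  # a
    [0.10, 0.70, 0.20],  # a again, merged with the previous frame
    [0.90, 0.05, 0.05],  # blank, separates the two a's
    [0.20, 0.70, 0.10],  # a, a new character because a blank came before
    [0.10, 0.10, 0.80],  # b
]
text, confidence = ctc_greedy_decode(frames, ALPHABET)
print(text, round(confidence, 5))  # aab 0.28224
```

The blank label is what lets the model output doubled letters: without the blank frame in the middle, the two runs of "a" would collapse into one.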

Work Flow

The raw data are one-page scans, provided as images or PDFs. The first step is to anonymize the data. Hashes are calculated from document IDs, and a region of interest (ROI) is cut out of the document. The ROI includes the handwriting but excludes any personal data, such as the physician’s signature, the date and place of decease, etc.
This yields smaller images than the originals, and there is no link from the images back to the original scans. The second step is to clean the images, which contain background text from the document template as well as scan errors. We remove the background, then apply noise reduction and a slight blur to close small gaps in the handwriting strokes while retaining the spaces between words.
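
The hashing part of the anonymization step might look roughly like the sketch below. The text only says that hashes are calculated from document IDs; the choice of SHA-256, the secret salt, the 16-character truncation, and the function name are assumptions for illustration.

```python
import hashlib


def anonymous_id(document_id: str, salt: str) -> str:
    """Derive a stable, non-reversible identifier from a document ID.

    The salt (an assumption, not mentioned in the text) keeps someone who
    knows the ID format from recomputing the mapping by brute force.
    """
    digest = hashlib.sha256((salt + document_id).encode("utf-8")).hexdigest()
    return digest[:16]  # shortened for use as a file name


# Hypothetical document ID and salt.
roi_name = anonymous_id("RX-0001", salt="keep-this-secret") + ".png"
```

The same ID always maps to the same name, so the ROI images can be tracked through the pipeline without carrying the original ID along.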
The third step is to crop the image to the smallest size that still contains the handwriting. The fourth step is to cut between the lines of text, so a text with N lines yields N image segments per original certificate.
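
The cropping and line-cutting steps can be sketched as follows. This assumes the cleaned image is already binarized (1 = ink, 0 = background) and held as a list of rows, and that lines of text are separated by at least one ink-free row. Real scans usually need a tolerance threshold instead of a strict zero test. The function names are hypothetical.

```python
def crop_to_ink(image):
    """Crop a binary image (list of rows, 1 = ink) to its ink bounding box."""
    rows = [r for r, row in enumerate(image) if any(row)]
    if not rows:
        return []
    cols = [c for c in range(len(image[0])) if any(row[c] for row in image)]
    top, bottom = rows[0], rows[-1]
    left, right = cols[0], cols[-1]
    return [row[left:right + 1] for row in image[top:bottom + 1]]


def split_lines(image):
    """Cut the image at ink-free rows; return one sub-image per text line."""
    segments, current = [], []
    for row in image:
        if any(row):
            current.append(row)
        elif current:
            segments.append(current)
            current = []
    if current:
        segments.append(current)
    return segments


# Toy 6x6 "scan" with two lines of handwriting.
page = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
cropped = crop_to_ink(page)
lines = split_lines(cropped)
print(len(lines))  # 2
```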
We then apply a neural network (NN) to predict what is written, together with a calculated confidence that expresses how certain the NN is that the prediction is correct. Predictions that include unknown words require additional natural language processing (NLP) to map them to known words. Again, we calculate a confidence level.
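
One simple way to map an unknown prediction to a known word is fuzzy string matching. The sketch below uses Python's standard `difflib` module and treats the similarity ratio as the confidence level. The text does not say which NLP method PIRDS uses, so the matching technique, the vocabulary, the threshold, and the function name are all assumptions.

```python
import difflib


def map_to_known(word, vocabulary, min_confidence=0.6):
    """Return (closest known word, similarity in [0, 1]).

    The word is None when no candidate reaches min_confidence.
    """
    best_word, best_score = None, 0.0
    for candidate in vocabulary:
        score = difflib.SequenceMatcher(
            None, word.lower(), candidate.lower()
        ).ratio()
        if score > best_score:
            best_word, best_score = candidate, score
    if best_score < min_confidence:
        return None, best_score
    return best_word, best_score


# Hypothetical vocabulary of known drug names.
VOCAB = ["paracetamol", "ibuprofen", "amoxicillin"]
word, confidence = map_to_known("paracetmol", VOCAB)
print(word)  # paracetamol
```

Returning `None` below the threshold lets low-confidence cases be routed to manual review instead of being silently replaced by the wrong word.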
To summarize, the solution for reading the handwriting is a combination of image processing, deep learning, and natural language processing.

Owner

Akshat Surolia
