Key information extraction from invoice document with Graph Convolution Network

Last update: Dec 16, 2022

Overview

Key Information Extraction from Scanned Invoices

Related blog post on my Viblo account: https://viblo.asia/p/djeZ1yPGZWz

Models

Background subtraction: U2Net
Image alignment: based on the text-detection output and OpenCV (cv2)
Text detection: CRAFT and an in-house text-detection model
Text recognition: VietOCR and an in-house text-recognition model
KIE: Graph Convolution Network
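
To make the KIE step more concrete, here is a minimal sketch of one graph-convolution layer over text boxes, written with NumPy only. It is not the model used in this repository: the toy graph, the feature size, and the names `normalized_adjacency` and `gcn_layer` are assumptions for illustration. Each detected text box is a node, edges link neighbouring boxes, and the layer mixes each node's features with those of its neighbours.

```python
import numpy as np


def normalized_adjacency(adj: np.ndarray) -> np.ndarray:
    """Return D^-1/2 (A + I) D^-1/2, the symmetric normalization of a basic GCN."""
    a_hat = adj + np.eye(adj.shape[0])   # add self-loops so a node keeps its own features
    deg = a_hat.sum(axis=1)              # node degrees, self-loop included
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(a_norm: np.ndarray, features: np.ndarray, weight: np.ndarray) -> np.ndarray:
    """One graph-convolution layer: ReLU(A_norm @ X @ W)."""
    return np.maximum(a_norm @ features @ weight, 0.0)


# Hypothetical toy graph: 4 text boxes in a chain (box i is next to box i + 1).
adj = np.array(
    [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]],
    dtype=float,
)

rng = np.random.default_rng(0)
features = rng.normal(size=(4, 8))   # hypothetical 8-dim feature vector per text box
weight = rng.normal(size=(8, 5))     # hypothetical projection to 5 hidden units

a_norm = normalized_adjacency(adj)
hidden = gcn_layer(a_norm, features, weight)
print(hidden.shape)
```

In a real KIE model, several such layers would be stacked and followed by a per-node classifier that assigns each text box to a field; the weights would be learned rather than random.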

Currently, I don't have an invoice-direction classifier model. You could develop one to rotate the image back to upright when it is rotated sideways or upside down.
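
If such a classifier existed, undoing the rotation would be the easy part. The sketch below assumes a hypothetical classifier output `clockwise_quarter_turns` (0, 1, 2 or 3, meaning the image was rotated clockwise by that many 90-degree steps); the function name `rotate_upright` is also an assumption and is not part of this repository.

```python
import numpy as np


def rotate_upright(image: np.ndarray, clockwise_quarter_turns: int) -> np.ndarray:
    """Undo a clockwise rotation of k * 90 degrees.

    np.rot90 rotates counter-clockwise for positive k, so applying it with the
    same k reverses a clockwise rotation. Works for HxW and HxWxC arrays.
    """
    return np.rot90(image, k=clockwise_quarter_turns % 4)


# Hypothetical example: a tiny 2x3 RGB "image" rotated 90 degrees clockwise.
original = np.arange(2 * 3 * 3).reshape(2, 3, 3)
rotated = np.rot90(original, k=-1)       # negative k rotates clockwise
restored = rotate_upright(rotated, 1)    # the classifier would have predicted 1
print(np.array_equal(restored, original))
```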

Pretrained model

Google Drive

Data

MC-OCR, a Vietnamese receipt dataset: https://aihub.vn/competitions/1
Preprocessed data: Google Drive

Pipeline

TODO

Command

Create a virtual environment using conda or virtualenv:

# with virtualenv
virtualenv -p python3 invoice_env
# activate environment
source invoice_env/bin/activate
# install prerequisite libraries
pip install -r requirements.txt

# 1st command, run API
make serve
# 2nd command, run web-gui with streamlit
make runapp

Then access the local server at: http://0.0.0.0:7778
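
If the page does not load, a quick way to check whether anything is listening on that port is a plain TCP connection test. This is a small standard-library sketch; the helper name `is_port_open` is an assumption, and it only tells you that the port accepts connections, not which of the two services is behind it.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # 7778 is the port mentioned above; 127.0.0.1 reaches a server bound to 0.0.0.0.
    print(is_port_open("127.0.0.1", 7778))
```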

Preview

TODO

Add preprocess data script

Reference

MC-OCR dataset: https://aihub.vn/competitions/1
U2Net: https://github.com/xuebinqin/U-2-Net
CRAFT: https://github.com/clovaai/CRAFT-pytorch
VietOCR: https://github.com/pbcquoc/vietocr
Benchmarking GNNs: https://github.com/graphdeeplearning/benchmarking-gnns
PaddleOCR: https://github.com/PaddlePaddle/PaddleOCR

Owner

Phan Hoang
