PyTorch implementation of Value Retrieval with Arbitrary Queries for Form-like Documents.

Last update: Sep 15, 2022

Overview

Value Retrieval with Arbitrary Queries for Form-like Documents

Introduction

This repository contains the PyTorch implementation of Value Retrieval with Arbitrary Queries for Form-like Documents.

Environment

CUDA="11.0"
CUDNN="8"
UBUNTU="18.04"
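To compare a local setup against the versions listed above, one option is to ask PyTorch which CUDA toolkit it was built with. The sketch below is not part of the repository: `version_matches` and `installed_cuda_version` are hypothetical helper names, and the check only covers the CUDA version string that PyTorch reports.

```python
# Expected CUDA version, taken from the Environment section.
EXPECTED_CUDA = "11.0"


def version_matches(expected, actual):
    """Return True when `actual` begins with the dotted components of `expected`."""
    if actual is None:
        return False
    expected_parts = expected.split(".")
    return actual.split(".")[: len(expected_parts)] == expected_parts


def installed_cuda_version():
    """Return the CUDA version PyTorch was built with, or None if unavailable."""
    try:
        import torch  # imported lazily so the helper also works without PyTorch
    except ImportError:
        return None
    return torch.version.cuda  # None for CPU-only builds


if __name__ == "__main__":
    found = installed_cuda_version()
    status = "OK" if version_matches(EXPECTED_CUDA, found) else "MISMATCH"
    print(f"CUDA expected {EXPECTED_CUDA}, found {found}: {status}")
```

A mismatch does not necessarily mean the code will fail; it only signals that the setup differs from the one listed here.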

Install

bash install.sh
git clone https://github.com/NVIDIA/apex && cd apex
pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
pip install .
# under our project root folder
pip install .

Data Preparation

Our model is pre-trained on the IIT-CDIP dataset, fine-tuned on the FUNSD training set, and evaluated on the FUNSD and INV-CDIP test sets.

Download our processed OCR results for IIT-CDIP using hocr_list_addr.txt and put them under PRETRAIN_DATA_FOLDER/.
Download our processed FUNSD and INV-CDIP datasets and put them under DATA_DIR/.
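Before running anything, it can help to confirm that the downloaded datasets are where the commands in this README expect them. The following is a minimal sketch, assuming each dataset lives in a subfolder of DATA_DIR named after the dataset (`FUNSD`, `INV-CDIP`), as the command paths in this README suggest. `missing_datasets` is a hypothetical helper, not part of the repository.

```python
from pathlib import Path

# Dataset folder names as they appear in the command paths of this README.
EXPECTED_DATASETS = ("FUNSD", "INV-CDIP")


def missing_datasets(data_dir):
    """Return the names of expected dataset folders that are absent from data_dir."""
    root = Path(data_dir)
    return [name for name in EXPECTED_DATASETS if not (root / name).is_dir()]


if __name__ == "__main__":
    import sys

    data_dir = sys.argv[1] if len(sys.argv) > 1 else "."
    missing = missing_datasets(data_dir)
    if missing:
        print(f"Missing under {data_dir}: {', '.join(missing)}")
    else:
        print(f"All expected datasets found under {data_dir}")
```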

Reproduce Our Results

Download our model fine-tuned on FUNSD here.
Run inference as follows:

# $MODEL_PATH here is where you save the fine-tuned model.
# DATASET_NAME is FUNSD or INV-CDIP.
bash reproduce_results.sh $MODEL_PATH $DATA_DIR/DATASET_NAME

You should get the following results.

Datasets	Precision	Recall	F1
FUNSD	60.4	60.9	60.7
INV-CDIP	50.5	47.6	49.0
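As a sanity check on the table, the F1 column can be recomputed from the precision and recall columns. This sketch assumes F1 is the usual harmonic mean of precision and recall; the README does not state the exact formula, and `f1_score` is a hypothetical helper, not the repository's evaluation code.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both on the same scale, e.g. percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the results table.
REPORTED = {
    "FUNSD": (60.4, 60.9, 60.7),
    "INV-CDIP": (50.5, 47.6, 49.0),
}

if __name__ == "__main__":
    for name, (precision, recall, reported_f1) in REPORTED.items():
        recomputed = f1_score(precision, recall)
        print(f"{name}: recomputed F1 = {recomputed:.2f}, reported F1 = {reported_f1}")
```

Because the table values are rounded to one decimal place, the recomputed numbers should agree with the reported ones only to within about 0.1.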

Pre-training

You can skip the following steps by downloading our pre-trained SimpleDLM model here.
Otherwise, download layoutlm-base-uncased and run pre-training as follows:

# $NUM_GPUS is the number of GPUs to use for pre-training. To reproduce the paper's results, we recommend using 8 GPUs.
# $MODEL_PATH here is where you save the LayoutLM model.
# $PRETRAIN_DATA_FOLDER is the folder of IIT-CDIP hocr files.

python -m torch.distributed.launch --nproc_per_node=$NUM_GPUS pretraining.py \
--model_name_or_path $MODEL_PATH  --data_dir $PRETRAIN_DATA_FOLDER \
--output_dir $OUTPUT_DIR
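When the pre-training command is launched from a script rather than a shell, it can be assembled as an argument list. This is a minimal sketch that mirrors the flags of the command above. `build_pretrain_command` is a hypothetical helper, and the paths in the example are placeholders.

```python
import shlex
import sys


def build_pretrain_command(num_gpus, model_path, pretrain_data_folder, output_dir):
    """Build the argument list for the distributed pre-training launch."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return [
        sys.executable, "-m", "torch.distributed.launch",
        f"--nproc_per_node={num_gpus}",
        "pretraining.py",
        "--model_name_or_path", str(model_path),
        "--data_dir", str(pretrain_data_folder),
        "--output_dir", str(output_dir),
    ]


if __name__ == "__main__":
    # Placeholder paths; replace them with your own locations.
    command = build_pretrain_command(
        num_gpus=8,
        model_path="path/to/layoutlm-base-uncased",
        pretrain_data_folder="path/to/PRETRAIN_DATA_FOLDER",
        output_dir="path/to/output",
    )
    # Print the command for inspection; pass it to subprocess.run to execute it.
    print(shlex.join(command))
```

Building a list instead of a single shell string avoids quoting problems when a path contains spaces.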

Fine-tuning

Run fine-tuning as follows:

# $MODEL_PATH is where you saved the pre-trained SimpleDLM model.

CUDA_VISIBLE_DEVICES=0 python run_query_value_retrieval.py --model_type simpledlm --model_name_or_path $MODEL_PATH \
--data_dir $DATA_DIR/FUNSD/ --output_dir $OUTPUT_DIR --do_train --evaluate_during_training
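The `CUDA_VISIBLE_DEVICES=0` prefix in the command above pins the run to a single GPU. From Python, the same effect can be achieved by passing a modified environment to the child process. The sketch below assumes the flags shown above. `finetune_invocation` and `run_finetuning` are hypothetical helpers, and `dry_run` is an illustrative parameter, not a repository option.

```python
import os
import subprocess


def finetune_invocation(model_path, data_dir, output_dir, gpu_id=0):
    """Return (argv, env) for a single-GPU fine-tuning run on FUNSD."""
    # Copy the current environment and restrict the visible GPUs for the child only.
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu_id))
    argv = [
        "python", "run_query_value_retrieval.py",
        "--model_type", "simpledlm",
        "--model_name_or_path", str(model_path),
        "--data_dir", os.path.join(str(data_dir), "FUNSD"),
        "--output_dir", str(output_dir),
        "--do_train",
        "--evaluate_during_training",
    ]
    return argv, env


def run_finetuning(model_path, data_dir, output_dir, gpu_id=0, dry_run=True):
    """Run fine-tuning, or only return the argument list when dry_run is True."""
    argv, env = finetune_invocation(model_path, data_dir, output_dir, gpu_id)
    if not dry_run:
        subprocess.run(argv, env=env, check=True)
    return argv
```

Setting the variable only in the child's environment leaves the parent process unaffected, which is convenient when several runs are scheduled on different GPUs.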

Citation

If you find this codebase useful, please cite our paper:

@article{gao2021value,
  title={Value Retrieval with Arbitrary Queries for Form-like Documents},
  author={Gao, Mingfei and Xue, Le and Ramaiah, Chetan and Xing, Chen and Xu, Ran and Xiong, Caiming},
  journal={arXiv preprint arXiv:2112.07820},
  year={2021}
}

Contact

Please send an email to [email protected] or [email protected] if you have questions.

Owner

Salesforce
