Page to PAGE Layout Analysis Tool

Overview

P2PaLA

Python Version Code Style

Page to PAGE Layout Analysis (P2PaLA) is a toolkit for Document Layout Analysis based on Neural Networks.

๐Ÿ’ฅ Try our new DEMO for online baseline detection. โ— โ—

If you find this toolkit useful in your research, please cite:

@misc{p2pala2017,
  author = {Lorenzo Quirรณs},
  title = {P2PaLA: Page to PAGE Layout Analysis tookit},
  year = {2017},
  publisher = {GitHub},
  note = {GitHub repository},
  howpublished = {\url{https://github.com/lquirosd/P2PaLA}},
}

Check this paper for more details Arxiv.

Requirements

  • Linux (OSX may work, but untested.).
  • Python (2.7, 3.6 under conda virtual environment is recomended)
  • Numpy
  • PyTorch (1.0). PyTorch 0.3.1 compatible on this branch
  • OpenCv (3.4.5.20).
  • NVIDIA GPU + CUDA CuDNN (CPU mode and CUDA without CuDNN works, but is not recomended for training).
  • tensorboard-pytorch (v0.9) [Optional]. pip install tensorboardX > A diferent conda env is recomended to keep tensorflow separated from PyTorch

Install

python setup.py install

To install python dependencies alone, use requirements file conda env create --file conda_requirements.yml

Usage

  1. Input data must follow the folder structure data_tag/page, where images must be into the data_tag folder and xml files into page. For example:
mkdir -p data/{train,val,test,prod}/page;
tree data;
data
โ”œโ”€โ”€ prod
โ”‚   โ”œโ”€โ”€ page
โ”‚   โ”‚   โ”œโ”€โ”€ prod_0.xml
โ”‚   โ”‚   โ””โ”€โ”€ prod_1.xml
โ”‚   โ”œโ”€โ”€ prod_0.jpg
โ”‚   โ””โ”€โ”€ prod_1.jpg
โ”œโ”€โ”€ test
โ”‚   โ”œโ”€โ”€ page
โ”‚   โ”‚   โ”œโ”€โ”€ test_0.xml
โ”‚   โ”‚   โ””โ”€โ”€ test_1.xml
โ”‚   โ”œโ”€โ”€ test_0.jpg
โ”‚   โ””โ”€โ”€ test_1.jpg
โ”œโ”€โ”€ train
โ”‚   โ”œโ”€โ”€ page
โ”‚   โ”‚   โ”œโ”€โ”€ train_0.xml
โ”‚   โ”‚   โ””โ”€โ”€ train_1.xml
โ”‚   โ”œโ”€โ”€ train_0.jpg
โ”‚   โ””โ”€โ”€ train_1.jpg
โ””โ”€โ”€ val
    โ”œโ”€โ”€ page
    โ”‚   โ”œโ”€โ”€ val_0.xml
    โ”‚   โ””โ”€โ”€ val_1.xml
    โ”œโ”€โ”€ val_0.jpg
    โ””โ”€โ”€ val_1.jpg
  1. Run the tool.
python P2PaLA.py --config config.txt --tr_data ./data/train --te_data ./data/test --log_comment "_foo"

โ— Pre-trained models available here

  1. Use TensorBoard to visualize train status:
tensorboard --logdir ./work/runs
  1. xml-PAGE files must be at "./work/results/test/"

We recommend Transkribus or nw-page-editor to visualize and edit PAGE-xml files.

  1. For detail about arguments and config file, see docs or python P2PaLa.py -h.
  2. For more detailed example see egs:
    • Bozen dataset see
    • cBAD complex competition dataset see
    • OHG dataset see

License

GNU General Public License v3.0 See LICENSE to see the full text.

Acknowledgments

Code is inspired by pix2pix and pytorch-CycleGAN-and-pix2pix

Owner
Lorenzo Quirรณs Dรญaz
Lorenzo Quirรณs Dรญaz
Steve Tu 71 Dec 30, 2022
Face Detection with DLIB

Face Detection with DLIB In this project, we have detected our face with dlib and opencv libraries. Setup This Project Install DLIB & OpenCV You can i

Can 2 Jan 16, 2022
caffe re-implementation of R2CNN: Rotational Region CNN for Orientation Robust Scene Text Detection

R2CNN: Rotational Region CNN for Orientation Robust Scene Text Detection Abstract This is a caffe re-implementation of R2CNN: Rotational Region CNN fo

candler 80 Dec 28, 2021
The CIS OCR PostCorrectionTool

The CIS OCR Post Correction Tool PoCoTo Source code for the Java-based PoCoTo client enabling fast interactive batch corrections of complete OCR error

CIS OCR Group 36 Dec 15, 2022
An official PyTorch implementation of the paper "Learning by Aligning: Visible-Infrared Person Re-identification using Cross-Modal Correspondences", ICCV 2021.

PyTorch implementation of Learning by Aligning (ICCV 2021) This is an official PyTorch implementation of the paper "Learning by Aligning: Visible-Infr

CV Lab @ Yonsei University 30 Nov 05, 2022
Official implementation of Character Region Awareness for Text Detection (CRAFT)

CRAFT: Character-Region Awareness For Text detection Official Pytorch implementation of CRAFT text detector | Paper | Pretrained Model | Supplementary

Clova AI Research 2.5k Jan 03, 2023
Detect and fix skew in images containing text

Alyn Skew detection and correction in images containing text Image with skew Image after deskew Install and use via pip! Recommended way(using virtual

Kakul 230 Dec 21, 2022
Detect handwritten words in a text-line (classic image processing method).

Word segmentation Implementation of scale space technique for word segmentation as proposed by R. Manmatha and N. Srimal. Even though the paper is fro

Harald Scheidl 190 Jan 03, 2023
Drowsiness Detection and Alert System

A countless number of people drive on the highway day and night. Taxi drivers, bus drivers, truck drivers, and people traveling long-distance suffer from lack of sleep.

Astitva Veer Garg 4 Aug 01, 2022
This repository provides train๏ผ†test code, dataset, det.&rec. annotation, evaluation script, annotation tool, and ranking.

SCUT-CTW1500 Datasets We have updated annotations for both train and test set. Train: 1000 images [images][annos] Additional point annotation for each

Yuliang Liu 600 Dec 18, 2022
Visual Attention based OCR

Attention-OCR Authours: Qi Guo and Yuntian Deng Visual Attention based OCR. The model first runs a sliding CNN on the image (images are resized to hei

Yuntian Deng 1.1k Jan 02, 2023
document image degradation

ocrodeg The ocrodeg package is a small Python library implementing document image degradation for data augmentation for handwriting recognition and OC

NVIDIA Research Projects 134 Nov 18, 2022
ARU-Net - Deep Learning Chinese Word Segment

ARU-Net: A Neural Pixel Labeler for Layout Analysis of Historical Documents Contents Introduction Installation Demo Training Introduction This is the

128 Sep 12, 2022
code for our ICCV 2021 paper "DeepCAD: A Deep Generative Network for Computer-Aided Design Models"

DeepCAD This repository provides source code for our paper: DeepCAD: A Deep Generative Network for Computer-Aided Design Models Rundi Wu, Chang Xiao,

Rundi Wu 85 Dec 31, 2022
TedEval: A Fair Evaluation Metric for Scene Text Detectors

TedEval: A Fair Evaluation Metric for Scene Text Detectors Official Python 3 implementation of TedEval | paper | slides Chae Young Lee, Youngmin Baek,

Clova AI Research 167 Nov 20, 2022
Rotational region detection based on Faster-RCNN.

R2CNN_Faster_RCNN_Tensorflow Abstract This is a tensorflow re-implementation of R2CNN: Rotational Region CNN for Orientation Robust Scene Text Detecti

UCAS-Det 581 Nov 22, 2022
Repository relating to the CVPR21 paper TimeLens: Event-based Video Frame Interpolation

TimeLens: Event-based Video Frame Interpolation This repository is about the High Speed Event and RGB (HS-ERGB) dataset, used in the 2021 CVPR paper T

Robotics and Perception Group 544 Dec 19, 2022
Fatigue Driving Detection Based on Dlib

Fatigue Driving Detection Based on Dlib

5 Dec 14, 2022
Awesome multilingual OCR toolkits based on PaddlePaddle ๏ผˆpractical ultra lightweight OCR system, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices๏ผ‰

English | ็ฎ€ไฝ“ไธญๆ–‡ Introduction PaddleOCR aims to create multilingual, awesome, leading, and practical OCR tools that help users train better models and a

27.5k Jan 08, 2023
One Metrics Library to Rule Them All!

onemetric Installation Install onemetric from PyPI (recommended): pip install onemetric Install onemetric from the GitHub source: git clone https://gi

Piotr Skalski 49 Jan 03, 2023