Document Image Dewarping

Last update: Dec 23, 2022

Overview

Document image dewarping using text-lines and line segments

Abstract

Conventional text-line based document dewarping methods have problems when handling complex layouts and/or very few text-lines. When there are few aligned text-lines in the image, this usually means that photos, graphics and/or tables take up a large portion of the input instead. Hence, for robust document dewarping, we propose to use line segments in the image in addition to the aligned text-lines. Based on the assumption and observation that all transformed line segments remain straight (line-to-line mapping), and that many of them are horizontally or vertically aligned in well-rectified images, we encode these properties into the cost function in addition to the text-line based cost. By minimizing the function, we obtain the transformation parameters for camera pose, page curve (extrinsic parameters) and camera focal length (intrinsic parameter), which are used for document rectification. Because there are many outliers in line segment directions, and text-lines are missed in some cases, the overall algorithm is designed in an iterative manner. At each step, we remove text components and line segments that are not well aligned horizontally or vertically, and then minimize the cost function with the updated information. Experimental results show that the proposed method is robust to a variety of page layouts. Moreover, the proposed method extends to general curved surfaces as well as documents.

Algorithm

Two line segment properties

Straightness property

The straightness property states that line segments extracted from the curved document image should remain straight in the well-rectified domain (although a line that is straight in the well-rectified image can appear curved on the curved document surface). In other words, rectification should be a line-to-line mapping for these segments. Since the straightness property is always satisfied by any plane-to-plane mapping, it is not a significant constraint when rectification considers only the camera view (such as a homography). However, because we consider the page curve as well as the camera view in the rectification process, this property becomes an effective constraint that prevents lines from being curved.
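To illustrate why straightness only becomes informative once a page curve is modeled, the following minimal sketch compares a homography with a curved warp. The function names, the matrix H, and the sinusoidal `curved_page_warp` are illustrative assumptions, not the page-curve model or cost term used in [1].

```python
import math

def apply_homography(H, p):
    """Map a 2D point through a 3x3 homography given as nested lists."""
    x, y = p
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (u / w, v / w)

def curved_page_warp(p, amplitude=0.2):
    """Hypothetical curved warp: a vertical offset that varies with x."""
    x, y = p
    return (x, y + amplitude * math.sin(math.pi * x))

def sample_segment(a, b, n=21):
    """Return n evenly spaced points on the segment from a to b."""
    return [(a[0] + (b[0] - a[0]) * t / (n - 1),
             a[1] + (b[1] - a[1]) * t / (n - 1)) for t in range(n)]

def straightness_error(points):
    """Largest perpendicular distance of the points from the chord
    joining the first and last point (0 for a perfectly straight set)."""
    (x0, y0), (x1, y1) = points[0], points[-1]
    length = math.hypot(x1 - x0, y1 - y0)
    return max(abs((x1 - x0) * (y - y0) - (y1 - y0) * (x - x0)) / length
               for x, y in points)

segment = sample_segment((0.0, 0.1), (1.0, 0.6))

# Arbitrary example homography (an assumption chosen for illustration).
H = [[1.0, 0.2, 0.1],
     [0.1, 0.9, 0.0],
     [0.3, 0.1, 1.0]]

err_homography = straightness_error([apply_homography(H, p) for p in segment])
err_curved = straightness_error([curved_page_warp(p) for p in segment])

print("homography:", err_homography)
print("curved warp:", err_curved)
```

The homography leaves the error at the level of floating-point rounding, so it cannot be penalized by a straightness term, whereas the curved warp bends the segment measurably. A measure of this kind can therefore constrain page-curve parameters.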

Alignment property

The alignment property is based on the observation that the majority of line segments are horizontally or vertically aligned in rectified images.
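One simple way to turn this observation into a penalty is sketched below. The function `alignment_cost` and its exact form are assumptions made for illustration; the alignment term actually used in [1] may be defined differently.

```python
def alignment_cost(p, q):
    """Hypothetical alignment penalty for the segment from p to q.

    Equals min(sin^2, cos^2) of the segment angle: 0 when the segment is
    exactly horizontal or vertical, and at most 0.5 (at 45 degrees).
    """
    dx = q[0] - p[0]
    dy = q[1] - p[1]
    squared_length = dx * dx + dy * dy
    if squared_length == 0:
        raise ValueError("degenerate segment: endpoints coincide")
    return min(dx * dx, dy * dy) / squared_length

print(alignment_cost((0, 0), (3, 0)))   # horizontal segment
print(alignment_cost((0, 0), (0, 2)))   # vertical segment
print(alignment_cost((0, 0), (1, 1)))   # diagonal segment
```

Summing such a penalty over the rectified line segments gives a term that is small when most segments are horizontal or vertical, which is the behaviour the alignment property asks for.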

Outlier removal

Direct optimization of the cost function may yield poorly rectified results due to outliers. We handle two types of outliers: missed text-lines and line segments with arbitrary (non-horizontal/vertical) directions. For outlier removal, we design an iterative method. At each step, we refine the features (text components and line segments) by removing outliers (features that are not well aligned) and minimize the cost function with the updated inliers.
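The alternation between fitting and pruning can be sketched on a toy problem. Here a single rotation angle stands in for the full set of camera and page-curve parameters, a grid search stands in for the real optimizer, and the thresholds are made up. All names and values are assumptions for illustration, not the procedure from [1].

```python
import math

def residual(theta, phi):
    """Angular distance (radians) between a segment direction theta and the
    nearest horizontal/vertical axis of a frame rotated by phi."""
    r = (theta - phi) % (math.pi / 2)
    return min(r, math.pi / 2 - r)

def fit_rotation(angles, steps=900):
    """Grid search over [-45, 45) degrees for the rotation that minimizes
    the sum of squared residuals (a stand-in for the real optimizer)."""
    candidates = [-math.pi / 4 + (math.pi / 2) * k / steps for k in range(steps)]
    return min(candidates,
               key=lambda phi: sum(residual(t, phi) ** 2 for t in angles))

def iterative_fit(angles, thresholds_deg=(20.0, 10.0, 5.0)):
    """Alternate between pruning poorly aligned segments and refitting."""
    phi = fit_rotation(angles)
    inliers = list(angles)
    for threshold in thresholds_deg:
        limit = math.radians(threshold)
        # Keep only segments that are well aligned under the current estimate.
        inliers = [t for t in angles if residual(t, phi) <= limit]
        # Minimize the cost again using the updated inliers.
        phi = fit_rotation(inliers)
    return phi, inliers

# Toy data: a page skewed by about 7 degrees, plus three arbitrary directions.
inlier_deg = [6.5, 7.0, 7.5, 7.2, 96.5, 97.0, 97.5, 96.8]
outlier_deg = [35.0, 50.0, 60.0]
angles = [math.radians(d) for d in inlier_deg + outlier_deg]

direct_phi = fit_rotation(angles)
refined_phi, kept = iterative_fit(angles)

print("direct fit (deg):", math.degrees(direct_phi))
print("iterative fit (deg):", math.degrees(refined_phi))
print("segments kept:", len(kept))
```

In this toy setting the direct fit is pulled away from the true skew by the three arbitrary directions, while the iterative version discards them after the first pass and settles near 7 degrees.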

Experimental results

CBDAR 2007 dataset

We evaluate our method on the CBDAR 2007 dewarping contest dataset [http://staffhome.ecm.uwa.edu.au/~00082689/downloads.html], which consists of binarized text images.

Input image	Kim [2]	Proposed

Our document image dataset

To cover non-conventional document images (i.e., cases that are not text-abundant), we collected 100 images with various layouts (e.g., three-column documents, documents containing large tables and/or figures, presentation slides, and so on).

Input image	Kim [2]	Proposed

Our curved image dataset

To cover images of general curved surfaces (such as bottles), we collected 74 images.

Input image	Kim [2]	Proposed

Executable program

The executable program can be downloaded from the link below:

http://ispl.synology.me:8480/sharing/uA2DTRA8U

Reference

[1] Taeho Kil, Wonkyo Seo, Hyung Il Koo and Nam Ik Cho, "Robust Document Image Dewarping Using Text-Line and Line Segments", ICDAR 2017.

[2] Beom Su Kim, Hyung Il Koo, and Nam Ik Cho, "Document Dewarping via Text-line based Optimization", Pattern Recognition 2015.

Owner

Taeho Kil