Deskew

by Marek Mauder
https://galfar.vevb.net/deskew
https://github.com/galfar/deskew

v1.30 2019-06-07

Overview

Deskew is a command line tool for deskewing scanned text documents. It uses Hough transform to detect "text lines" in the image. As an output, you get an image rotated so that the lines are horizontal.
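
The idea behind Hough-based skew detection can be illustrated with a minimal Python sketch. This is not Deskew's actual implementation (which is written in Object Pascal); the function name `estimate_skew_angle`, the list-of-rows input format, the "largest accumulator bin" scoring rule, and the sign convention (positive angle means lines descend to the right, with y growing downwards) are all assumptions made for illustration.

```python
import math


def estimate_skew_angle(pixels, max_angle=10.0, step=0.1):
    """Estimate the skew (in degrees) of text-like lines in a binary image.

    pixels: list of rows, each a list of 0 (background) or 1 (text pixel).
    Angles from -max_angle to +max_angle are tested in increments of step.
    Returns 0.0 when the image contains no text pixels.
    """
    points = [(x, y)
              for y, row in enumerate(pixels)
              for x, value in enumerate(row) if value]
    best_angle, best_score = 0.0, 0
    steps = int(round(max_angle / step))
    for i in range(-steps, steps + 1):
        angle = i * step
        theta = math.radians(angle)
        sin_t, cos_t = math.sin(theta), math.cos(theta)
        votes = {}  # Hough accumulator for this angle: rho bin -> vote count
        for x, y in points:
            # All points on one line with slope angle theta share the same rho.
            rho = int(round(y * cos_t - x * sin_t))
            votes[rho] = votes.get(rho, 0) + 1
        # A correct angle concentrates the votes of a line in a single bin.
        score = max(votes.values(), default=0)
        if score > best_score:
            best_angle, best_score = angle, score
    return best_angle
```

Rotating the image by the negative of the returned angle would make the detected lines horizontal. A practical detector would typically combine the votes of several strong lines rather than rely on a single bin.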

There are binaries built for these platforms (located in the Bin folder): Win64 (deskew.exe), Win32 (deskew32.exe), Linux 64bit (deskew), macOS (deskew-mac), Linux ARMv7 (deskew-arm).

A GUI frontend for this CLI tool is available as well (Windows, Linux, and macOS).

License: MIT

Downloads And Releases

https://github.com/galfar/deskew/releases
https://galfar.vevb.net/deskew#downloads

Usage

Usage:
deskew [-o output] [-a angle] [-b color] [..] input
    input:         Input image file
  Options:
    -o output:     Output image file (default: out.png)
    -a angle:      Maximal expected skew angle (both directions) in degrees (default: 10)
    -b color:      Background color in hex format RRGGBB|LL|AARRGGBB (default: black)
  Ext. options:
    -q filter:     Resampling filter used for rotations (default: linear,
                   values: nearest|linear|cubic|lanczos)
    -t a|threshold: Auto threshold or value in 0..255 (default: a)
    -r rect:       Skew detection only in content rectangle (pixels):
                   left,top,right,bottom (default: whole page)
    -f format:     Force output pixel format (values: b1|g8|rgb24|rgba32)
    -l angle:      Skip deskewing step if skew angle is smaller (default: 0.01)
    -g flags:      Operational flags (any combination of):
                   c - auto crop, d - detect only (no output to file)
    -s info:       Info dump (any combination of):
                   s - skew detection stats, p - program parameters, t - timings
    -c specs:      Output compression specs for some file formats. Several specs
                   can be defined - delimited by commas. Supported specs:
                   jXX - JPEG compression quality, XX is in range [1,100(best)]
                   tSCHEME - TIFF compression scheme: none|lzw|rle|deflate|jpeg|g4

  Supported file formats
    Input:  BMP, JPG, PNG, JNG, GIF, DDS, TGA, PBM, PGM, PPM, PAM, PFM, TIF, PSD
    Output: BMP, JPG, PNG, JNG, GIF, DDS, TGA, PGM, PPM, PAM, PFM, TIF, PSD
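
When Deskew is called from a script, the option list above can be assembled programmatically. The following Python sketch builds an argument list using only flags documented in the usage text; the helper name `build_deskew_command` and its parameter names are hypothetical, and the executable name `deskew` is assumed to be on the PATH.

```python
def build_deskew_command(input_path, output=None, max_angle=None,
                         background=None, resampling=None,
                         jpeg_quality=None, tiff_scheme=None,
                         executable="deskew"):
    """Return an argument list for the deskew CLI (hypothetical helper).

    Only options that are not None are added; the input file goes last,
    matching the synopsis in the usage text.
    """
    args = [executable]
    if output is not None:
        args += ["-o", str(output)]
    if max_angle is not None:
        args += ["-a", str(max_angle)]
    if background is not None:
        args += ["-b", background]  # hex string: RRGGBB, LL or AARRGGBB
    if resampling is not None:
        if resampling not in ("nearest", "linear", "cubic", "lanczos"):
            raise ValueError("unknown resampling filter: " + resampling)
        args += ["-q", resampling]
    specs = []
    if jpeg_quality is not None:
        if not 1 <= jpeg_quality <= 100:
            raise ValueError("JPEG quality must be in range 1..100")
        specs.append("j%d" % jpeg_quality)
    if tiff_scheme is not None:
        specs.append("t" + tiff_scheme)
    if specs:
        # Several compression specs are delimited by commas.
        args += ["-c", ",".join(specs)]
    args.append(str(input_path))
    return args
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`. Building a list instead of a single shell string avoids quoting problems with file names that contain spaces.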

Notes

For TIFF support on Linux and macOS you need to have libtiff 4.x installed (the package is usually called libtiff5).

For macOS you can download prebuilt libtiff binaries here: https://galfar.github.io/store/TiffLibBins-macOS.zip. Just extract the files from the archive into the same folder as the deskew-mac executable.

You can find some test images in the TestImages folder and scripts to run tests (RunTests.bat and runtests.sh) in Bin. By default the scripts call the deskew command, but you can pass a different one as a parameter (e.g. runtests.sh deskew-arm).

Bugs, Issues, Proposals

File them here:
https://github.com/galfar/deskew/issues

Version History

v1.30 2019-06-07:

  • fix #15: Better image quality after rotation - better default and also selectable nearest|linear|cubic|lanczos filtering
  • fix #5: Detect skew angle only (no rotation done) - optionally only skew detection
  • fix #17: Optional auto-crop after rotation
  • fix #3: Command line option to set output compression - now for TIFF and JPEG
  • fix #12: Bad behavior when an output is given and no deskewing is needed
  • libtiff in macOS is now picked up also when binaries are put directly in the directory with deskew
  • text output is flushed after every write (Linux/Unix): it used to be flushed only when writing to device but not file/pipe.

v1.25 2018-05-19:

  • fix #6: Preserve DPI measurement system (TIFF)
  • fix #4: Output image not saved in requested format (when deskewing is skipped)
  • dynamic loading of libtiff library - adds TIFF support in macOS when libtiff is installed

v1.21 2017-11-01:

  • fix #8: Cannot compile in Free Pascal 3.0+ (Windows) - Fails to link precompiled LibTiff library
  • fix #7: Windows FPC build fails with Access violation exception when loading certain TIFFs (especially those saved by Windows Photo Viewer etc.)

v1.20 2016-09-01:

  • much faster rotation, especially when background color is set (>2x faster, 2x less memory)
  • can skip deskewing step if detected skew angle is lower than parameter
  • new option for timing of individual steps
  • fix: crash when last row of page is classified as text
  • misc: default back color is now opaque black, new forced output format "rgb24", background color can define also alpha channel, nicer formatting of text output

v1.10 2014-03-04:

  • TIFF support for Win64 and 32/64bit Linux
  • forced output formats
  • fix: output file names were always lowercase
  • fix: preserves resolution metadata (e.g. 300dpi) of input when writing output

v1.00 2012-06-04:

  • background color
  • "area of interest" content rectangle
  • 64bit and Mac OSX support
  • PSD and TIFF (win32) support
  • show skew detection stats and program parameters

v0.95 2010-12-28:

  • Added auto thresholding

v0.90 2010-02-12:

  • Initial version

Compiling Deskew

Deskew is written in Object Pascal. You need Free Pascal or Delphi to recompile it.

Tested Compilers

There are project files for these IDEs:

  1. Lazarus 2.0.10 (deskew.lpi)
  2. Delphi XE + 10.3 (deskew.dproj)

Additionally, there are shell/batch compile scripts for the standalone FPC compiler in the Scripts folder.

Supported/Tested Platforms

Deskew is precompiled and was tested on these platforms: Win32, Win64, Linux 64bit, macOS 64bit, Linux ARMv7

Source Code

Latest source code can be found here:
https://github.com/galfar/deskew

Dependencies

Vampyre Imaging Library is needed for compilation and it's included in Deskew's repo in Imaging folder.

Comments
  • Detect skew angle only (no rotation done)

    Original report by Marek Mauder (Bitbucket: galfar, GitHub: galfar).


    As requested on blog:

    Can you explain how to simply find the angle but not rotate using this tool? Since I’m dealing with archival TIFFs I need to keep the DPI and embedded metadata in place, so I’m thinking I would use ImageMagick to rotate once I have the angle. Thanks.

    Answer:

    For now you could use the -l parameter (-l angle: Skip deskewing step if skew angle is smaller) with a large threshold, so that rotation is always skipped.

    $ deskew -l 80 Sken003.png
    ...
    Preparing input image (Sken003.png) ...
    Calculating skew angle...
    Skew angle found: 0.23
    Skipping deskewing step, skew angle lower than threshold of 80.00
    Done!
    

    For the next version I plan to modify this: the angle will be optional, and if it is omitted, rotation is always skipped.

    major DeskewCmdLine proposal 
    opened by galfar 7
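
For the use case in the report above (detect the angle with Deskew, rotate with another tool), the angle has to be read from Deskew's text output. A minimal Python sketch follows; it assumes the output contains a line of the form `Skew angle found: 0.23` as in the sample run, that negative angles are printed with a leading minus sign, and that a decimal point is used. The function name `parse_skew_angle` is hypothetical.

```python
import re

# Matches the line printed in the sample run, e.g. "Skew angle found: 0.23".
_ANGLE_RE = re.compile(r"Skew angle found:\s*(-?\d+(?:\.\d+)?)")


def parse_skew_angle(output_text):
    """Return the detected skew angle as a float, or None if not present."""
    match = _ANGLE_RE.search(output_text)
    return float(match.group(1)) if match else None
```

The text to parse can come either from the `-l` workaround described in the report or from the detect-only flag (`-g d`) listed in the usage text.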
  • Please clarify licensing situation

    The README says that the license is MIT, but the source files all seem to claim MPL/LGPL. What is the actual license of this code? Can you please distribute a license file?

    This came up at AUR: https://aur.archlinux.org/packages/deskew-git/

    opened by ctrlcctrlv 5
  • Parameters on DeskewGui v0.90

    Hello,

    After doing some minor testing with the default and the lanczos filters, I agree that the quality with lanczos is noticeably better, and on my old computer it doesn't take much longer to process (I would say a couple of seconds per page).

    In conclusion, I'd like to use lanczos from now on, but DeskewGui doesn't allow me, to the best of my knowledge, to include parameters when I call deskew.exe.

    Is there a way I can tell the programme that I want to use lanczos?

    Thanks!

    DeskewGui 
    opened by vivadavid 5
  • EImagingError: Error while loading images from file.... Exception message: Access violation

    Original report by Anonymous.


    When trying to deskew landscape images on Windows, the error message:

    EImagingError: Error while loading images from file "name of file" <format: tif> Exception message: Access violation

    appears.

    bug major 
    opened by galfar 4
  • Better image quality after rotation

    Original report by Marek Mauder (Bitbucket: galfar, GitHub: galfar).


    Received several "complaints" about images being a bit blurry after the rotation step (compared to doing the rotation in ImageMagick etc.).

    The current "speed over quality" rotation algorithm used in Deskew must be replaced or supplemented with another one.

    enhancement DeskewCmdLine critical 
    opened by galfar 3
  • Simple GUI frontend

    Original report by Marek Mauder (Bitbucket: galfar, GitHub: galfar).


    I get many requests for "process all files in folder" etc. from people not comfortable with command line, shell scripts etc.

    Simple GUI frontend (to cmd. line Deskew) with batch processing capability would be nice for many people.

    Binaries for Windows, macOS, and Linux required.

    major proposal DeskewGui 
    opened by galfar 3
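
For the "process all files in folder" requests mentioned in the report above, a short script can stand in when the GUI is not available. The Python sketch below only builds one command per image file and does not run anything; the function name `batch_commands`, the chosen set of file suffixes, and the executable name `deskew` are assumptions, and only the documented `-o` option is used.

```python
from pathlib import Path

# A subset of the input formats listed in the usage text (assumed selection).
IMAGE_SUFFIXES = {".png", ".jpg", ".tif", ".bmp"}


def batch_commands(input_dir, output_dir, executable="deskew"):
    """Return one deskew argument list per image file found in input_dir.

    Each output file keeps the name of its input and is placed in output_dir.
    Files are processed in sorted order; other file types are skipped.
    """
    commands = []
    for path in sorted(Path(input_dir).iterdir()):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            target = Path(output_dir) / path.name
            commands.append([executable, "-o", str(target), str(path)])
    return commands
```

Each returned list can be executed with `subprocess.run(cmd, check=True)` after the output folder has been created.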
  • GUI: Option to select sampling filter

    The GUI does not currently have an option for this, and I don't know any other way of doing batch image processing with the CLI tool. The default linear filter blurs the images too much, so it would be nice to be able to change it in the GUI.

    opened by NebulaOnion 2
  • Support for multipage pdf files

    Hi, thanks for this program. I think it would be useful to support deskewing a whole PDF file with multiple pages. I normally scan books or documents directly into a multipage PDF file, since it is more manageable to have only one file rather than one per page. Do you think that this might be within the scope of this program?

    Cheers

    opened by cristobaltapia 2
  • DeskewGui: default window is too big for some common screens

    Original report by Anonymous.


    The default window for DeskewGui is a little too big. On some screens, in particular laptop screens with 1366x800 pixels, it is somewhat difficult to resize the window because the window title bar is left off screen. The maximum height should be no bigger than about 600 pixels. Thank you for your nice work.

    bug minor DeskewGui 
    opened by galfar 2
  • Compiling the GUI on Linux?

    Hello @galfar, I'm your AUR maintainer

    I noticed you have a GUI now, but in Scripts there only seems to be a script to compile it on Mac.

    I get:

    deskewgui.lpr(10,3) Fatal: Can't find unit Interfaces used by deskewgui
    

    Is this usable on Linux?

    opened by ctrlcctrlv 1
  • GUI: allow passing extra parameters to CLI

    As the GUI will always lag behind a bit and won't provide all the options the CLI can handle, it may be useful to allow passing extra parameters directly from the GUI to the CLI.

    A new text edit in "Advanced options" should take care of it.

    enhancement DeskewGui 
    opened by galfar 1
  • feature request: release for linux arm64

    It would be awesome to have more releases: a) Linux ARM64 and b) macOS ARM64. I would like to use this CLI in a Linux Docker container running on an Apple M1 computer.

    opened by amitm02 2
  • Auto detect content rectangle

    Content rectangle can be auto detected by scanning the image (after thresholding) from the sides and looking where non-white pixels start. If it's fast enough it could be a default setting. If custom content rectangle is passed as a parameter by the user let's not use the detection.

    enhancement DeskewCmdLine 
    opened by galfar 0
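
The detection proposed in the report above (find where non-white pixels start on each side of the thresholded image) can be sketched in a few lines of Python. This is an illustration of the proposal, not code from Deskew; the function name `detect_content_rect`, the list-of-rows input format, and the choice of inclusive right/bottom coordinates are assumptions.

```python
def detect_content_rect(pixels):
    """Return (left, top, right, bottom) of the non-white area, or None.

    pixels: list of rows after thresholding, 0 = white, 1 = non-white.
    All four coordinates are inclusive pixel indices (an assumption; the
    convention of the -r option is not specified here).
    """
    # Rows and columns that contain at least one non-white pixel.
    rows = [y for y, row in enumerate(pixels) if any(row)]
    if not rows:
        return None  # blank page, nothing to detect
    width = len(pixels[0])
    cols = [x for x in range(width) if any(row[x] for row in pixels)]
    # First/last entries are where content starts when scanning from each side.
    return cols[0], rows[0], cols[-1], rows[-1]
```

On noisy scans a single stray dark pixel would enlarge the rectangle, so a real implementation would probably require a minimum number of non-white pixels per row or column.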
  • Options for Specifying Skew Angle

    I am looking for a command line option to specify the skew angle directly instead of having it calculated, but I was not able to find one in the given set of options. I apply different preprocessing to the same image, which gives me several preprocessed versions. When I run deskew on these versions I get different skew angles, but I want to use the same angle for all versions of the same image.

    opened by hamxahbhatti 2
  • Take the background color from the input image

    A request came in to extend the "-b" background color parameter to take its value from the input image (edge/corner). https://galfar.vevb.net/wp/projects/deskew/comment-page-2/#comment-184847

    One request if at all possible is can there be an option for -b that auto samples the rgb value of an edge. I currently run an auto-crop script after they are deskewed and some pages have a different background color. This sometimes trips up the auto crop into thinking the added background is an edge. I’m hoping for something a little more dynamic that allows me to use the tool without sorting the material first.

    Hi, you mean something like “look at pixel [0,0] of input image and use it for output background”?

    Yes. “look at pixel [0,0] of input image and use it for output background” would be a great additional feature.

    enhancement 
    opened by galfar 8
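
Until such an option exists, the "look at pixel [0,0]" idea from the request above can be approximated outside Deskew: read the corner pixel with any image library and pass it as the `-b` value. The Python sketch below only does the formatting step, following the RRGGBB and AARRGGBB forms from the usage text; the function name `background_option_from_pixel` is hypothetical, and upper-case hex digits are assumed to be accepted.

```python
def background_option_from_pixel(pixel):
    """Format an (r, g, b) or (r, g, b, a) pixel as a value for -b.

    Channels are integers in 0..255. RGB pixels give RRGGBB,
    RGBA pixels give AARRGGBB (alpha first, as in the usage text).
    """
    if any(not 0 <= channel <= 255 for channel in pixel):
        raise ValueError("channel values must be in range 0..255")
    if len(pixel) == 3:
        r, g, b = pixel
        return "%02X%02X%02X" % (r, g, b)
    if len(pixel) == 4:
        r, g, b, a = pixel
        return "%02X%02X%02X%02X" % (a, r, g, b)
    raise ValueError("expected 3 or 4 channels")
```

For example, a white corner pixel `(255, 255, 255)` would be passed to Deskew as `-b FFFFFF`.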
Releases(v1.30)
  • v1.30(Jun 18, 2019)

    Command line tool for deskewing scanned documents. Binaries for several platforms and test images included.

    Deskew-1.30.zip(4.29 MB)
  • gui-v0.90(Jan 4, 2019)

    GUI Frontend for Deskew Command Line Tool

    Now it's easier to process many files without writing shell scripts. The GUI needs the command line tool, which is called for each input file. You can set the basic and most of the advanced options for deskewing in the GUI.

    Prebuilt executables for Windows and Linux are available in the download; you just place them in the same folder as the command line tool. The version for macOS is a bit more convenient: it's a self-contained app bundle with the CLI tool already inside, all placed in a DMG image. You can also set an explicit path to the command line tool in the program itself.

    DeskewGui-0.90.zip(4.11 MB)
  • v1.25(Jan 4, 2019)

    Command line tool for deskewing scanned documents. Binaries for several platforms and test images included.

    deskew-125.zip(4.31 MB)