scantailor - Scan Tailor is an interactive post-processing tool for scanned pages.

Overview

Scan Tailor - scantailor.org

This project is no longer maintained, and has not been maintained for a while.

About

Scan Tailor is an interactive post-processing tool for scanned pages. It performs operations such as page splitting, deskewing, and adding or removing borders.

You give it raw scans, and you get pages ready to be printed or assembled into a PDF or DJVU file. Scanning, optical character recognition, and assembling multi-page documents are out of scope of this project.

Scan Tailor is Free Software (which is more than just freeware). It’s written in C++ with Qt and released under the GNU General Public License version 3. We develop both Windows and GNU/Linux versions.

History and Future

This project started in late 2007 and by mid 2010 it reached production quality.

In 2014, the original developer Joseph Artsimovich stepped aside, and Nate Craun (@ncraun) took over as the new maintainer.

For information on contributing and the longstanding plan for the project, please see the Roadmap wiki entry.

For any suggested changes or bugs, please consult the Issues tab.

Usage

Scan Tailor is used not just by enthusiasts, but also by libraries and other institutions. Books processed with Scan Tailor can be found on Google Books and the Internet Archive.

  • Prolog for Programmers. The 47.3MB pdf is the original, and the 3.1MB pdf is after using Scan Tailor. The OCR, Chapter Indexing, JBIG2 compression, and PDF Binding were not done with Scan Tailor, but all of the scanned image cleanup was. [1]
  • Oakland Township: Two Hundred Years by Stuart A. Rammage (also available: volumes 2, 3, 4.1, 4.2, 5.1, and 5.2) [2]
  • Herons and Cobblestones: A History of Bethel and the Five Oaks Area of Brantford Township, County of Brant by the Grand River Heritage Mines Society [2]

Installation and Tips

Scanning tips, a quick-start guide, and the complete usage guide, including installation information (via the installer or by building from source), can be found in the wiki.

Additional Links

Comments
  • Bugfix: scantailor-cli doesn't honor color-mode settings from the project file

    It always converts pictures to black and white instead of color.

    Found when using DIYBookScanner/spreads, after editing manually in the GUI and letting spreads continue by running scantailor-cli --start-filter=6 /tmp/spreads.TaZcoK/tmpuNi7DS-0.ScanTailor /tmp/st-outEvBCSO

    opened by mumme74 2
  • Only set options from command-line if they were explicitly specified

    This patch fixes an issue with scantailor-cli that caused settings from the configuration file to be overwritten by the command-line defaults (see https://github.com/DIYBookScanner/spreads/pull/112#issuecomment-48022697). This occurred because QMap<QString, QString> m_options in the CommandLine class would set a key once it was queried.

    As an example, the following method would be called before a later call to hasColorParams, which checks whether an option was set from the command line by calling contains on the QMap:

    output::ColorParams::ColorMode
    CommandLine::fetchColorMode()
    {
        // This seems to set "color-mode" in the QMap
        QString cm = m_options["color-mode"].toLower();
    
        if (cm == "color_grayscale")
            return output::ColorParams::COLOR_GRAYSCALE;
        else if (cm == "mixed")
            return output::ColorParams::MIXED;
    
        return output::ColorParams::BLACK_AND_WHITE;
    }
    

    When the program would later check for a user-supplied color parameter from the command-line via hasColorParams it would always return true, even though the --color-mode flag was never set.

    As a solution, the new code now only calls fetch* methods if the corresponding has* method returns true.
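    The effect described above can be reproduced outside Qt. The following is a minimal sketch, assuming std::map as a stand-in for QMap (both default-insert a missing key when it is read through the non-const operator[]). The Options class and its method names are hypothetical and are not taken from the Scan Tailor sources.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the CommandLine class, using std::map in place of QMap.
class Options {
public:
    // Reading through operator[] default-inserts the key if it is missing.
    std::string fetchInserting(const std::string& key) { return m_options[key]; }

    // find() looks the key up without modifying the map.
    std::string fetchReadOnly(const std::string& key) const {
        auto it = m_options.find(key);
        return it == m_options.end() ? std::string() : it->second;
    }

    // Counterpart of the has* checks: true only if the key is stored.
    bool has(const std::string& key) const { return m_options.count(key) != 0; }

private:
    std::map<std::string, std::string> m_options;
};

// Shows that only the operator[] read leaves a key behind.
inline void demonstrateOptions() {
    Options opts;
    opts.fetchReadOnly("color-mode");
    assert(!opts.has("color-mode"));  // still absent after a read-only lookup
    opts.fetchInserting("color-mode");
    assert(opts.has("color-mode"));   // the read itself created the key
}
```

    With Qt, a read-only lookup would typically go through QMap::value() or QMap::constFind() rather than operator[]; the patch described here instead guards each fetch* call with the matching has* check.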

    opened by jbaiter 2
  • fix crash if output has no margins

    Resolves #210: ST crashes on line 255.

    The problem appears to be the rounding of rectangles from float coordinates to integer coordinates. In some circumstances the output rect is rounded down by 1 pixel and the working rect is rounded up by 1 pixel. The working rect then becomes 1 pixel bigger than the output page size and everything collapses. Non-zero margins seem to save ST from that. toAlignedRect() uses ceil() instead of round() when converting float coordinates, so the sizes should always match.
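    To illustrate the kind of mismatch described here, the sketch below rounds two floating-point rectangles of identical size in two ways: edge by edge to the nearest integer, and outwards with floor/ceil. RectF, SizeI and both helper functions are hypothetical stand-ins written for this example; they only imitate the spirit of QRectF::toRect() and QRectF::toAlignedRect() and are not the Scan Tailor code.

```cpp
#include <cassert>
#include <cmath>

struct RectF { double left, top, width, height; };  // hypothetical float rect
struct SizeI { long width, height; };               // hypothetical integer size

// Round every edge to the nearest integer, then measure the result.
inline SizeI roundedSize(const RectF& r) {
    long l = std::lround(r.left);
    long t = std::lround(r.top);
    long rgt = std::lround(r.left + r.width);
    long btm = std::lround(r.top + r.height);
    return {rgt - l, btm - t};
}

// Grow outwards: floor the top-left corner, ceil the bottom-right corner.
inline SizeI alignedSize(const RectF& r) {
    long l = static_cast<long>(std::floor(r.left));
    long t = static_cast<long>(std::floor(r.top));
    long rgt = static_cast<long>(std::ceil(r.left + r.width));
    long btm = static_cast<long>(std::ceil(r.top + r.height));
    return {rgt - l, btm - t};
}

inline void demonstrateRounding() {
    RectF shifted{0.4, 0.4, 10.2, 10.2};  // same size, different offset
    RectF origin{0.0, 0.0, 10.2, 10.2};
    // Nearest rounding gives integer sizes that differ by one pixel.
    assert(roundedSize(shifted).width != roundedSize(origin).width);
    // For these inputs the outward-aligned sizes agree.
    assert(alignedSize(shifted).width == alignedSize(origin).width);
}
```

    The values were chosen so that the difference shows up; the sketch does not prove that aligned rectangles match for every input.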

    opened by trufanov-nok 1
  • Fix compilation with GCC 6

    GCC 6 defaults to a newer C++ standard version. C++11 introduced a new overload for push_back so it is now sometimes necessary to specify which overload is required. C++11 also introduced std::bind so we need to specify the namespace when using boost::lambda::bind.
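    The push_back point can be shown in isolation. Since C++11, std::vector<T>::push_back has two overloads (const T& and T&&), so an expression such as &std::vector<int>::push_back no longer names a single function when it is passed to a template like bind. The sketch below selects one overload with a cast; IntVec, PushBackFn and the helper functions are names made up for this example and are not the code from the patch.

```cpp
#include <cassert>
#include <vector>

using IntVec = std::vector<int>;

// Pointer-to-member type that matches only the (const int&) overload.
using PushBackFn = void (IntVec::*)(const int&);

// The cast states which overload is meant. Note: taking the address of a
// standard library member function works on common toolchains, but the
// standard does not formally guarantee it.
inline PushBackFn constRefPushBack() {
    return static_cast<PushBackFn>(&IntVec::push_back);
}

// Calls the selected overload twice through the member pointer.
inline IntVec appendTwice(PushBackFn fn, int a, int b) {
    IntVec v;
    (v.*fn)(a);
    (v.*fn)(b);
    return v;
}

inline void demonstratePushBack() {
    IntVec v = appendTwice(constRefPushBack(), 1, 2);
    assert(v.size() == 2 && v[0] == 1 && v[1] == 2);
}
```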

    opened by jascrain 1
  • respect CFLAGS and CXXFLAGS

    Setting CMAKE_C_FLAGS and CMAKE_CXX_FLAGS without including their existing contents causes any provided CFLAGS and/or CXXFLAGS to be ignored.

    These should be respected, if set (distributions commonly have defaults they expect to be used). The $default_flags_ tweak is just to avoid duplication: distros typically set CXXFLAGS to include everything that is in CFLAGS, so if you include CFLAGS in CMAKE_C_FLAGS and then include CMAKE_C_FLAGS in CMAKE_CXX_FLAGS, you'll wind up with lots of flags repeated.

    opened by AdamWill 1
  • Copy Featured branch into main scantailor project

    Hi,

    I couldn't find a repository for the "featured" branch of scantailor, so I've attempted to recreate it based on the tarballs I had available, specifically:

    • scantailor-featured-2013.02.15.tar.gz
    • scantailor-featured-2013.04.10.tar.gz
    • scantailor-featured-2013.05.31.tar.gz

    I've tried to make it one commit per "feature" as listed at http://sourceforge.net/projects/scantailor/files/scantailor-devel/featured/ but in some cases there were changes to a feature after the first release containing that feature, so there might be a second commit later on. They're all labelled with the feature name in the commit in any case. I've tagged the commits corresponding to the releases too:

    • scantailor-featured_2013.02.15
    • scantailor-featured_2013.04.10
    • scantailor-featured_2013.05.31

    I don't suggest merging this into master, rather that it'd be good to copy it over to the main scantailor repository to make it easier to cherry-pick any of the features for the main branch.

    Thanks, Andy

    opened by rockclimb 1
  • Enhanced branch upd

    These 3 commits were cherry-picked from the master branch and applied to the enhanced branch. They are required to build the vanilla enhanced branch with an up-to-date toolchain (boost, gcc). The description of the last commit is a bit garbled.

    opened by trufanov-nok 0
  • Tiff compression may be changed in settings file

    This PR addresses #201 and demonstrates settings file usage. It contains 3 commits:

    1. Enforce storing settings as an ini file on all platforms. This is also submitted standalone in #266.
    2. A static helper class that makes it easy to call a callback function at application start to make sure the settings file contains hints, since Qt's settings file implementation strips out comments.
    3. The Tiff compression change itself, based on the settings file. The two commits above make it possible to implement it by changing only one tiff cpp module.

    As for the setting itself, you'll find the following lines in Scan Tailor.ini:

    [tiff]
    compressionMethod-hint="Tiff compression method may take following values: NONE, CCITTRLE, CCITTFAX3, CCITT_T4, CCITTFAX4, CCITT_T6, LZW, OJPEG, JPEG, T85, T43, NEXT, CCITTRLEW, PACKBITS, THUNDERSCAN, IT8CTPAD, IT8LW, IT8MP, IT8BL, PIXARFILM, PIXARLOG, DEFLATE, ADOBE_DEFLATE, DCS, JBIG, SGILOG, SGILOG24, JP2000, LZMA. Default value: LZW. Note: not all methods may be implemented by libtiff. The error messages are printed to console."
    compressionMethod=LZW
    

    I've tried some values: NONE and DEFLATE work. Several, like JP2000, weren't implemented on my system. JPEG complains about incompatible image settings. The error messages are shown in the console. Scan Tailor's GUI shows no error dialogs in case of problems; it just stops doing anything and does not refresh the page thumbnail image.
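    As a rough illustration of how a compressionMethod string could be turned into a value for the TIFF Compression tag, here is a minimal sketch. The function name, the subset of names covered, and the fallback to LZW for unknown names are assumptions made for this example; the numbers are the standard Compression tag values, which libtiff exposes as COMPRESSION_* constants in tiff.h.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical helper: maps a few of the names from the hint to the
// numeric TIFF Compression tag values. Not the code from the PR.
inline int compressionFromName(const std::string& name) {
    static const std::map<std::string, int> known = {
        {"NONE", 1},          {"CCITTFAX4", 4},   {"LZW", 5},
        {"JPEG", 7},          {"ADOBE_DEFLATE", 8},
        {"PACKBITS", 32773},  {"DEFLATE", 32946},
    };
    const int lzw = 5;  // assumed fallback, mirroring the documented default
    auto it = known.find(name);
    return it == known.end() ? lzw : it->second;
}

inline void demonstrateCompression() {
    assert(compressionFromName("NONE") == 1);
    assert(compressionFromName("not-a-method") == 5);  // falls back to LZW
}
```

    A real implementation would also have to cope with methods that the local libtiff build does not support, which is the situation reported above for JP2000.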

    opened by trufanov-nok 0
  • Fix build with boost 1.60

    Always use fully qualified boost::lambda::{bind,_1,_2}

    With boost 1.60 there's a namespace conflict between boost::bind and boost::lambda::bind and placeholders are no longer in global namespace, so use fully qualified names for these.

    It is advised to switch away from "using namespace" statements completely.
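    The kind of conflict described here can be imitated without boost. In the sketch below, two hypothetical namespaces (lib_a and lib_b, invented for this example) both provide a function called bind; once both are pulled in with "using namespace", an unqualified call no longer compiles, while fully qualified calls stay unambiguous.

```cpp
#include <cassert>

// Two hypothetical libraries that happen to export the same name.
namespace lib_a { inline int bind(int x) { return x + 1; } }
namespace lib_b { inline int bind(int x) { return x * 10; } }

using namespace lib_a;
using namespace lib_b;

// int broken = bind(1);  // would not compile: the call is ambiguous

// Fully qualified names say exactly which function is meant.
inline int viaA() { return lib_a::bind(1); }
inline int viaB() { return lib_b::bind(1); }

inline void demonstrateQualifiedNames() {
    assert(viaA() == 2);
    assert(viaB() == 10);
}
```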

    opened by AMDmi3 0
  • Fix some German translations.

    Fixes for some German translations, especially

    • keyboard shortcuts. They were translated, which meant they did not work.
    • “Every other page” was translated as „alle anderen Seiten“. That was wrong: it translates back as “All other pages”, meaning all pages except this one, which is not what the option does.

    I also translated some other strings I came across.

    opened by ospalh 0
  • Remove unnecessary call to toGrayscale

    Remove the unnecessary call to toGrayscale: darkestGrayLevel() is always called with a GrayImage argument, so there is no need to create a QImage. The argument type in the function declaration is changed to make sure a GrayImage is passed to darkestGrayLevel().

    opened by trufanov-nok 0
  • fixed incorrect link for "Heroes and Cobblestones"

    Corrected the link to the book "Heroes and Cobblestones". The initial link "http://books.google.com/books?printsec=frontcover&id=o4Q2OlVl61MC" was erroneously pointing to "Oakland Township: Two Hundred Years". The correct link is "http://books.google.com.ng/books?printsec=frontcover&id=sQj6XPKB6ZAC". You can check it out :)

    opened by El-Nazy 0
  • Fix detection of second chance components

    This seems to be a bug. In testing, I found pages where have_anchored_to_small_but_not_big became true but was then reset to false. After the proposed fix, the despeckle results are more accurate.

    opened by trufanov-nok 0
  • Generate and upload AppImage

    This PR, when merged, will compile this application on Travis CI upon each git push, and upload an AppImage to your GitHub Releases page.

    Providing an AppImage would have, among others, these advantages:

    • Applications packaged as an AppImage can run on many distributions (including Ubuntu, Fedora, openSUSE, CentOS, elementaryOS, Linux Mint, and others)
    • One app = one file = super simple for users: just download one AppImage file, make it executable, and run
    • No unpacking or installation necessary
    • No root needed
    • No system libraries changed
    • Works out of the box, no installation of runtimes needed
    • Optional desktop integration with appimaged
    • Optional binary delta updates, e.g., for continuous builds (only download the binary diff) using AppImageUpdate
    • Can optionally GPG2-sign your AppImages (inside the file)
    • Works on Live ISOs
    • Can use the same AppImages when dual-booting multiple distributions
    • Can be listed in the AppImageHub central directory of available AppImages
    • Can double as a self-extracting compressed archive with the --appimage-extract parameter
    • No repositories needed. Suitable/optimized for air-gapped (offline) machines

    Here is an overview of projects that are already distributing upstream-provided, official AppImages.

    PLEASE NOTE: For this to work, you need to set up GITHUB_TOKEN in Travis CI; please see https://github.com/probonopd/uploadtool.

    If you have questions, AppImage developers are on #AppImage on irc.freenode.net.

    opened by probonopd 3
Releases(RELEASE_0_9_12_1)