Go package for OCR (Optical Character Recognition), using the Tesseract C++ library

Overview

gosseract OCR

License: MIT

Go OCR package that uses the Tesseract C++ library.

OCR Server

Do you just want an OCR server, or a working example of this package? There is a ready-made server application that is easy to deploy.

👉 https://github.com/otiai10/ocrserver

Example

package main

import (
	"fmt"
	"github.com/otiai10/gosseract/v2"
)

func main() {
	client := gosseract.NewClient()
	defer client.Close()
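	if err := client.SetImage("path/to/image.png"); err != nil {
		fmt.Println("set image:", err)
		return
	}
	text, err := client.Text()
	if err != nil {
		fmt.Println("recognize:", err)
		return
	}
	fmt.Println(text)
	// Hello, World!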
}

Install

  1. tesseract-ocr, including library and headers
  2. go get -t github.com/otiai10/gosseract/v2

See the Dockerfile for installation details, or try the package right away with docker run -it --rm otiai10/gosseract.

Test

If you have tesseract-ocr installed locally, just run

% go test .

Otherwise, if you do not want to install tesseract-ocr locally, run ./test/runtime, which uses Docker and Vagrant to test the source code on several runtimes.

% ./test/runtime --driver docker
% ./test/runtime --driver vagrant

Check ./test/runtimes for more information about runtime tests.

Issues

Comments
  • Installation Failure on Windows 7

    Summary

    Installation failure on Windows 7:

    λ go get -t github.com/otiai10/gosseract
    # github.com/otiai10/gosseract
    tessbridge.cpp:5:10: fatal error: tesseract/baseapi.h: No such file or directory
     #include <tesseract/baseapi.h>
              ^~~~~~~~~~~~~~~~~~~~~
    compilation terminated.

    Reproducibility

    Yes

    Reproducibility Frequency

    • 100%

    Otherwise, describe how to reproduce

    1. Install Go
    2. Install GCC (64-bit compiler)
    3. Install Git
    4. Install Tesseract from https://digi.bib.uni-mannheim.de/tesseract/tesseract-ocr-setup-3.05.02-20180621.exe
    5. Run go get -t github.com/otiai10/gosseract

    λ go get -t github.com/otiai10/gosseract
    # github.com/otiai10/gosseract
    tessbridge.cpp:5:10: fatal error: tesseract/baseapi.h: No such file or directory
     #include <tesseract/baseapi.h>
              ^~~~~~~~~~~~~~~~~~~~~
    compilation terminated.

    Environment

    Windows 7

    uname -a
    
    go env
    

    C:\Users\33133 λ go env
    set GOARCH=amd64
    set GOBIN=
    set GOCACHE=C:\Users\33133\AppData\Local\go-build
    set GOEXE=.exe
    set GOHOSTARCH=amd64
    set GOHOSTOS=windows
    set GOOS=windows
    set GOPATH=C:\Users\33133\go
    set GORACE=
    set GOROOT=C:\Go
    set GOTMPDIR=
    set GOTOOLDIR=C:\Go\pkg\tool\windows_amd64
    set GCCGO=gccgo
    set CC=gcc
    set CXX=g++
    set CGO_ENABLED=1
    set CGO_CFLAGS=-g -O2
    set CGO_CPPFLAGS=
    set CGO_CXXFLAGS=-g -O2
    set CGO_FFLAGS=-g -O2
    set CGO_LDFLAGS=-g -O2
    set PKG_CONFIG=pkg-config
    set GOGCCFLAGS=-m64 -mthreads -fmessage-length=0 -fdebug-prefix-map=C:\Users\33133\AppData\Local\Temp\go-build238513982=/tmp/go-build -gno-record-gcc-switches

    go version

    λ go version
    go version go1.10.3 windows/amd64

    tesseract --version

    C:\Program Files (x86)\Tesseract-OCR>tesseract --version
    tesseract 3.05.02
     leptonica-1.75.3
      libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 1.5.3) : libpng 1.6.34 : libtiff 4.0.9 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.2.0

    opened by vsarin 27
  • 'tesseract/baseapi.h' file not found

    % go test ./...
    # github.com/otiai10/gosseract/tesseract
    tesseract/tess.cpp:1:10: fatal error: 'tesseract/baseapi.h' file not found
    FAIL    github.com/otiai10/gosseract [build failed]
    
    question 
    opened by otiai10 25
  • Keep printing a blank with no error

    package main

    import (
    	"fmt"
    	"github.com/otiai10/gosseract"
    )

    func main() {
    	client := gosseract.NewClient()
    	defer client.Close()
    	client.SetImage("path/to/image.png")
    	text, _ := client.Text()
    	fmt.Println(text)
    	// Hello, World!
    }

    I am using this code and running it in Docker, but it keeps printing a blank line with no error.

    need more information 
    opened by gradygabriel10 10
  • tessbridge.cpp:5:10: fatal error: leptonica/allheaders.h: No such file or directory

    Summary

    Installation failed on Windows 10 x64 with go get -t github.com/otiai10/gosseract:

    $ go get -t github.com/otiai10/gosseract
    # github.com/otiai10/gosseract
    tessbridge.cpp:5:10: fatal error: leptonica/allheaders.h: No such file or directory
     #include <leptonica/allheaders.h>
              ^~~~~~~~~~~~~~~~~~~~~~~~
    compilation terminated.

    I don't know much about header files. Is leptonica/allheaders.h a header file, and how do I install it on Windows?

    Reproducibility

    Reproducibility Frequency

    • 100%

    Otherwise, describe how to reproduce

    1. Install Go
    2. Install GCC (64-bit compiler)
    3. Install Git
    4. Install Tesseract from https://digi.bib.uni-mannheim.de/tesseract/tesseract-ocr-setup-3.05.02-20180621.exe
    5. Run go get -t github.com/otiai10/gosseract

    Environment

    
    
    go env
    

    $ go env
    set GO111MODULE=
    set GOARCH=amd64
    set GOBIN=
    set GOCACHE=C:\Users\VULCAN\AppData\Local\go-build
    set GOENV=C:\Users\VULCAN\AppData\Roaming\go\env
    set GOEXE=.exe
    set GOFLAGS=
    set GOHOSTARCH=amd64
    set GOHOSTOS=windows
    set GONOPROXY=
    set GONOSUMDB=
    set GOOS=windows
    set GOPATH=C:\Go\go\bin
    set GOPRIVATE=
    set GOPROXY=https://proxy.golang.org,direct
    set GOROOT=C:\Go
    set GOSUMDB=sum.golang.org
    set GOTMPDIR=
    set GOTOOLDIR=C:\Go\pkg\tool\windows_amd64
    set GCCGO=gccgo
    set AR=ar
    set CC=gcc
    set CXX=g++
    set CGO_ENABLED=1
    set GOMOD=
    set CGO_CFLAGS=-g -O2
    set CGO_CPPFLAGS=
    set CGO_CXXFLAGS=-g -O2
    set CGO_FFLAGS=-g -O2
    set CGO_LDFLAGS=-g -O2
    set PKG_CONFIG=pkg-config
    set GOGCCFLAGS=-m64 -mthreads -fmessage-length=0 -fdebug-prefix-map=C:\Users\VULCAN\AppData\Local\Temp\go-build610875910=/tmp/go-build -gno-record-gcc-switches

    go version

    $ go version
    go version go1.13.5 windows/amd64

    tesseract --version

    $ tesseract --version
    tesseract 3.05.02
     leptonica-1.75.3
      libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 1.5.3) : libpng 1.6.34 : libtiff 4.0.9 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.2.0
    
    
    opened by QQ3544291 10
  • Leaks /dev/ttysXXX file handles even when Close() is manually called

    Summary

    This is a real issue for me because I capture images from a security camera and run every frame through Tesseract in real time.

    While running in a loop over files or OpenCV video frames, or as a web service, gosseract opens a new /dev/ttys002 (or /dev/pts/004) every time an image is parsed. The process eventually runs out of allowed file handles.

    I have attached the example Go projects that have the issue and a C++ version that does not.

    lsof screenshot

    Reproducibility

    Always

    Environment

    macOS 10.14, Tesseract 4.1.0
    Ubuntu 19.01, Tesseract 4.0

    GO111MODULE=""
    GOARCH="amd64"
    GOBIN=""
    GOCACHE="/Users/tbruno/Library/Caches/go-build"
    GOENV="/Users/tbruno/Library/Application Support/go/env"
    GOEXE=""
    GOFLAGS=""
    GOHOSTARCH="amd64"
    GOHOSTOS="darwin"
    GONOPROXY=""
    GONOSUMDB=""
    GOOS="darwin"
    GOPATH="/Users/tbruno/Projects/GolandProjects/go"
    GOPRIVATE=""
    GOPROXY="https://proxy.golang.org,direct"
    GOROOT="/usr/local/Cellar/go/1.13.4/libexec"
    GOSUMDB="sum.golang.org"
    GOTMPDIR=""
    GOTOOLDIR="/usr/local/Cellar/go/1.13.4/libexec/pkg/tool/darwin_amd64"
    GCCGO="gccgo"
    AR="ar"
    CC="clang"
    CXX="clang++"
    CGO_ENABLED="1"
    GOMOD=""
    CGO_CFLAGS="-g -O2"
    CGO_CPPFLAGS=""
    CGO_CXXFLAGS="-g -O2"
    CGO_FFLAGS="-g -O2"
    CGO_LDFLAGS="-g -O2"
    PKG_CONFIG="pkg-config"
    GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/2r/vy9wb4w90snd6wwv06ts6rth0000gn/T/go-build449183315=/tmp/go-build -gno-record-gcc-switches -fno-common"
    
    go version go1.13.4 darwin/amd64
    
    tesseract 4.1.0
     leptonica-1.78.0
      libgif 5.1.4 : libjpeg 9c : libpng 1.6.37 : libtiff 4.1.0 : zlib 1.2.11 : libwebp 1.0.3 : libopenjp2 2.3.1
     Found AVX512BW
     Found AVX512F
     Found AVX2
     Found AVX
     Found SSE
    

    Examples (I'm using the same jpg file, but this happens even if a new file is opened and also happens with SetImageFromBytes)

    func main() {
    	client := gosseract.NewClient()
    	for {
    		client.SetImage("/Users/tbruno/test.jpg")
    		text, _ := client.Text()
    		fmt.Println(text)
    	}
    	client.Close()
    }
    
    func main() {
    	for {
    		client := gosseract.NewClient()
    		client.SetImage("/Users/tbruno/test.jpg")
    		text, _ := client.Text()
    		fmt.Println(text)
    		client.Close()
    	}
    }
    

    TessTester-ClientInLoopGo.zip TessTester-SingleClientGo.zip TessApi-NoLeakCpp.zip

    bug 
    opened by tebruno99 10
  • Fix/add tessdata prefix

    Hi.

    I've added the ability to set a different TessdataPrefix directly from Go code; its default value is the TESSDATA_PREFIX environment variable. Requesting a review, thanks.

    It seems my solution only works with the latest Tesseract, and only on Linux (other platforms were not tested). We should define a default model directory for the different Tesseract versions.
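The fallback rule described here (a value set from Go code wins, otherwise the environment variable applies) can be sketched as follows. This is an illustration only: resolveTessdataPrefix and its signature are hypothetical and are not claimed to match the code in this pull request.

```go
package main

import (
	"fmt"
	"os"
)

// resolveTessdataPrefix returns the explicit prefix when one is given and
// falls back to the TESSDATA_PREFIX environment variable otherwise.
// The function is hypothetical; getenv is injected so it is easy to test.
func resolveTessdataPrefix(explicit string, getenv func(string) string) string {
	if explicit != "" {
		return explicit
	}
	return getenv("TESSDATA_PREFIX")
}

func main() {
	fmt.Printf("default:  %q\n", resolveTessdataPrefix("", os.Getenv))
	fmt.Printf("override: %q\n", resolveTessdataPrefix("/opt/tessdata", os.Getenv))
}
```

The sketch deliberately returns an empty string when neither source is set, which leaves the choice of a per-version default directory open, as the description above notes.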

    opened by awskii 9
  • Init only when required (perfs)

    I benchmarked my app and saw that 90% of the CPU time is spent in "init()".

    With this change I keep the instance open and perform multiple recognitions on it. If a configuration change requires initializing again, I flag the instance to rerun init.

    What do you think about it?

    Details

    This is a typical use of gosseract to extract text, in a sample program (profiling included):

    package main
    
    import (
        "bytes"
        "image/png"
    
        "gocv.io/x/gocv"
        "github.com/openrm/gosseract"
        "github.com/pkg/profile"
    )
    
    func GetTextFromImage(img *gocv.Mat, client *gosseract.Client) (string, error) {
        buf := new(bytes.Buffer)
        finalImage, err := img.ToImage()
        if err != nil {
            return "", err
        }
        // Encode the frame as PNG so it can be passed to the client as bytes.
        if err := png.Encode(buf, finalImage); err != nil {
            return "", err
        }
    
        client.SetImageFromBytes(buf.Bytes())
        client.SetPageSegMode(gosseract.PSM_SINGLE_BLOCK)
    
        return client.Text()
    }
    
    func main() {
        defer profile.Start().Stop()
    
        client := gosseract.NewClient()
        defer client.Close()
    
        client.Languages = []string{"jpn"}
    
        img := gocv.IMRead("1.png", gocv.IMReadColor)
    
        for i := 0; i < 20; i++ {
            GetTextFromImage(&img, client)
        }
    }
    

    With the code above, I get the following result with go profiling: result1

    As you can see, over the 12 seconds spent in the program, 11 are caused by repeated calls to init.

    With the proposed changes in this PR, the profiling is now like this: result2

    Notes

    • SetConfigFile and SetLanguage cause the client to init again
    • SetWhitelist, SetBlacklist, DisabledOutput and SetVariable call setVariablesToInitializedAPI internally if init has already been run
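The init-on-demand idea can be sketched without Tesseract at all. The engine type below is a hypothetical stand-in, not the code of this pull request: it counts how often the costly initialization runs, skips it while nothing relevant changed, and flags itself for re-initialization when the language changes.

```go
package main

import "fmt"

// engine is a hypothetical stand-in for an OCR client with a costly init step.
type engine struct {
	language   string
	shouldInit bool // true when the next Text call must re-initialize
	initCalls  int  // counts how often the costly step actually ran
}

func newEngine() *engine {
	return &engine{language: "eng", shouldInit: true}
}

// SetLanguage changes a setting that only takes effect at init time,
// so it flags the engine for re-initialization.
func (e *engine) SetLanguage(lang string) {
	if lang != e.language {
		e.language = lang
		e.shouldInit = true
	}
}

// initialize stands in for the expensive native initialization.
func (e *engine) initialize() {
	e.initCalls++
	e.shouldInit = false
}

// Text initializes lazily, then pretends to recognize text.
func (e *engine) Text() string {
	if e.shouldInit {
		e.initialize()
	}
	return "recognized with " + e.language
}

func main() {
	e := newEngine()
	for i := 0; i < 20; i++ {
		e.Text()
	}
	fmt.Println(e.initCalls) // 1: twenty recognitions, one init

	e.SetLanguage("jpn")
	e.Text()
	fmt.Println(e.initCalls) // 2: the language change forced a second init
}
```

Settings that can be applied to an already initialized engine would skip the flag and be pushed through directly, which is the distinction the two notes above draw.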
    opened by PuKoren 9
  • cannot find package

    cannot find package "github.com/otiai10/gosseract/v2"

    Hello, I have installed the package using go get github.com/otiai10/gosseract and imported it in my package: "github.com/otiai10/gosseract/v2" as per instructions.

    Summary

    I get this compile time error:

    vendor/app/shared/spamcheck/spamcheck.go:12:2: cannot find package "github.com/otiai10/gosseract/v2" in any of:
    	/home/me/go/src/myapp/vendor/github.com/otiai10/gosseract/v2 (vendor tree)
    	/usr/local/go/src/github.com/otiai10/gosseract/v2 (from $GOROOT)
    	/home/me/go/src/github.com/otiai10/gosseract/v2 (from $GOPATH)

    Environment

    Ubuntu 18.08

    uname -a
    

    Linux pc5 5.3.0-28-generic #30~18.04.1-Ubuntu SMP Fri Jan 17 06:14:09 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

    go env
    

    GO111MODULE=""
    GOARCH="amd64"
    GOBIN=""
    GOCACHE="/home/me/.cache/go-build"
    GOENV="/home/me/.config/go/env"
    GOEXE=""
    GOFLAGS=""
    GOHOSTARCH="amd64"
    GOHOSTOS="linux"
    GONOPROXY=""
    GONOSUMDB=""
    GOOS="linux"
    GOPATH="/home/me/go"
    GOPRIVATE=""
    GOPROXY="https://proxy.golang.org,direct"
    GOROOT="/usr/local/go"
    GOSUMDB="sum.golang.org"
    GOTMPDIR=""
    GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
    GCCGO="gccgo"
    AR="ar"
    CC="gcc"
    CXX="g++"
    CGO_ENABLED="1"
    GOMOD=""
    CGO_CFLAGS="-g -O2"
    CGO_CPPFLAGS=""
    CGO_CXXFLAGS="-g -O2"
    CGO_FFLAGS="-g -O2"
    CGO_LDFLAGS="-g -O2"
    PKG_CONFIG="pkg-config"
    GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build702738196=/tmp/go-build -gno-record-gcc-switches"

    go version
    

    go1.13.6 linux/amd64

    tesseract --version
    

    tesseract 4.0.0-beta.1
     leptonica-1.75.3
      libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 1.5.2) : libpng 1.6.34 : libtiff 4.0.9 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.3.0

     Found AVX2
     Found AVX
     Found SSE

    Appreciate your help to fix this.

    opened by themrkumar 8
  • macOS compile freebsd binary file failed

    Summary

    I'm using macOS 10.14.6 and trying to compile a binary for FreeBSD 11.3. The build fails with the message: undefined: gosseract.NewClient.

    Reproducibility

    Reproducibility Frequency

    100%

    Environment

    Darwin Kernel Version 18.7.0
    
    GO111MODULE=""
    GOARCH="amd64"
    GOBIN=""
    GOCACHE="/Users/frankb/Library/Caches/go-build"
    GOENV="/Users/frankb/Library/Application Support/go/env"
    GOEXE=""
    GOFLAGS=""
    GOHOSTARCH="amd64"
    GOHOSTOS="darwin"
    GONOPROXY=""
    GONOSUMDB=""
    GOOS="darwin"
    GOPATH="/Volumes/home/Development files/Go files/"
    GOPRIVATE=""
    GOPROXY="https://proxy.golang.org,direct"
    GOROOT="/usr/local/go"
    GOSUMDB="sum.golang.org"
    GOTMPDIR=""
    GOTOOLDIR="/usr/local/go/pkg/tool/darwin_amd64"
    GCCGO="gccgo"
    AR="ar"
    CC="clang"
    CXX="clang++"
    CGO_ENABLED="1"
    GOMOD=""
    CGO_CFLAGS="-g -O2"
    CGO_CPPFLAGS=""
    CGO_CXXFLAGS="-g -O2"
    CGO_FFLAGS="-g -O2"
    CGO_LDFLAGS="-g -O2"
    PKG_CONFIG="pkg-config"
    GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/0b/ntd9fcqx6xn6nt3gpv4yndm40000gn/T/go-build712170545=/tmp/go-build -gno-record-gcc-switches -fno-common"
    
    go version go1.13 darwin/amd64
    
    tesseract 4.1.0
     leptonica-1.78.0
      libgif 5.1.4 : libjpeg 9c : libpng 1.6.37 : libtiff 4.0.10 : zlib 1.2.11 : libwebp 1.0.3 : libopenjp2 2.3.1
     Found AVX2
     Found AVX
     Found SSE
    

    Source

    package main
    
    import (
    	"fmt"
    	"github.com/otiai10/gosseract"
    )
    
    func main() {
    	client := gosseract.NewClient()
    	defer client.Close()
    	client.SetLanguage("deu")
    	client.SetImage("test.png")
    	text, _ := client.Text()
    	fmt.Println(text)
    }
    

    Compile command

    env GOOS=freebsd GOARCH=amd64 go build ocrtest.go
    

    Error Message

    # command-line-arguments
    ./ocrtest.go:9:12: undefined: gosseract.NewClient
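One plausible explanation, offered as an assumption rather than a confirmed diagnosis of this report: the go tool disables cgo by default when cross-compiling, and a package implemented with cgo then has its cgo files excluded from the build, which leaves symbols such as NewClient undefined. The sketch below models that default in simplified form. The function cgoEnabled is hypothetical, and the real go tool also considers whether a C compiler is available.

```go
package main

import (
	"fmt"
	"runtime"
)

// cgoEnabled models the go tool's default in simplified form: an explicit
// CGO_ENABLED value wins, otherwise cgo is on only for native builds.
// This is an illustrative model, not the go tool's actual code.
func cgoEnabled(cgoEnv, hostOS, hostArch, targetOS, targetArch string) bool {
	switch cgoEnv {
	case "1":
		return true
	case "0":
		return false
	}
	return hostOS == targetOS && hostArch == targetArch
}

func main() {
	host, arch := runtime.GOOS, runtime.GOARCH
	fmt.Println("native build:", cgoEnabled("", host, arch, host, arch))
	fmt.Println("darwin -> freebsd:", cgoEnabled("", "darwin", "amd64", "freebsd", "amd64"))
	fmt.Println("darwin -> freebsd, CGO_ENABLED=1:", cgoEnabled("1", "darwin", "amd64", "freebsd", "amd64"))
}
```

Forcing CGO_ENABLED=1 is not sufficient on its own: a cross-compiling C toolchain and Tesseract libraries built for the target system would also be needed, so building on the target system or in a container is often simpler.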
    
    opened by frankble 8
  • macOS compile linux binary file failed

    Summary

    I'm using macOS and trying to compile a binary for CentOS 7.1. The build fails with the message: undefined: gosseract.NewClient. Thanks.

    Reproducibility

    Reproducibility Frequency

    100%

    Environment

    uname -a
    

    Darwin AllenChen-MacBookPro.local 18.2.0 Darwin Kernel Version 18.2.0: Fri Oct 5 19:41:49 PDT 2018; root:xnu-4903.221.2~2/RELEASE_X86_64 x86_64

    go env
    

    GOARCH="amd64"
    GOBIN="/Users/allen/Documents/go/bin"
    GOCACHE="/Users/allen/Library/Caches/go-build"
    GOEXE=""
    GOFLAGS=""
    GOHOSTARCH="amd64"
    GOHOSTOS="darwin"
    GOOS="darwin"
    GOPATH="/Users/allen/Documents/go"
    GOPROXY=""
    GORACE=""
    GOROOT="/usr/local/go"
    GOTMPDIR=""
    GOTOOLDIR="/usr/local/go/pkg/tool/darwin_amd64"
    GCCGO="gccgo"
    CC="clang"
    CXX="clang++"
    CGO_ENABLED="1"
    GOMOD=""
    CGO_CFLAGS="-g -O2"
    CGO_CPPFLAGS=""
    CGO_CXXFLAGS="-g -O2"
    CGO_FFLAGS="-g -O2"
    CGO_LDFLAGS="-g -O2"
    PKG_CONFIG="pkg-config"
    GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/s6/mqbnh2yn52b5glrvhk53jyxh0000gn/T/go-build032873897=/tmp/go-build -gno-record-gcc-switches -fno-common"

    go version
    

    go version go1.11 darwin/amd64

    tesseract --version
    

    tesseract 4.0.0
     leptonica-1.76.0
      libjpeg 9c : libpng 1.6.35 : libtiff 4.0.9 : zlib 1.2.11
     Found AVX
     Found SSE

    opened by czh0318 8
  • tesseract/baseapi.h: No such file or directory

    I'd like to use tesseract with go on Windows 7.

    During the installation process, as stated in the docs I execute

    c:\go\src\proj>go get github.com/otiai10/gosseract
    # github.com/otiai10/gosseract/tesseract
    C:\go\src\github.com\otiai10\gosseract\tesseract\tess.cpp:1:31: fatal error: tesseract/baseapi.h: No such file or directory
     #include <tesseract/baseapi.h>
                                   ^
    compilation terminated.
    

    And by searching the file system for the header file baseapi.h, I cannot find it.

    How can I solve this? Thank you

    question 
    opened by tobiassoltermann 8
  • gosseract finds no text where tesseract does

    Summary

    I am running tesseract and gosseract on the same image, a single line of text. Tesseract finds the text, gosseract does not.

    Reproducibility

    Reproducibility Frequency

    • 100%
    1. Run tesseract d2.pbm - --psm 13 and it will show the output
    2. Run go run main.go and it will not show any output

    go.mod:

    module gosstest
    
    go 1.19
    
    require github.com/otiai10/gosseract/v2 v2.4.0
    

    main.go:

    package main
    
    import (
    	"fmt"
    	"os"
    
    	"github.com/otiai10/gosseract/v2"
    )
    
    func main() {
    	const (
    		want     = "BPJAZGAP"
    		filename = "d2.pbm"
    	)
    	buf, err := os.ReadFile(filename)
    	if err != nil {
    		fmt.Fprintf(os.Stderr, "error reading %q: %v\n", filename, err)
    		os.Exit(1) // nothing to recognize without the image
    	}
    
    	fmt.Fprintln(os.Stderr, gosseract.Version())
    
    	ocr := gosseract.NewClient()
    	defer ocr.Close()
    	ocr.SetPageSegMode(gosseract.PSM_RAW_LINE) // --psm 13
    	ocr.SetImageFromBytes(buf)
    	got, err := ocr.Text()
    	if err != nil {
    		fmt.Fprintf(os.Stderr, "%v\n", err)
    	}
    
    	if want != got {
    		fmt.Fprintf(os.Stderr, "want %q but got %q", want, got)
    	}
    
    	fmt.Println(got)
    }
    

    d2.pbm:

    P1 42 8
    0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    0 1 1 1 0 0 1 1 1 0 0 0 0 1 1 0 0 1 1 0 0 1 1 1 1 0 0 1 1 0 0 0 1 1 0 0 1 1 1 0 0 0
    0 1 0 0 1 0 1 0 0 1 0 0 0 0 1 0 1 0 0 1 0 0 0 0 1 0 1 0 0 1 0 1 0 0 1 0 1 0 0 1 0 0
    0 1 1 1 0 0 1 0 0 1 0 0 0 0 1 0 1 0 0 1 0 0 0 1 0 0 1 0 0 0 0 1 0 0 1 0 1 0 0 1 0 0
    0 1 0 0 1 0 1 1 1 0 0 0 0 0 1 0 1 1 1 1 0 0 1 0 0 0 1 0 1 1 0 1 1 1 1 0 1 1 1 0 0 0
    0 1 0 0 1 0 1 0 0 0 0 1 0 0 1 0 1 0 0 1 0 1 0 0 0 0 1 0 0 1 0 1 0 0 1 0 1 0 0 0 0 0
    0 1 1 1 0 0 1 0 0 0 0 0 1 1 0 0 1 0 0 1 0 1 1 1 1 0 0 1 1 1 0 1 0 0 1 0 1 0 0 0 0 0
    0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    

    Environment

    Linux chieftec 6.0.15-300.fc37.x86_64 #1 SMP PREEMPT_DYNAMIC Wed Dec 21 18:33:23 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
    
    GO111MODULE=""
    GOARCH="amd64"
    GOBIN=""
    GOCACHE="/home/jot/.cache/go-build"
    GOENV="/home/jot/.config/go/env"
    GOEXE=""
    GOEXPERIMENT=""
    GOFLAGS=""
    GOHOSTARCH="amd64"
    GOHOSTOS="linux"
    GOINSECURE=""
    GOMODCACHE="/home/jot/go/pkg/mod"
    GONOPROXY=""
    GONOSUMDB=""
    GOOS="linux"
    GOPATH="/home/jot/go"
    [project.zip](https://github.com/otiai10/gosseract/files/10347142/project.zip)
    
    GOPRIVATE=""
    GOPROXY="direct"
    GOROOT="/usr/lib/golang"
    GOSUMDB="off"
    GOTMPDIR=""
    GOTOOLDIR="/usr/lib/golang/pkg/tool/linux_amd64"
    GOVCS=""
    GOVERSION="go1.19.4"
    GCCGO="gccgo"
    GOAMD64="v1"
    AR="ar"
    CC="gcc"
    CXX="g++"
    CGO_ENABLED="1"
    GOMOD="/home/jot/work/gosseract/go.mod"
    GOWORK=""
    CGO_CFLAGS="-g -O2"
    CGO_CPPFLAGS=""
    CGO_CXXFLAGS="-g -O2"
    CGO_FFLAGS="-g -O2"
    CGO_LDFLAGS="-g -O2"
    PKG_CONFIG="pkg-config"
    GOGCCFLAGS="-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build1563688176=/tmp/go-build -gno-record-gcc-switches"
    
    go version go1.19.4 linux/amd64
    
    tesseract 5.2.0
     leptonica-1.82.0
      libgif 5.2.1 : libjpeg 6b (libjpeg-turbo 2.1.3) : libpng 1.6.37 : libtiff 4.4.0 : zlib 1.2.12 : libwebp 1.2.4
     Found AVX2
     Found AVX
     Found FMA
     Found SSE4.1
    
    opened by jhinrichsen 1
  • CI on Windows

    • https://github.com/otiai10/gosseract/issues/251
    • https://github.com/otiai10/gosseract/issues/200
    • https://github.com/otiai10/gosseract/issues/240
    • https://github.com/otiai10/gosseract/issues/199
    • https://github.com/otiai10/gosseract/issues/132
    • https://github.com/otiai10/gosseract/issues/234
    • https://github.com/otiai10/gosseract/issues/215
    • https://github.com/otiai10/gosseract/issues/233
    • https://github.com/otiai10/gosseract/issues/223
    • https://github.com/otiai10/gosseract/issues/226
    • and more
    opened by otiai10 0
  • Win11 compiler error

    Summary

    Go Compilation Error (in tessbridge.cpp:5): fatal error: leptonica/allheaders.h: No such file or directory

    Environment

    uname -a
    

    Windows 11

    go env
    

    set GO111MODULE=on
    set GOARCH=amd64
    set GOBIN=
    set GOCACHE=C:\Users\Administrator\AppData\Local\go-build
    set GOENV=C:\Users\Administrator\AppData\Roaming\go\env
    set GOEXE=.exe
    set GOEXPERIMENT=
    set GOFLAGS=
    set GOHOSTARCH=amd64
    set GOHOSTOS=windows
    set GOINSECURE=
    set GOMODCACHE=E:\Go\pkg\mod
    set GONOPROXY=
    set GONOSUMDB=
    set GOOS=windows
    set GOPATH=E:\Go
    set GOPRIVATE=
    set GOPROXY=https://goproxy.cn,direct
    set GOROOT=D:\Program Files\Go
    set GOSUMDB=sum.golang.org
    set GOTMPDIR=
    set GOTOOLDIR=D:\Program Files\Go\pkg\tool\windows_amd64
    set GOVCS=
    set GOVERSION=go1.18.3
    set GCCGO=gccgo
    set GOAMD64=v1
    set AR=ar
    set CC=gcc
    set CXX=g++
    set CGO_ENABLED=1
    set GOMOD=E:\Go\src\ERMS\go.mod
    set GOWORK=
    set CGO_CFLAGS=-g -O2
    set CGO_CPPFLAGS=
    set CGO_CXXFLAGS=-g -O2
    set CGO_FFLAGS=-g -O2
    set CGO_LDFLAGS=-g -O2
    set PKG_CONFIG=pkg-config
    set GOGCCFLAGS=-m64 -mthreads -fmessage-length=0 -fdebug-prefix-map=C:\Users\ADMINI~1\AppData\Local\Temp\go-build2580212505=/tmp/go-build -gno-record-gcc-switches

    go version
    

    go version go1.18.3 windows/amd64

    tesseract --version
    

    tesseract v5.1.0.20220510
     leptonica-1.78.0
      libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 1.5.3) : libpng 1.6.34 : libtiff 4.0.9 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.3.0
     Found AVX512BW
     Found AVX512F
     Found AVX2
     Found AVX
     Found FMA
     Found SSE4.1
     Found libarchive 3.5.0 zlib/1.2.11 liblzma/5.2.3 bz2lib/1.0.6 liblz4/1.7.5 libzstd/1.4.5
     Found libcurl/7.77.0-DEV Schannel zlib/1.2.11 zstd/1.4.5 libidn2/2.0.4 nghttp2/1.31.0

    opened by zzdboy 0
  • add a finalizer to close the client

    If the developer forgets to call the Close method after creating the client, memory leaks.

    To avoid this, I followed the approach used by os.File: a finalizer calls Close when the client becomes unreachable and the developer has not called Close.

    Test

    client.go

    // NewClient construct new Client. It's due to caller to Close this client.
    func NewClient() *Client {
    	client := &Client{
    		api:        C.Create(),
    		Variables:  map[SettableVariable]string{},
    		Trim:       true,
    		shouldInit: true,
    		Languages:  []string{"eng"},
    	}
    	// set a finalizer to close the client when it's unused and not closed by the user
    	runtime.SetFinalizer(client, (*Client).Close)
    	return client
    }
    
    // Close frees allocated API. This MUST be called for ANY client constructed by "NewClient" function.
    func (client *Client) Close() (err error) {
    	// defer func() {
    	// 	if e := recover(); e != nil {
    	// 		err = fmt.Errorf("%v", e)
    	// 	}
    	// }()
    	fmt.Println("Closed")
    	C.Clear(client.api)
    	C.Free(client.api)
    	if client.pixImage != nil {
    		C.DestroyPixImage(client.pixImage)
    		client.pixImage = nil
    	}
    	// no need for a finalizer anymore
    	runtime.SetFinalizer(client, nil)
    	return err
    }
    

    test code

    func main() {
    	runGgosseract()
    	runtime.GC() // run a garbage collection
    	time.Sleep(2 * time.Second)
    	// see "Close" before "exit"
    	fmt.Println("exit")
    }
    
    func runGgosseract() {
    	client := gosseract.NewClient()
    	client.SetImage("path/to/image.png")
    	text, _ := client.Text()
    	fmt.Println(text)
    }
    
    opened by yin1999 1
  • Fix the docker build to download project source files

    The gosseract source files were not being downloaded during the Docker build process so the go test step was failing. Setting the environment variable fixes the issue and allows correct building of the docker image.

    opened by mrisher23 1
  • failed to initialize TessBaseAPI with code -1:

    Summary

    I installed gosseract on Windows 10. I used MSYS2 with MinGW64 to install the tesseract and leptonica modules, and then installed gosseract with go get -t github.com/otiai10/gosseract/v2.

    My test project builds successfully, but at run time it fails with: failed to initialize TessBaseAPI with code -1:

    I then went to the install directory under GOPATH and ran go test, which fails with the same error: all_test.go:144 Expected to be <nil> But actual failed to initialize TessBaseAPI with code -1:

    I am stuck because no message accompanies that code. Please help, thank you!

    Reproducibility

    Reproducibility Frequency

    • 100%

    Environment

    win10

    uname -a

    
    

    go env

    set GO111MODULE=auto
    set GOARCH=amd64
    set GOBIN=
    set GOCACHE=C:\Users\langxli\AppData\Local\go-build
    set GOENV=C:\Users\langxli\AppData\Roaming\go\env
    set GOEXE=.exe
    set GOEXPERIMENT=
    set GOFLAGS=
    set GOHOSTARCH=amd64
    set GOHOSTOS=windows
    set GOINSECURE=
    set GOMODCACHE=C:\Users\langxli\go\pkg\mod
    set GONOPROXY=
    set GONOSUMDB=
    set GOOS=windows
    set GOPATH=C:\Users\langxli\go;D:\work\brick\brick_app_project
    set GOPRIVATE=
    set GOPROXY=https://proxy.golang.org,direct
    set GOROOT=C:\Program Files\Go
    set GOSUMDB=sum.golang.org
    set GOTMPDIR=
    set GOTOOLDIR=C:\Program Files\Go\pkg\tool\windows_amd64
    set GOVCS=
    set GOVERSION=go1.17
    set GCCGO=gccgo
    set AR=ar
    set CC=gcc
    set CXX=g++
    set CGO_ENABLED=1
    set GOMOD=C:\Users\langxli\go\pkg\mod\github.com\otiai10\gosseract\[email protected]\go.mod
    set CGO_CFLAGS=-g -O2
    set CGO_CPPFLAGS=
    set CGO_CXXFLAGS=-g -O2
    set CGO_FFLAGS=-g -O2
    set CGO_LDFLAGS=-g -O2
    set PKG_CONFIG=pkg-config
    set GOGCCFLAGS=-m64 -mthreads -fmessage-length=0 -fdebug-prefix-map=D:\msys64\tmp\go-build2794601340=/tmp/go-build -gno-record-gcc-switches

    go version
    # go version
    go version go1.17 windows/amd64
    
    
    

    tesseract --version

    tesseract 4.1.1
     leptonica-1.81.1
      libgif 5.2.1 : libjpeg 8d (libjpeg-turbo 2.0.6) : libpng 1.6.37 : libtiff 4.3.0 : zlib 1.2.11 : libwebp 1.2.2 : libopenjp2 2.4.0

    opened by langxlm 2
Releases(v2.3.1)