Layout Parser is a deep learning based tool for document image layout analysis tasks.

Overview

Layout Parser is a deep learning based tool for document image layout analysis tasks.

Installation

Use pip or conda to install the library:

pip install layoutparser

# Install Detectron2 for using DL Layout Detection Model
# Please make sure the PyTorch version is compatible with
# the installed Detectron2 version. 
pip install 'git+https://github.com/facebookresearch/detectron2.git#egg=detectron2' 

# Install the ocr components when necessary 
pip install "layoutparser[ocr]"

By default this installs the CPU version of Detectron2, which should run on most computers. If you have a GPU, consider installing the GPU version of Detectron2 by following the official instructions.

Quick Start

We provide a series of examples to help you start using the layoutparser library:

  1. Table OCR and Results Parsing: layoutparser can be used to conveniently OCR documents and convert the output into structured data.

  2. Deep Layout Parsing Example: With the help of deep learning, layoutparser supports the analysis of very complex documents and the processing of hierarchical structure in their layouts.

DL Assisted Layout Prediction Example

Example Usage

The images shown in the figure above are: a screenshot of this paper, an image from the PRIMA Layout Analysis Dataset, a screenshot of the WSJ website, and an image from the HJDataset.

With only 4 lines of code in layoutparser, you can unlock information from complex documents that existing tools could not provide. You can either choose a deep learning model from the ModelZoo or load a model that you trained on your own, then use the following code to predict the layout and visualize it:

>>> import layoutparser as lp
>>> model = lp.Detectron2LayoutModel('lp://PrimaLayout/mask_rcnn_R_50_FPN_3x/config')
>>> layout = model.detect(image) # You need to load the image somewhere else, e.g., image = cv2.imread(...)
>>> lp.draw_box(image, layout,) # With extra configurations

Citing layoutparser

If you find layoutparser helpful to your work, please consider citing our tool and paper using the following BibTeX entry.

@article{shen2021layoutparser,
  title={LayoutParser: A Unified Toolkit for Deep Learning Based Document Image Analysis},
  author={Shen, Zejiang and Zhang, Ruochen and Dell, Melissa and Lee, Benjamin Charles Germain and Carlson, Jacob and Li, Weining},
  journal={arXiv preprint arXiv:2103.15348},
  year={2021}
}
Comments
  • Apply detect() on readable PDF files

    Hi there, from the docs I infer that detect() operates on, for example, PIL.Image objects. Is there a way to operate directly on PDF files that are already readable (which would also obviate the need to apply OCR)? Greetings

    enhancement 
    opened by simonschoe 12
  • AttributeError: module layoutparser has no attribute Detectron2LayoutModel

    Hi,

    Thank you for this awesome program! I successfully installed layout-parser Detectron2 on my windows 10 laptop. When I run the following code:

    import layoutparser as lp
    import cv2
    from pdf2image import convert_from_bytes

    images = convert_from_bytes(open('C:\temp\ConsigneeList\Doc 4 Distribution List.pdf', 'rb').read())

    model = lp.Detectron2LayoutModel(
        config_path ='lp://PubLayNet/mask_rcnn_X_101_32x8d_FPN_3x/config', # In model catalog
        label_map = {0: "Text", 1: "Title", 2: "List", 3:"Table", 4:"Figure"}, # In model`label_map`
        extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8] # Optional
    )

    #loop through each page
    for image in images:
        ocr_agent = lp.ocr.TesseractAgent()

        image = np.array(image)

        layout = model.detect(image)

        text_blocks = lp.Layout([b for b in layout if b.type == 'Text'])

        #loop through each text box on page.
        for block in text_blocks:
            segment_image = (block
                             .pad(left=5, right=5, top=5, bottom=5)
                             .crop_image(image))
            text = ocr_agent.detect(segment_image)
            block.set(text=text, inplace=True)

        for i, txt in enumerate(text_blocks.get_texts()):
            my_file = open("OUTPUT FILE PATH/FILENAME.TXT","a+")
            my_file.write(txt)

    I get the following errors:


    AttributeError                            Traceback (most recent call last)
    in
    ----> 1 model = lp.Detectron2LayoutModel(
          2     config_path ='lp://PubLayNet/mask_rcnn_X_101_32x8d_FPN_3x/config', # In model catalog
          3     label_map = {0: "Text", 1: "Title", 2: "List", 3:"Table", 4:"Figure"}, # In model`label_map`
          4     extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8] # Optional
          5 )

    C:\ProgramData\Anaconda3\lib\site-packages\layoutparser\file_utils.py in __getattr__(self, name)
        224             value = getattr(module, name)
        225         else:
    --> 226             raise AttributeError(f"module {self.__name__} has no attribute {name}")
        227
        228         setattr(self, name, value)

    AttributeError: module layoutparser has no attribute Detectron2LayoutModel

    Any ideas on what is wrong? Thank you!!

    Sincerely,

    tom


    bug 
    opened by theiman112860 10
  • 'GCVAgent' object has no attribute '_client'

    Hi, when I was running the tutorial "OCR tables and parse the output" and tried to obtain the result:

    res = ocr_agent.detect(image, return_response=True)

    The response was

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/layoutparser/ocr/gcv_agent.py", line 168, in detect
        res = self._detect(img_content)
      File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/layoutparser/ocr/gcv_agent.py", line 134, in _detect
        response = self._client.document_text_detection(
    AttributeError: 'GCVAgent' object has no attribute '_client'

    I googled and some sites said that the Client() class was removed in Client Library v0.25.1 and replaced with ImageAnnotatorClient().

    Could this be the cause? Thank you.

    bug 
    opened by junxi-liu 8
  • Error installing dependencies

    Hi Team, Thank you for all the great work. It looks amazing. I tried running pip install layoutparser, but it threw the error below. Can you please let me know how to fix this?

    ERROR: Command errored out with exit status 1: command: 'C:\Program Files\Anaconda\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-install-s13j7o41\pycocotools_6c1fc2cce84542a8be1c0cbeacfda632\setup.py'"'"'; file='"'"'C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-install-s13j7o41\pycocotools_6c1fc2cce84542a8be1c0cbeacfda632\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' bdist_wheel -d 'C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-wheel-awmfv0cr' cwd: C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-install-s13j7o41\pycocotools_6c1fc2cce84542a8be1c0cbeacfda632
    Complete output (22 lines): running bdist_wheel running build running build_py creating build creating build\lib.win-amd64-3.8 creating build\lib.win-amd64-3.8\pycocotools copying pycocotools\coco.py -> build\lib.win-amd64-3.8\pycocotools copying pycocotools\cocoeval.py -> build\lib.win-amd64-3.8\pycocotools copying pycocotools\mask.py -> build\lib.win-amd64-3.8\pycocotools copying pycocotools_init_.py -> build\lib.win-amd64-3.8\pycocotools running build_ext cythoning pycocotools/_mask.pyx to pycocotools_mask.c C:\Users\pss.ch\AppData\Roaming\Python\Python38\site-packages\Cython\Compiler\Main.py:369: FutureWarning: Cython directive 'language_level' not set, using 2 for now (Py2). This will change in a later release! File: C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-install-s13j7o41\pycocotools_6c1fc2cce84542a8be1c0cbeacfda632\pycocotools_mask.pyx tree = Parsing.p_module(s, pxd, full_module_name) building 'pycocotools._mask' extension creating build\temp.win-amd64-3.8 creating build\temp.win-amd64-3.8\Release creating build\temp.win-amd64-3.8\Release\common creating build\temp.win-amd64-3.8\Release\pycocotools C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\BIN\x86_amd64\cl.exe /c /nologo /Ox /W3 /GL /DNDEBUG /MD -IC:\Users\pss.ch\AppData\Roaming\Python\Python38\site-packages\numpy\core\include -I./common "-IC:\Program Files\Anaconda\include" "-IC:\Program Files\Anaconda\include" "-IC:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\INCLUDE" "-IC:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\ATLMFC\INCLUDE" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.10240.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\8.1\include\shared" "-IC:\Program Files (x86)\Windows Kits\8.1\include\um" "-IC:\Program Files (x86)\Windows Kits\8.1\include\winrt" /Tc./common/maskApi.c /Fobuild\temp.win-amd64-3.8\Release./common/maskApi.obj -Wno-cpp -Wno-unused-function -std=c99 cl : Command line error D8021 : invalid numeric argument '/Wno-cpp' error: 
command 'C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\BIN\x86_amd64\cl.exe' failed with exit status 2

    ERROR: Failed building wheel for pycocotools ERROR: Command errored out with exit status 1: command: 'C:\Program Files\Anaconda\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-install-s13j7o41\pycocotools_6c1fc2cce84542a8be1c0cbeacfda632\setup.py'"'"'; file='"'"'C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-install-s13j7o41\pycocotools_6c1fc2cce84542a8be1c0cbeacfda632\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' install --record 'C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-record-w4euj5sb\install-record.txt' --single-version-externally-managed --user --prefix= --compile --install-headers 'C:\Users\pss.ch\AppData\Roaming\Python\Python38\Include\pycocotools' cwd: C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-install-s13j7o41\pycocotools_6c1fc2cce84542a8be1c0cbeacfda632
    Complete output (20 lines): running install running build running build_py creating build creating build\lib.win-amd64-3.8 creating build\lib.win-amd64-3.8\pycocotools copying pycocotools\coco.py -> build\lib.win-amd64-3.8\pycocotools copying pycocotools\cocoeval.py -> build\lib.win-amd64-3.8\pycocotools copying pycocotools\mask.py -> build\lib.win-amd64-3.8\pycocotools copying pycocotools_init_.py -> build\lib.win-amd64-3.8\pycocotools running build_ext skipping 'pycocotools_mask.c' Cython extension (up-to-date) building 'pycocotools._mask' extension creating build\temp.win-amd64-3.8 creating build\temp.win-amd64-3.8\Release creating build\temp.win-amd64-3.8\Release\common creating build\temp.win-amd64-3.8\Release\pycocotools C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\BIN\x86_amd64\cl.exe /c /nologo /Ox /W3 /GL /DNDEBUG /MD -IC:\AppData\Roaming\Python\Python38\site-packages\numpy\core\include -I./common "-IC:\Program Files\Anaconda\include" "-IC:\Program Files\Anaconda\include" "-IC:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\INCLUDE" "-IC:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\ATLMFC\INCLUDE" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.10240.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\8.1\include\shared" "-IC:\Program Files (x86)\Windows Kits\8.1\include\um" "-IC:\Program Files (x86)\Windows Kits\8.1\include\winrt" /Tc./common/maskApi.c /Fobuild\temp.win-amd64-3.8\Release./common/maskApi.obj -Wno-cpp -Wno-unused-function -std=c99 cl : Command line error D8021 : invalid numeric argument '/Wno-cpp' error: command 'C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\BIN\x86_amd64\cl.exe' failed with exit status 2 ---------------------------------------- ERROR: Command errored out with exit status 1: 'C:\Program Files\Anaconda\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = 
'"'"'C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-install-s13j7o41\pycocotools_6c1fc2cce84542a8be1c0cbeacfda632\setup.py'"'"'; file='"'"'C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-install-s13j7o41\pycocotools_6c1fc2cce84542a8be1c0cbeacfda632\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' install --record 'C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-record-w4euj5sb\install-record.txt' --single-version-externally-managed --user --prefix= --compile --install-headers 'C:\AppData\Roaming\Python\Python38\Include\pycocotools' Check the logs for full command output.

    opened by sriprad 8
  • enforce_cpu not working

    With enforce_cpu set to True, the model still uses CUDA instead of the CPU. I think it is due to this line: https://github.com/Layout-Parser/layout-parser/blob/e035fc8f952addc620670e5b47864fe213db0e10/src/layoutparser/models/layoutmodel.py#L120

    Possible fix could be cfg.MODEL.DEVICE = "cuda" if torch.cuda.is_available() and (not enforce_cpu) else "cpu"
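
    The condition in the proposed fix can be checked in isolation. The sketch below is not the library's code: select_device is a hypothetical helper, and the cuda_available argument stands in for torch.cuda.is_available() so the logic can be run without PyTorch installed.

```python
def select_device(cuda_available: bool, enforce_cpu: bool) -> str:
    """Return "cuda" only when a GPU is available and CPU use is not enforced."""
    # Hypothetical helper mirroring the one-line fix suggested above;
    # cuda_available stands in for torch.cuda.is_available().
    return "cuda" if cuda_available and not enforce_cpu else "cpu"


# Enumerate all four combinations of the two flags.
for cuda_available in (True, False):
    for enforce_cpu in (True, False):
        device = select_device(cuda_available, enforce_cpu)
        print(f"cuda_available={cuda_available} enforce_cpu={enforce_cpu} -> {device}")
```

    Under this condition, "cuda" is chosen only in the single case where a GPU is present and enforce_cpu is False.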

    bug 
    opened by lkluo 5
  • Adding support for mathematical formula recognition

    Have you considered adding support for mathematical formula recognition? Identifying the position of mathematical formulas in documents has always been a problem.

    modeling 
    opened by SleepyCelery 5
  • draw_box draw only one box from layout

    Describe the bug I just installed everything according to the installation guide and launched your Jupyter notebook, the Deep Layout Parsing Example. After the first draw_box call it shows only one box, but print(layout) shows all the boxes. The same happens with the second draw_box call from your guide. I'm not sure what I'm doing wrong.

    To Reproduce Steps to reproduce the behavior:

    1. installation guide + detectron2 install also from your guide
    2. Run jupyter notebook

    Environment

    1. MacOS
    2. VS Code
    3. Here some stuff from pip:
    torch==1.11.0
    torchvision==0.12.0
    Pillow==9.1.0
    opencv-python==4.5.5.64
    layoutparser==0.3.3
    

    Error traceback No errors; the behaviour just differs from what this guide and other guides show.

    Screenshots attached

    bug 
    opened by Moo1234567 4
  • Gives wrong results when the code is run for some images in a loop

    The code works when it is run on a single image. But when I run the same code in a loop over a few images from the PubLayNet dataset, cached results seem to be applied (i.e. the bounding boxes overlap, and the boxes from previous images are also drawn on the current image).

    opened by surajsubramanian 4
  • ImportError: cannot import name 'is_directory' from 'PIL._util' (/usr/local/lib/python3.7/dist-packages/PIL/_util.py)

    While using this code, I get this error of Pillow. I tried re-installing pillow but still struggling with this issue. Any help to make this code run?

    import layoutparser as lp
    model = lp.Detectron2LayoutModel(
                config_path ='lp://PubLayNet/faster_rcnn_R_50_FPN_3x/config', # In model catalog
                label_map   ={0: "Text", 1: "Title", 2: "List", 3:"Table", 4:"Figure"}, # In model`label_map`
                extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8] # Optional
            )
    model.detect(image)
    

    Getting this error:

    ImportError                               Traceback (most recent call last)
    <ipython-input-6-59f0fb07b7e3> in <module>
          1 import layoutparser as lp
    ----> 2 model = lp.Detectron2LayoutModel(
          3             config_path ='lp://PubLayNet/faster_rcnn_R_50_FPN_3x/config', # In model catalog
          4             label_map   ={0: "Text", 1: "Title", 2: "List", 3:"Table", 4:"Figure"}, # In model`label_map`
          5             extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8] # Optional
    
    31 frames
    /usr/local/lib/python3.7/dist-packages/PIL/ImageFont.py in <module>
         35 from . import Image
         36 from ._deprecate import deprecate
    ---> 37 from ._util import is_directory, is_path
         38 
         39 
    
    ImportError: cannot import name 'is_directory' from 'PIL._util' (/usr/local/lib/python3.7/dist-packages/PIL/_util.py)
    
    
    opened by arhamshah 3
  • TypeError: inner() got an unexpected keyword argument 'image_context'

    Hello! Recently encountered an issue when trying to use Google's OCR when running ocr_agent.detect

    Running this:

    image = cv2.imread("/Users/liz/Documents/Projects/LayoutParser/test2.png")
    ocr_agent = lp.GCVAgent.with_credential("/Users/liz/Documents/Projects/Keys/GoogleCloud/vision-341523-e3cbd0df8d19.json",languages = ['en'])
    res = ocr_agent.detect(image, return_response=True)
    

    Gives me the following error:

    TypeError                                 Traceback (most recent call last)
    <ipython-input-9-76614ef6a3e8> in <module>
          1 image = cv2.imread("/Users/liz/Documents/Projects/LayoutParser/test2.png")
          2 ocr_agent = lp.GCVAgent.with_credential("/Users/liz/Documents/Projects/Keys/GoogleCloud/vision-341523-e3cbd0df8d19.json",languages = ['en'])
    ----> 3 res = ocr_agent.detect(image, return_response=True)
          4 
          5 #layout = ocr_agent.gather_full_text_annotation(res, agg_level=lp.GCVFeatureType.WORD)
    
    /opt/homebrew/Caskroom/miniforge/base/envs/data310/lib/python3.9/site-packages/layoutparser/ocr.py in detect(self, image, return_response, return_only_text, agg_output_level)
        222                 img_content = image_file.read()
        223 
    --> 224         res = self._detect(img_content)
        225 
        226         if return_response:
    
    /opt/homebrew/Caskroom/miniforge/base/envs/data310/lib/python3.9/site-packages/layoutparser/ocr.py in _detect(self, img_content)
        188     def _detect(self, img_content):
        189         img_content = self._vision.types.Image(content=img_content)
    --> 190         response = self._client.document_text_detection(
        191             image=img_content, image_context=self._context
        192         )
    
    TypeError: inner() got an unexpected keyword argument 'image_context'
    

    I'm not sure what causes it. It might be user error, but I haven't been able to find anything else about it and I've tried everything I can think of (all the packages are up to date or, in Google Cloud Vision's case, downgraded to stay on the old API). Thanks!

    bug 
    opened by liz-goodwin 3
  • bad result detected

    I got a bad result using layout-parser. Here is the image I used: 1

    Here is the code I ran in Python:

    image = cv2.imread("1.png")
    # Convert the image from BGR (cv2 default loading style)
    # to RGB
    image = image[..., ::-1]
    origin_image = image.copy()
    
    model = lp.Detectron2LayoutModel('lp://PubLayNet/mask_rcnn_R_50_FPN_3x/config', 
                                 extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8],
                                 label_map={0: "Text", 1: "Title", 2: "List", 3:"Table", 4:"Figure"})
    # Load the deep layout model from the layoutparser API 
    # For all the supported model, please check the Model 
    # Zoo Page: https://layout-parser.readthedocs.io/en/latest/notes/modelzoo.html
    
    layout = model.detect(image)
    # print("layout : ", layout)
    # Detect the layout of the input image
    text_blocks = lp.Layout([b for b in layout if b.type=='Text'])
    drawRectangleInImage(origin_image, text_blocks, (36,255,12))
    
    titles_blocks = lp.Layout([b for b in layout if b.type=='Title'])
    drawRectangleInImage(origin_image, titles_blocks, (76, 155, 175))
    
    figure_blocks = lp.Layout([b for b in layout if b.type=='Figure'])
    drawRectangleInImage(origin_image, figure_blocks, (122, 96, 216))
    
    lists_blocks = lp.Layout([b for b in layout if b.type=='List'])
    drawRectangleInImage(origin_image, lists_blocks, (176, 155, 175))
    
    tables_blocks = lp.Layout([b for b in layout if b.type=='Table'])
    drawRectangleInImage(origin_image, tables_blocks, (76, 255, 75))
    
    cv2.imshow('image', origin_image)
    cv2.waitKey()
    

    Here is the result:

    Screenshot 2022-01-18 11 45 06

    By the way, some warnings were also generated:

    /usr/local/lib/python3.9/site-packages/detectron2/structures/image_list.py:99: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
      max_size = (max_size + (stride - 1)) // stride * stride
    /usr/local/lib/python3.9/site-packages/torch/functional.py:445: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:2157.)
      return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]

    bug 
    opened by DamonsJ 3
  • Any idea about Detectron gets overlapping and sometimes misses some blocks

    The problem I am currently using layout-parser to detect the blocks on scanned book pages, and I am trying to extract each block from the page separately for further processing.

    Checklist

    To Reproduce

    import layoutparser as lp
    import cv2
    
    image = cv2.imread("/content/image_0.jpg")
    # Convert the image from BGR (cv2 default loading style) to RGB
    image = image[..., ::-1]
    
    model = lp.Detectron2LayoutModel('lp://PrimaLayout/mask_rcnn_R_50_FPN_3x/config',
                                     extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8],
                                     label_map={1:"TextRegion", 2:"ImageRegion", 3:"TableRegion", 4:"MathsRegion", 5:"SeparatorRegion", 6:"OtherRegion"})
    
    
    # Detect the layout of the input image
    layout = model.detect(image)
    
    # Show the detected layout of the input image
    lp.draw_box(image, layout, box_width=3)
    

    Environment

    1. Platform [Linux] (on colab)
    2. Installation commands
    !sudo apt-get update
    !sudo apt-get install libleptonica-dev tesseract-ocr libtesseract-dev python3-pil tesseract-ocr-eng tesseract-ocr-script-latn
    !pip install layoutparser	
    !pip install layoutparser torchvision && pip install "git+https://github.com/facebookresearch/[email protected]#egg=detectron2"	
    !pip install "layoutparser[ocr]"	
    !pip install "layoutparser[layoutmodels]" # Install DL layout model toolkit 
    

    Screenshots

    1- Overlapping: image_3

    2- Missing: image_7

    I know this may not be the right place to raise this issue, but I think you may have an idea about the problem.

    bug 
    opened by rrrokhtar 0
  • [Bug] has_torch_function_variadic error

    Describe the bug When attempting to initialise a model (I've tried with AutoLayoutModel and Detectron2LayoutModel), torch.jit throws a RuntimeError as below...

    RuntimeError: 
    undefined value has_torch_function_variadic:
      File "/opt/conda/lib/python3.8/site-packages/torch/utils/smdebug.py", line 2962
             >>> loss.backward()
        """
        if has_torch_function_variadic(input, target, weight, pos_weight):
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
            return handle_torch_function(
                binary_cross_entropy_with_logits,
    'binary_cross_entropy_with_logits' is being compiled since it was called from 'sigmoid_focal_loss'
      File "/opt/conda/lib/python3.8/site-packages/fvcore/nn/focal_loss.py", line 34
        """
        p = torch.sigmoid(inputs)
        ce_loss = F.binary_cross_entropy_with_logits(inputs, targets, reduction="none")
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
        p_t = p * targets + (1 - p) * (1 - targets)
        loss = ce_loss * ((1 - p_t) ** gamma)
    

    To Reproduce Steps to reproduce the behavior:

    1. Install layout-parser, OpenCV, Detectron2 as below
    %pip install opensearch-py opencv-python --quiet
    %pip install -U layoutparser[ocr] --quiet
    !python -m pip install detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cpu/torch1.10/index.html
    
    2. Import layoutparser and attempt to init model with lp.models.Detectron2LayoutModel(...)
    3. Error appears

    Environment Linux with layoutparser latest

    bug 
    opened by lucafrost 0
  • cannot import name 'is_directory' from 'PIL._util'(lp.Detectron2LayoutModel)

    Describe the bug When I tried the sample codes:

    !pip install layoutparser
    !pip install 'git+https://github.com/facebookresearch/[email protected]#egg=detectron2'
    
    import layoutparser as lp
    import cv2
    import PIL
    
    image = cv2.imread("image.png")
    model = lp.Detectron2LayoutModel('lp://PubLayNet/faster_rcnn_R_50_FPN_3x/config')
    layout = model.detect(image)
    

    Colab link(Python 3.8.16): https://colab.research.google.com/drive/1lb8_Pcw8_NNdeKPL80HOYca8gaCB0f-E?usp=sharing

    I got an error on this line:

    lp.Detectron2LayoutModel('lp://PubLayNet/faster_rcnn_R_50_FPN_3x/config')

    The error message is:

    ImportError: cannot import name 'is_directory' from 'PIL._util' (/usr/local/lib/python3.8/dist-packages/PIL/_util.py)

    I hope that I can get your help. Thanks!

    bug 
    opened by sudoghut 0
  • [Fix] reduce memory consumption and close pdf stream after usage

    Flushes the pages and pdf afterwards to reduce the memory/ram consumption.

    Opens the pdf stream as a context manager so that the file is closed afterwards.

    opened by jakobnrmnn 0
  • Minor installation instruction error

    On Mac, the command

    pip3 install -U layoutparser[ocr]
    

    doesn't work (returns "zsh: no matches found: layoutparser[ocr]"), you need to do

    pip3 install -U "layoutparser[ocr]"
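
    The likely reason is that zsh treats the unquoted square brackets as a glob character class and reports an error when no file name matches. A minimal sketch of that reading, using only the Python standard library (fnmatch stands in for the shell's pattern matching here, which is an approximation rather than zsh's exact rules):

```python
import fnmatch
import shlex

requirement = "layoutparser[ocr]"

# Read as a glob, "[ocr]" is a character class that matches exactly one of
# the characters o, c or r, so the pattern does not match its own literal text.
matches_literal = fnmatch.fnmatchcase(requirement, requirement)
matches_suffix = fnmatch.fnmatchcase("layoutparsero", requirement)
print(matches_literal, matches_suffix)

# shlex.quote adds the quoting a POSIX-style shell needs to pass the
# requirement through unchanged (it uses single quotes, which work the same
# way as the double quotes in the command above for this string).
quoted = shlex.quote(requirement)
print(f"pip3 install -U {quoted}")
```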
    
    bug 
    opened by bholtdwyer 0
Releases(v0.3.4)
  • v0.3.4(Apr 6, 2022)

    Bug fixes

    • fix one critical bug for visualization mentioned in #131 by @lolipopshock in https://github.com/Layout-Parser/layout-parser/pull/132

    Full Changelog: https://github.com/Layout-Parser/layout-parser/compare/v0.3.3...v0.3.4

    Source code(tar.gz)
    Source code(zip)
  • v0.3.3(Apr 3, 2022)

    Functional Updates

    • Robust pdf loading for empty pages by @lolipopshock in https://github.com/Layout-Parser/layout-parser/pull/115
    • fix to issue #94 -- avoiding TesseractAgent.detect() inferring any sequence of digit as float by @k-for-code in https://github.com/Layout-Parser/layout-parser/pull/95
    • Better layout comparison by @lolipopshock in https://github.com/Layout-Parser/layout-parser/pull/128
    • Better visualization functions by @lolipopshock in https://github.com/Layout-Parser/layout-parser/pull/129
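
    The fix for #94 listed above concerns recognized text that consists only of digits being interpreted as a float. The sketch below only illustrates why that interpretation loses information. It does not reproduce the change in the pull request, and the TSV content and column names are made up for the example.

```python
import csv
import io

# Made-up OCR output in TSV form; the column names are illustrative only.
tsv = "conf\ttext\n96\t00123\n91\t2021\n"

rows = list(csv.DictReader(io.StringIO(tsv), delimiter="\t"))

# Keeping the column as text preserves exactly what was recognized.
as_text = [row["text"] for row in rows]

# Converting digit-only strings to float changes the content:
# leading zeros disappear and a decimal point is introduced.
as_float = [str(float(row["text"])) for row in rows]

print(as_text)
print(as_float)
```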

    Example Updates

    • Minor update to Deep Learning Parser example notebook by @Jim-Salmons in https://github.com/Layout-Parser/layout-parser/pull/56
    • Set inplace to True in sorting function by @yusanshi in https://github.com/Layout-Parser/layout-parser/pull/104
    • Add notebook for customizing LayoutParser Models with Label Studio Annotation by @lolipopshock in https://github.com/Layout-Parser/layout-parser/pull/124

    New Contributors

    • @Jim-Salmons made their first contribution in https://github.com/Layout-Parser/layout-parser/pull/56
    • @yusanshi made their first contribution in https://github.com/Layout-Parser/layout-parser/pull/104
    • @k-for-code made their first contribution in https://github.com/Layout-Parser/layout-parser/pull/95

    Full Changelog: https://github.com/Layout-Parser/layout-parser/compare/v0.3.2...v0.3.3

    Source code(tar.gz)
    Source code(zip)
  • v0.3.2(Sep 23, 2021)

    Important fixes for multibackend layout model support:

    • Resolves the issues mentioned in #78 with other fixes to improve the multibackend layout model support #79
    • Better tests for different backends #79 for preventing future related issues
    Source code(tar.gz)
    Source code(zip)
  • v0.3.1(Sep 15, 2021)

    • Fixes for automatically setting label_map in Detectron2LayoutModel #75
    • Remove unnecessary class annotations (that might break Python 3.6 users) #75
    Source code(tar.gz)
    Source code(zip)
  • v0.3.0(Sep 13, 2021)

    We are excited to release LayoutParser v0.3.0, with a lot of exciting updates and functional improvements.

    New Features

    • The biggest change in this version is that LayoutParser now supports multiple deep learning backends: Detectron2, effdet, and paddledetection. This allows for more flexible usage of the layoutparser library, and makes it easier for implementing customized layout models in the future. #54 #67
    • Additionally, the newly added AutoModel and improved model configuration parsing make it easier to load and use the layout detection models. #69
      • e.g., model = lp.AutoLayoutModel("lp://efficientdet/PubLayNet").
    • To support this multi-backend framework, we implement the dynamic importing mechanism as well as better ways for installing layoutparser and the needed dependencies (see instructions). #65 #68
    • And now layoutparser supports directly loading PDF files as layout objects: #71
      import layoutparser as lp
      pdf_layout, pdf_images = lp.load_pdf("path/to/pdf", load_images=True)
      lp.draw_box(pdf_images[0], pdf_layout[0])
      
    • To support more flexible processing of the layout objects, a set of new toolkits are available: #72
      import layoutparser as lp
      page_layout = lp.load_pdf("tests/fixtures/io/example.pdf")[0]
      pdf_lines = lp.simple_line_detection(page_layout)
      

    New Models

    • Add MFD model that can detect (display) equation regions within scientific documents #59
    Source code(tar.gz)
    Source code(zip)
  • v0.2.0(Apr 12, 2021)

    Layout Parser v0.2.0 Release Notes

    New Features

    1. Support for loading and exporting layout data in JSON and CSV; see #6
    2. Add support for union and intersect operations; see #20 and the detailed explanation
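
To make the two operations concrete, here is a minimal sketch on plain axis-aligned boxes. The `Box` class below is hypothetical and written only for illustration; it is not the library's implementation, and layoutparser's own behaviour for other shapes or for non-overlapping elements may differ (see #20).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Box:
    """Hypothetical axis-aligned box: (x1, y1) is top-left, (x2, y2) is bottom-right."""
    x1: float
    y1: float
    x2: float
    y2: float

    def union(self, other: "Box") -> "Box":
        # Smallest box that covers both inputs.
        return Box(
            min(self.x1, other.x1),
            min(self.y1, other.y1),
            max(self.x2, other.x2),
            max(self.y2, other.y2),
        )

    def intersect(self, other: "Box") -> Optional["Box"]:
        # Region shared by both inputs, or None when they do not overlap.
        x1, y1 = max(self.x1, other.x1), max(self.y1, other.y1)
        x2, y2 = min(self.x2, other.x2), min(self.y2, other.y2)
        if x1 >= x2 or y1 >= y2:
            return None
        return Box(x1, y1, x2, y2)


a = Box(0, 0, 10, 10)
b = Box(5, 5, 20, 15)
print(a.union(b))
print(a.intersect(b))
```

Returning `None` for disjoint boxes is a choice made for this sketch; an implementation could equally return an empty box or raise an error.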

    Improvements

    1. Functional improvements:
      1. When loading Layout Parser official models, Detectron2LayoutModel can automatically detect the label_map. For example,

        model = lp.Detectron2LayoutModel("lp://HJDataset/faster_rcnn_R_50_FPN_3x/config")
        model.label_map
        # {1: 'Page Frame', ... }
        
      2. Detectron2LayoutModel now supports the enforce_cpu flag, which forces the model to run on the CPU even when CUDA devices are available.

      3. visualization.draw_box now supports a show_element_type flag that shows the bbox category name at the top-left corner of each layout object.

    2. Improve installation command and documentation, especially for installing Detectron2 on Windows platforms #25

    New Models

    1. Add the TableBank detection models, which identify table regions

    Fixes

    1. Fix the incorrect layout issue mentioned in #9 - Thanks to @remidbs.
    2. Fix some of the dependency issues mentioned in #11 and #13 by using iopath instead of fvcore. See #18. Thanks to @edisongustavo.
  • v0.1.3(Dec 21, 2020)

    Improvements:

    • Supports lazy loading for the Detectron2 module. Now the dependency for Detectron2 will be requested only when you explicitly create a Detectron2LayoutModel object. This might be helpful for using the plain layoutparser library without installing the Detectron2 module.
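
The general idea of lazy loading can be sketched as follows. This is an illustration of the technique only, not layoutparser's actual code: the `LazyModule` class is hypothetical, and the standard-library `json` module stands in for a heavy optional dependency such as Detectron2.

```python
import importlib


class LazyModule:
    """Hypothetical proxy that imports a module only when it is first used."""

    def __init__(self, name):
        self._name = name
        self._module = None

    def __getattr__(self, attr):
        # Called only for attributes not found on the proxy itself,
        # so the real import happens on first use of the module.
        if self._module is None:
            self._module = importlib.import_module(self._name)
        return getattr(self._module, attr)


# "json" is a stand-in for an optional dependency.
lazy_json = LazyModule("json")
print(lazy_json._module is None)   # nothing has been imported through the proxy yet
print(lazy_json.dumps({"a": 1}))   # first attribute access triggers the import

# Creating a proxy for a missing package does not fail;
# the error appears only when the package is actually needed.
missing = LazyModule("package_that_is_not_installed")
```

With this pattern, a missing optional dependency surfaces as an `ImportError` at the point of use rather than when the main package is imported.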

    New models:

    • Incorporated a pre-trained model based on the NewspaperNavigator dataset: lp://NewspaperNavigator/faster_rcnn_R_50_FPN_3x/config

    Fixes:

    • Corrected a bug in visualization that might overwrite the original image
  • v0.1.2(Oct 30, 2020)

    In this version, we released a new model for PubLayNet and made several improvements:

    1. We released the mask_rcnn_X_101_32x8d_FPN_3x model trained on the PubLayNet dataset. Note: it has been trained on the full training set (while the others are only trained on the validation set), and you can expect a 15% performance improvement with this new model.
    2. We improved the support for PIL images in both layout modeling and visualization
    3. We improved the default language settings for the Tesseract OCR model
  • v0.1.1(Jul 16, 2020)

    Fixes

    • Fixed a bug that could cause errors in loading Prima Models

    Updates

    • Update the Prima Mask RCNN model to a more accurate version, and list detailed evaluation reports.
  • v0.1.0(Jun 24, 2020)

    layoutparser now supports the following functionalities:

    • Coordinate system:

      • Supports the 3 basic coordinate systems and their geometric relationships
      • Supports the TextBlock and Layout system for convenient coordinate and text processing
    • OCR System:

      • Supports OCR based on Google Cloud Vision and Tesseract API.
    • Layout Modeling:

      • Supports using pre-trained deep learning models for layout object detection with Detectron2
    • Visualization:

      • Supports highly-customizable presentation of the box coordinates and text in the detected layout