This repository provides a set functions to extract paragraphs from AWS Textract responses.

Overview

extract-paragraphs-with-aws-textract

Since AWS Textract (the AWS OCR service) does not have a native function to extract paragraphs, this repository provides a set of Python 3.X functions built on top of the AWS Python SDK (boto3) to extract paragraphs from AWS Textract responses.

PLEASE NOTE THAT:

  1. It is assumed that your client has the neccesary IAM permissions to access the different AWS resources required.
  2. Since AWS Textract analyze PDF files by running asynchronous operations, the current version assumes that you've already created an s3 bucket and that the PDF files are already stored there. If not, please go to the boto3 docs to know how to create a bucket as well as upload files.
  3. The paragraph_constructor is an ad hoc function for my use case. You may have to adapt it based on the space between lines in your data.

UPCOMING FEATURES:

  • Address abstract cases with the paragrpah_constructor function.
  • Export data in different formats.
  • AWS CloudFormation template for a serverless architecture to execute the functions when a new object is uploaded in your S3 bucket.

Please feel free to suggest new features or improvements to the current code. <3

Owner
Juan Anzola
Juan Anzola
A Discord bot written in Python that can be used to control event management on a server.

Event Management Discord Bot A Discord bot written in Python that can be used to control event management on a Discord server. Made originally for GDS

Suvaditya Mukherjee 2 Dec 07, 2021
Photogrammetry Web API

OpenScanCloud Photogrammetry Web API Overview / Outline: The OpenScan Cloud is intended to be a decentralized, open and free photogrammetry web API. T

Thomas 86 Jan 05, 2023
A python to scratch API connector. Can fetch data from the API and send it back in cloud variables.

Scratch2py Scratch2py or S2py is a easy to use, versatile tool to communicate with the Scratch API Based of scratchclient by Raihan142857 Installation

20 Jun 18, 2022
The Sue Gray Alert System was a 5 minute project that just beeps every time a new article is updated or published on Gov.UK's news pages.

The Sue Gray Alert System was a 5 minute project that just beeps every time a new article is updated or published on Gov.UK's news pages.

Dafydd 1 Jan 31, 2022
Python bindings for LibreTranslate

Python bindings for LibreTranslate

Argos Open Tech 42 Jan 03, 2023
A script to automatically update bot status at GitHub as well as in Telegram channel.

Support BotStatus ~ A simple & short repository to show your bot's status in your GitHub README.md file as well as in you channel. ⚠️ This repo should

Jainam Oswal 55 Dec 13, 2022
A tiktok mass account creator with undetected selenium and email verification, to bot an account

⚠️ STILL UNDER DEVELOPEMENT - v1.1-beta ⚠️ Adding PROXY ROTATION Adding EMAIL VERIFICATION Adding USERNAME COMPILER Tiktok Mass Bot Creator v1.1-beta

xtekky 11 Aug 01, 2022
Automated endpoint management for Amazon Aurora Global Database

This sample code can be used to manage Aurora global database endpoints. After failover the global database writer endpoints swap from one region to the other. This solution automates creation and ma

AWS Samples 13 Dec 08, 2022
SpaceManJax's open-source Discord Bot. Now on Github!

SpaceManBot This is SpaceManJax's open-source Discord.py Bot. Now on Github! This bot runs on Repl.it, which is a free online code editor. It can do a

Jack 1 Nov 16, 2021
Url-shortener - A url shortener made in python using the API's from the pyshorteners lib

URL Shortener Um encurtador de link feito em python usando as API's da lib pysho

Spyware 3 Jan 07, 2022
Hellomotoot - PSTN Mastodon Client using Mastodon.py and the Twilio API

Hello MoToot PSTN Mastodon Client using Mastodon.py and the Twilio API. Allows f

Lorenz Diener 9 Nov 22, 2022
This bot will send you an email or notify you via telegram & discord if dolar/lira parity breaks a record.

Dolar Rekor Kırdı Mı? This bot will send you an email or notify you via Telegram & Discord if Dolar/Lira parity breaks a record. Mailgun can be used a

Yiğit Göktuğ Budanur 2 Oct 14, 2021
Balanced API library in python.

Balanced Online Marketplace Payments v1.x requires Balanced API 1.1. Use v0.x for Balanced API 1.0. Installation pip install balanced Usage View Bala

Balanced 70 Oct 04, 2022
This tool helps users selecting items from the Gwennen gambling trade (based on prices of the uniques).

Gwennen Gambler This small program will check each item in the Gwennen shop (item gamble) according and show small stats according to poe.ninja. Shoul

9 Apr 10, 2022
A secure and customizable bot for controlling cross-server announcements and interactions within Discord

DiscordBot A secure and customizable bot for controlling cross-server announcements and interactions within Discord. Within the code of the bot, you c

Jacob Dorfmeister 1 Jan 22, 2022
This Bot Can Upload Video from Link Of Pdisk to Pdisk using its API. @PredatorHackerzZ

𝐏𝐝𝐢𝐬𝐤 𝐂𝐨𝐧𝐯𝐞𝐫𝐭𝐞𝐫 𝐁𝐨𝐭 Make short link by using 𝐏𝐝𝐢𝐬𝐤 API key Installation 𝐓𝐡𝐞 𝐄𝐚𝐬𝐲 𝐖𝐚𝐲 𝐑𝐞𝐪𝐮𝐢𝐫𝐞𝐝 𝐕𝐚𝐫𝐢𝐚𝐛𝐥𝐞

ρяє∂αтσя 25 Dec 02, 2022
A google search telegram bot.

Google-Search-Bot A google search telegram bot. Made with Python3 (C) @FayasNoushad Copyright permission under MIT License License - https://github.c

Fayas Noushad 37 Nov 24, 2022
An unofficial API for lyricsfreak.com using django and django rest framework.

An unofficial API for lyricsfreak.com using django and django rest framework.

Hesam Norin 1 Feb 09, 2022
A simple use library for bot discord.py developers

Discord Bot Template It's a simple use library for bot discord.py developers. Ob

Tir Omar 0 Oct 16, 2022
Trading bot rienforcement with python

Trading_bot_rienforcement System: Ubuntu 16.04 GPU (GeForce GTX 1080 Ti) Instructions: In order to run the code: Make sure to clone the stable baselin

1 Oct 22, 2021