This repository provides a set of functions to extract paragraphs from AWS Textract responses.

Overview

extract-paragraphs-with-aws-textract

Since AWS Textract (the AWS OCR service) does not have a native function to extract paragraphs, this repository provides a set of Python 3.X functions built on top of the AWS Python SDK (boto3) to extract paragraphs from AWS Textract responses.
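Because Textract processes PDF files through asynchronous jobs, a caller typically starts a job, polls until it finishes, and then pages through the results. The following is a minimal sketch of that flow, not this repository's actual code. The function name `fetch_textract_blocks` and its parameters are hypothetical; the two client methods and the `JobStatus` / `NextToken` fields are the ones exposed by the boto3 Textract client.

```python
import time


def fetch_textract_blocks(client, bucket, key, poll_seconds=5.0, max_polls=120):
    """Start an asynchronous text-detection job and return all of its blocks.

    `client` is assumed to behave like boto3.client("textract").
    The function name and the polling parameters are illustrative only.
    """
    job = client.start_document_text_detection(
        DocumentLocation={"S3Object": {"Bucket": bucket, "Name": key}}
    )
    job_id = job["JobId"]

    # Poll until the job leaves the IN_PROGRESS state or we give up.
    for _ in range(max_polls):
        response = client.get_document_text_detection(JobId=job_id)
        if response["JobStatus"] != "IN_PROGRESS":
            break
        time.sleep(poll_seconds)
    else:
        raise TimeoutError(f"Textract job {job_id} did not finish in time")

    if response["JobStatus"] not in ("SUCCEEDED", "PARTIAL_SUCCESS"):
        raise RuntimeError(f"Textract job {job_id} ended as {response['JobStatus']}")

    # Results are paginated; follow NextToken until it is absent.
    blocks = list(response.get("Blocks", []))
    while response.get("NextToken"):
        response = client.get_document_text_detection(
            JobId=job_id, NextToken=response["NextToken"]
        )
        blocks.extend(response.get("Blocks", []))
    return blocks
```

With real credentials, this would be called as `fetch_textract_blocks(boto3.client("textract"), "my-bucket", "my-file.pdf")`, where the bucket and file names are placeholders. For production workloads, an SNS notification channel is usually preferable to polling.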

PLEASE NOTE THAT:

  1. It is assumed that your client has the necessary IAM permissions to access the different AWS resources required.
  2. Since AWS Textract analyzes PDF files by running asynchronous operations, the current version assumes that you've already created an S3 bucket and that the PDF files are already stored there. If not, please see the boto3 docs to learn how to create a bucket and upload files.
  3. The paragraph_constructor is an ad hoc function for my use case. You may have to adapt it based on the space between lines in your data.
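
To illustrate the kind of line-spacing heuristic that note 3 refers to, here is a minimal sketch. It is not the repository's `paragraph_constructor`. The function name `group_lines_into_paragraphs` and the `max_gap_ratio` threshold are hypothetical, and the sketch assumes a single-column layout. It relies only on fields that Textract returns for each block (`BlockType`, `Text`, `Page`, and `Geometry.BoundingBox` with `Top` and `Height`).

```python
def group_lines_into_paragraphs(blocks, max_gap_ratio=0.6):
    """Group Textract LINE blocks into paragraphs by vertical spacing.

    A new paragraph starts when the page changes or when the gap between
    two consecutive lines exceeds `max_gap_ratio` times the previous
    line's height. The threshold is an assumption; tune it for your data.
    """
    lines = [b for b in blocks if b.get("BlockType") == "LINE"]
    # Read top to bottom within each page (single-column assumption).
    lines.sort(key=lambda b: (b.get("Page", 1), b["Geometry"]["BoundingBox"]["Top"]))

    paragraphs = []
    current = []
    prev = None
    for line in lines:
        box = line["Geometry"]["BoundingBox"]
        if prev is not None:
            prev_box = prev["Geometry"]["BoundingBox"]
            same_page = line.get("Page", 1) == prev.get("Page", 1)
            # Vertical distance from the bottom of the previous line
            # to the top of this one, in page-relative units.
            gap = box["Top"] - (prev_box["Top"] + prev_box["Height"])
            if not same_page or gap > max_gap_ratio * prev_box["Height"]:
                paragraphs.append(" ".join(current))
                current = []
        current.append(line["Text"])
        prev = line
    if current:
        paragraphs.append(" ".join(current))
    return paragraphs
```

Expressing the threshold relative to line height, rather than as a fixed distance, keeps the heuristic roughly independent of font size. Multi-column pages would need an additional horizontal grouping step first.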

UPCOMING FEATURES:

  • Address abstract cases with the paragraph_constructor function.
  • Export data in different formats.
  • AWS CloudFormation template for a serverless architecture that executes the functions when a new object is uploaded to your S3 bucket.

Please feel free to suggest new features or improvements to the current code. <3

Owner
Juan Anzola