AWS Blog post code for running feature-extraction on images using AWS Batch and Cloud Development Kit (CDK).

Overview

Batch processing with AWS Batch and CDK

Welcome

This repository demostrates provisioning the necessary infrastructure for running a job on AWS Batch using Cloud Development Kit (CDK). The AWS Batch job reads images from an S3 bucket, runs inference over image-to-vector computer vision model, and stores the results in DynamoDB. Code can be easily modified to fit other batch job transformations you might want to perform.

This code repository is part of the Deep learning image vector embeddings at scale using AWS Batch and CDK AWS DevOps Blog post.

Pre-requisites

  1. Create and source a Python virtualenv on MacOS and Linux, and install python dependencies:
$ python3 -m venv .env
$ source .env/bin/activate
$ pip install -r requirements.txt
  1. Install the latest version of the AWS CDK CLI:
$ npm i -g aws-cdk

Usage

Current code creates a the AWS Batch infrastructure, S3 Bucket for reading the data from, a DynamoDB table to write te batch operation results. Once the infrastructure is provisioned trough AWS CDK, you need to upload the images you want to process to the created S3 bucket. Once you've done that, go to the created AWS Lambda and submit a job. This will trigger a job execution on AWS Batch and you should see the results in the created DynamoDB table.

To deploy and run the batch inference, follow the following steps:

  1. Make sure you have AWS CDK installed and working, all the dependencies of this project defiend in the requirements.txt file, as well as having an installed and configured Docker in your environment;
  2. Set the CDK_DEPLOY_ACCOUNT ENV variable to the name of the AWS account you want to use (pre-defined with AWS CLI);
  3. Set the CDK_DEPLOY_REGION ENV variable to the name of the region you want to deploy the infra in (e.g. 'us-west-2');
  4. Run cdk deploy in the root of this project and wait for the deployment to finish successfully;
  5. Upload the images you need to proccess to the newly created S3 bucket under a S3 bucket path (e.g. /images). Use this path in the next step;
  6. Go to the created AWS Lambda and execute the lambda function with the following JSON:
{
"Paths": [
    "images"
   ]
}
  1. In the AWS console, go to AWS batch and make sure the jobs are submitted and are running successfully;
  2. Open the created DynamoDB table and validate the results are there;
  3. You can now use a DynamoDB client to read and consume the results;

License

This library is licensed under the MIT-0 License. See the LICENSE file.

Data and a Twitter bot for the EPA's DOCUMERICA (1972-1977) program.

documerica This repository holds JSON(L) artifacts and a few scripts related to managing archival data from the EPA's DOCUMERICA program. Contents: Ma

William Woodruff 2 Oct 27, 2021
Build better AWS infrastructure

Sceptre About Sceptre is a tool to drive AWS CloudFormation. It automates the mundane, repetitive and error-prone tasks, enabling you to concentrate o

sceptre 1.4k Jan 04, 2023
A powerful bot to copy your google drive data to your team drive

⚛️ Clonebot - Heroku version ⚡ CloneBot is a telegram bot that allows you to copy folder/team drive to team drives. One of the main advantage of this

MsGsuite 269 Dec 23, 2022
Ethone-Selfbot - Open Source Discord Self-Bot, written in discord.py

Ethone SB Table of contents Newest open-source Discord SelfBot with useful commands and easy documentation on how to add your own and change the exist

Ethone 3 Jan 08, 2022
Secure Tunnel Manager

Making life easy of those who are in need of OpenSource alternative of AWS Secure Tunnel.

Suyash Chavan 1 Sep 27, 2022
A Discord Server Cloner With Lot Of New Features.

Technologies Screenshots Table of contents About Installation Links Deployed Features Website Score Contribution Need Help? Instagram Discord About A

NotSakshyam 25 Dec 31, 2022
Apps related to Odoo it's calendar features

calendar Apps related to Odoo it's calendar/appointments features: online_appointment_locations: allow setting an online URL per employee online_appoi

Yenthe Van Ginneken 3 Oct 27, 2022
Cleiton Leonel 4 Apr 22, 2022
A frame to create discord bots (for myself) that uses cogs, JSON, activities, and more.

dpy-frame A frame to create discord bots (for myself) that uses cogs, JSON, activities, and more. NOTE: Documentation is incomplete, so please wait un

Apple Discord 1 Nov 06, 2021
Stock Market Insights is a Dashboard that gives the 360 degree view of the particular company stock

fedora-easyfix A collection of self-contained and well-documented issues for newcomers to start contributing with How to setup the local development e

Ganesh N 3 Sep 10, 2021
Probably Overengineered Unimore Booker

POUB Probably Overengineered Unimore Booker A python-powered, actor-based, telegram-facing, timetable-aware booker for unimore (if you know more adjec

Lorenzo Rossi 3 Feb 20, 2022
A working selfbot for discord

React Selfbot Yes, for real ⚠ "Maintained" version: https://github.com/AquaSelfBot/AquaSelfbot ⚠ Why am I making this open source? Because can't stop

3 Jan 25, 2022
Discord music bot using discord.py, slash commands, and yt-dlp.

bop Discord music bot using discord.py, slash commands, and yt-dlp. Features Play music from YouTube videos and playlists Queue system with shuffle Sk

Hizkia Felix 3 Aug 11, 2022
yobot插件,Steam雷达,可自动播报玩家的Steam游戏状态和DOTA2图文战报

Steam_watcher 这是 prcbot/yobot 的自定义插件,可自动播报玩家的Steam游戏状态和DOTA2图文战报 都有些什么功能? 本插件可以在用户绑定后自动推送Steam游戏状态的更新和 Dota2 图文战报,以及提供一些手动查询功能 指令列表 atbot 表示需要@BOT ats

羽波 21 Jun 21, 2022
BingBot - A bot that will automate searches on bing

bingBot A bot that will automate searches on bing. To install this just download

Lukas 2 Jul 28, 2022
A Discord Self bot written in python

WitheredBot A Discord Self bot written in python Requirement Python = 3.9 How to Configure git clone https://github.com/a-a-a-aa/WitheredBot.git cd W

......... 0 Jan 05, 2023
An advanced automatic top.gg dank memer voter that votes automatically for you.

Auto Dank Memer Voter An automatic dank memer voter that sends votes onto top.gg every 12 hours, unless their is captcha. I am working on a captcha de

6 Aug 27, 2022
Decode the Ontario proof of vaccination QR code

Decode the contents of the Ontario Proof of Vaccination (the "Smart Health Card QR Code") Output This is from my QR code, hopefully fully redacted alt

Wesley Ellis 4 Oct 22, 2021
An iCal file to transport you to a new place every day until you die

everydayvirtualvacation An iCal file to transport you to a new place every day until you die The library is closed 😔 😔 including a video of the plac

Jacob Chapman 33 Apr 19, 2022
A simple Python TDLib wrapper

Telegram Forwarder App Description pywtdlib (Python Wrapper TDLib) is a simple synchronous Python wrapper that makes you easy to create new Python Tel

Álvaro Fernández 2 Jan 04, 2023