This is a repository for the Duke University Cloud Computing course project on Serveless Data Engineering Pipeline. For this project, I recreated the below pipeline.

Overview

AWS Data Engineering Pipeline

This is a repository for the Duke University Cloud Computing course project on Serverless Data Engineering Pipeline. For this project, I recreated the below pipeline in iCloud9 (reference: https://github.com/noahgift/awslambda):

drawing

Below are the steps of how to build this pipeline in AWS:

1️⃣ Create a new iCloud9 environment dedicated to this project.

🤔 Need a refresher? Please check this repo.

⚠️ Make sure to use name as your unique id for your items in the fang table.

2️⃣ Create a fang table in DynamoDB and SQS queue.

You can check how to do it here.

3️⃣ Build producer Lambda Function

  1. In iCloud9, initialize a serverless application with SAM template:

    sam init 

Inputs: 1, 2, 4, "producer"

  1. Set virtual environment and source it:

    # I called my virtual environment "comprehendProducer"
    python3 -m venv ~/.comprehendProducer
    source ~/.comprehendProducer/bin/activate
  2. Add the code for your application to app.py

  3. Add relevant packages used in your app to requirements.txt file

  4. Install requirements

     cd hello_world/
     pip install -r requirements.txt 
     cd .. 
  5. Create a repository (producer) in Elastic Container Registry (ECR) and copy its URI

  6. Build and deploy your serverless application:

    sam build 
    sam deploy --guided

    When prompted to input URI, paste the URI for the producer repository that you've just created.

  7. Create IAM Role granting Administrator Access to the Producer Lambda function.

    🤔 Not sure how to create IAM Role? Check out this video (17 min ).

  8. Add the execution role that you created to the Producer Lambda function.

    In case you forgot how to do it:

    In AWS console: Lambda ➡️ click on producer function ➡️ configuration ➡️ permissions ➡️ Edit ➡️ Select the role under Existing role.

  9. You are all set with the producer function! Now deactivate virtual environment:

    deactivate 
    cd .. 
    

4️⃣ Create an S3 bucket and note its name

5️⃣ Build consumer Lambda Function

Repeat steps in 3️⃣ .

⚠️ In #3 when you add the code for a consumer app to app.py, make sure to replace bucket="fangsentiment" with the name of your S3 bucket.

6️⃣ Add triggers to Lambda Functions

🤔 Not sure how to do it? Check out this video (start times are noted below):

Producer Lambda Function: CloudWatchEvent(30 min)

Consumer Lambda Function: SQS (42 min)

7️⃣ If all goes well, you will see sentiment results in your S3 bucket:

s3

💡 Tip: If you've already deployed your Lambda function but need to edit your application, you can make the necessary edits to your app and build and deploy the app again:

sam build && sam deploy 

💡 Tip: If you don't have space left on disk, you may want to remove a few docker containers that you don't use.

#list containers 
docker image ls 
# remove a container 
docker image rm <containerId>
API RestFull web de pontos turisticos de certa região

##RESTful Web API para exposição de pontos turísticos de uma região## Propor um novo ponto turístico Moderação dos pontos turísticos cadastrados Lista

Lucas Silva 2 Jan 28, 2022
Signs API calls to SberCloud.Advanced with AK/SK

sbercloud-api-aksk Signs API calls to SberCloud.Advanced with AK/SK This script is a courtesy of @sadpdtchr Description Sometimes there is a need to m

Peter Predtechensky 1 Nov 30, 2021
a translator bot for discord

TranslatorBOT it is a simple and powerful discord bot, it been used for translating includes more than 100 language, it has a lot of integrated comman

Mear. 2 Feb 03, 2022
• Create Your Own YouTube Info Api.

youtube_data_api • Create Your Own YouTube Info Api. Deploy How to Use https://{ Heroku App Name }.herokuapp.com/api?link={YouTube link} In local Host

lokaman chendekar 12 Oct 02, 2022
arweave-nft-uploader is a Python tool to improve the experience of uploading NFTs to the Arweave storage for use with the Metaplex Candy Machine.

arweave-nft-uploader arweave-nft-uploader is a Python tool to improve the experience of uploading NFTs to the Arweave storage for use with the Metaple

0xEnrico 84 Dec 26, 2022
Auto-Approved-Bot - Auto Approved Invaite Link Request Telegram Bot

🤖 𝗔𝘂𝘁𝗼-𝗔𝗽𝗽𝗿𝗼𝘃𝗲-𝗕𝗼𝘁 🤖 ℹ️ 𝗨𝘀𝗲𝗴𝗲 ℹ️ When a join request invita

Muhammed 32 Dec 18, 2022
This repository contains unofficial code reproducing Agent57

Agent57 This repository contains unofficial code reproducing Agent57, which outp

19 Dec 29, 2022
Um simples bot escrito em Python usando a lib pyTelegramBotAPI

Telegram Bot Python Um simples bot escrito em Python usando a lib pyTelegramBotAPI Instalação Windows: Download do Python 3 Aqui Download do ZIP do Có

Sr_Yuu 1 May 07, 2022
A python script to acquire multiple aws ec2 instances in a forensically sound-ish way

acquire_ec2.py The script acquire_ec2.py is used to automatically acquire AWS EC2 instances. The script needs to be run on an EC2 instance in the same

Deutsche Telekom Security GmbH 31 Sep 10, 2022
May or may not be work🚶

AnyDLBot There are multiple things I can do: 👉 All Supported Video Formats of https://rg3.github.io/youtube-dl/supportedsites.html 👉 Upload as file

Arun 2 Nov 16, 2021
Bezlik Year Calendar Planner

Bezlik Year Calendar Planner Scribus script for creating year planners on one page A1 paper format. Script is based on Year-Calendar-Script-for-Scribu

Bohdan Bobrowski 2 May 24, 2022
A discord bot for checking what linked profiles a user has to their Ubisoft account

ubisoft_discord_profiles A Discord bot for checking what linked profiles a user has to their Ubisoft account. This can be setup using an enviromental

Andrei 1 Dec 17, 2021
Python client for using Prefect Cloud with Saturn Cloud

prefect-saturn prefect-saturn is a Python package that makes it easy to run Prefect Cloud flows on a Dask cluster with Saturn Cloud. For a detailed tu

Saturn Cloud 15 Dec 07, 2022
Twitter bot that turns comment chains into ace attorney scenes. Inspired by and using https://github.com/micah5/ace-attorney-reddit-bot

Ace Attorney twitter Bot Twitter bot that turns comment chains into ace attorney scenes. Inspired by and using https://github.com/micah5/ace-attorney-

Luis Mayo Valbuena 542 Dec 17, 2022
Simple script to extract useful informations from the combo BloodHound + Neo4j

bloodhound-quickwin Simple script to extract useful informations from the combo BloodHound + Neo4j. Can help to choose a target. Prerequisites python3

140 Dec 21, 2022
Stream Telegram files to web

Telegram File Stream Bot A Telegram bot to stream files to web Demo Bot » Report a Bug | Request Feature Table of Contents About this Bot Original Rep

Wrench 572 Jan 09, 2023
An advanced Filter Bot with nearly unlimitted filters!

Unlimited Filter Bot ㅤㅤㅤㅤㅤㅤㅤ ㅤㅤㅤㅤㅤㅤㅤ An advanced Filter Bot with nearly unlimitted filters! Features Nearly unlimited filters Supports all type of fil

TroJanzHEX 445 Jan 03, 2023
A community made discord bot coded in Python and running on AWS.

Pogbot Project Open Group Discord This is an open source community ran project. Join the discord for more information on how to participate. Coded in

Project Open Group 2 Jul 27, 2022
Wetterdienst - Open weather data for humans

We are a group of like-minded people trying to make access to weather data in Python feel like a warm summer breeze, similar to other projects like rdwd for the R language, which originally drew our

226 Jan 04, 2023
A Django-style ORM idea for manipulating Google Datastore entities

No SeiQueLa ORM EM DESENVOLVIMENTO Uma ideia de ORM no estilo do Django para manipular entidades do Google Datastore. Montando seu modelo: from noseiq

Geraldo Castro 16 Nov 01, 2022