This is a repository for the Duke University Cloud Computing course project on Serveless Data Engineering Pipeline. For this project, I recreated the below pipeline.

Overview

AWS Data Engineering Pipeline

This is a repository for the Duke University Cloud Computing course project on Serverless Data Engineering Pipeline. For this project, I recreated the below pipeline in iCloud9 (reference: https://github.com/noahgift/awslambda):

drawing

Below are the steps of how to build this pipeline in AWS:

1️⃣ Create a new iCloud9 environment dedicated to this project.

🤔 Need a refresher? Please check this repo.

⚠️ Make sure to use name as your unique id for your items in the fang table.

2️⃣ Create a fang table in DynamoDB and SQS queue.

You can check how to do it here.

3️⃣ Build producer Lambda Function

  1. In iCloud9, initialize a serverless application with SAM template:

    sam init 

Inputs: 1, 2, 4, "producer"

  1. Set virtual environment and source it:

    # I called my virtual environment "comprehendProducer"
    python3 -m venv ~/.comprehendProducer
    source ~/.comprehendProducer/bin/activate
  2. Add the code for your application to app.py

  3. Add relevant packages used in your app to requirements.txt file

  4. Install requirements

     cd hello_world/
     pip install -r requirements.txt 
     cd .. 
  5. Create a repository (producer) in Elastic Container Registry (ECR) and copy its URI

  6. Build and deploy your serverless application:

    sam build 
    sam deploy --guided

    When prompted to input URI, paste the URI for the producer repository that you've just created.

  7. Create IAM Role granting Administrator Access to the Producer Lambda function.

    🤔 Not sure how to create IAM Role? Check out this video (17 min ).

  8. Add the execution role that you created to the Producer Lambda function.

    In case you forgot how to do it:

    In AWS console: Lambda ➡️ click on producer function ➡️ configuration ➡️ permissions ➡️ Edit ➡️ Select the role under Existing role.

  9. You are all set with the producer function! Now deactivate virtual environment:

    deactivate 
    cd .. 
    

4️⃣ Create an S3 bucket and note its name

5️⃣ Build consumer Lambda Function

Repeat steps in 3️⃣ .

⚠️ In #3 when you add the code for a consumer app to app.py, make sure to replace bucket="fangsentiment" with the name of your S3 bucket.

6️⃣ Add triggers to Lambda Functions

🤔 Not sure how to do it? Check out this video (start times are noted below):

Producer Lambda Function: CloudWatchEvent(30 min)

Consumer Lambda Function: SQS (42 min)

7️⃣ If all goes well, you will see sentiment results in your S3 bucket:

s3

💡 Tip: If you've already deployed your Lambda function but need to edit your application, you can make the necessary edits to your app and build and deploy the app again:

sam build && sam deploy 

💡 Tip: If you don't have space left on disk, you may want to remove a few docker containers that you don't use.

#list containers 
docker image ls 
# remove a container 
docker image rm <containerId>
Algofi Python SDK is useful for developers who want to programatically interact with the Algofi lending protocol

algofi-py-sdk Algofi Python SDK Documentation https://algofi-py-sdk.readthedocs.

Algofi 41 Dec 15, 2022
Simple Craigslist wrapper

python-craigslist A simple Craigslist wrapper. License: MIT-Zero. Disclaimer I don't work for or have any affiliation with Craigslist. This module was

Julio M. Alegria 370 Dec 22, 2022
A Telegram bot for Download songs in mp3 format from YouTube and Extract lyrics from Genius.com ❤️

MeudsaMusic A Telegram bot for Download songs in mp3 format from YouTube and Extract lyrics from Genius.com ❤️ Commands Reach @MedusaMusic on Telegram

Bibee 14 Oct 06, 2022
SOCMINT tool to get personal infos from an Instagram account via analysis of its followers and/or following

S T E R R A 🔭 A SOCMINT tool to get infos from an Instagram acc via its Followers / Following Allows you to analyse someone's followers, following, a

aet 316 Dec 28, 2022
Demonstrate how GitHub OIDC token getting should be included in boto3

boto3 should add direct support for AssumeRoleWithWebIdentity for GitHub Actions There is a aws-actions/configure-aws-credentials action that will get

Ben Kehoe 11 Aug 29, 2022
Simple, yet effective moderator bot for telegram. With reports, logs, profanity filter and more :3

👹 Samurai Telegram Bot Simple, yet effective moderator bot for telegram. With reports, logs, profanity filter and more :3 Description Personal bot, m

Abraham Tugalov 106 Dec 13, 2022
Experimental bridges between Telegram calls and other platforms.

Bridges by Calls Music Experimental bridges between Telegram calls and other platforms. Current bridges Bridge 1 (YouTube, Twitch, Facebook, etc...) B

Calls Music 14 Oct 08, 2022
Drop-in Replacement of pychallonge

pychal Pychal is a drop-in replacement of pychallonge with some extra features and support for new Python versions. Pychal provides python bindings fo

ZED 29 Nov 28, 2022
A simple anti-ghostping python bot made using diskord.

Anti Ghostping A simple Anti-Ghostping python bot made with ❤ using Diskord Requirements No one will use this but, all you need for this bot is: Pytho

RyZe 2 Sep 12, 2022
discord.xp Bot, counts XP for members

discord.xp Bot, counts XP for members. How to setup and run? You must have an mysql database Download libs from the requirements.txt file Configurize

irwing 4 Feb 05, 2022
Simple VK API wrapper for Python

VK Admier: documentation VK Admier is simple VK API wrapper for community bot development. Authorization You should create bot object from Client clas

Egor Light 2 Nov 10, 2022
Ulaavi for nuke, helps to keep our stocl elements organised.

Ulaavi Ulaavi for nuke, helps to keep our stock elements organised. Installation Downlaod ffmpeg from ffmpeg.org linux : https://johnvansickle.com/ffm

Arun Subramaniyam 17 Aug 24, 2022
Notion API Database Python Implementation

Python Notion Database Notion API Database Python Implementation created only by database from the official Notion API. Installing / Getting started p

minwook 78 Dec 19, 2022
A Python script for rendering glTF files with V-Ray App SDK

V-Ray glTF viewer Overview The V-Ray glTF viewer is a set of Python scripts for the V-Ray App SDK that allow the parsing and rendering of glTF (.gltf

Chaos 24 Dec 05, 2022
vk.com API python wrapper

Python vk.com API wrapper This is a vk.com (the largest Russian social network) python API wrapper. The goal is to support all API methods (current an

Dmitry Voronin 371 Dec 29, 2022
Telegram Group Chat Statistics With Python

Telegram Group Chat Statistics How to Run First add PYTHONPATH in repository root directory enviroment variable by running: export PYTHONPATH=${PWD}

Sina Nazem 3 Apr 18, 2022
Python SCript to scrape members from a selected Telegram group.

A python script to scrape all the members in a telegram group anad save in a CSV file. REGESTRING Go to this link https://core.telegram.org/api/obtain

Gurjeet Singh 7 Dec 01, 2022
Github repository started notify 💕

Github repository started notify 💕

4 Aug 06, 2022
Simple tool to gather domains from crt.sh using the organization name

Domain Collector: _ _ ___ _ _ _ __| | ___ _ __ ___ __ _(_)_ __ / __\___ | |

Cyber Guy 63 Dec 24, 2022
A discord bot that send SMS spam!

bruh-bot send spam sms! send spam with email! it sends you spam via sms and Email using two tools, quack and impulse! if you have some problem contact

pai 32 Dec 25, 2022