This is a repository for the Duke University Cloud Computing course project on Serveless Data Engineering Pipeline. For this project, I recreated the below pipeline.

Overview

AWS Data Engineering Pipeline

This is a repository for the Duke University Cloud Computing course project on Serverless Data Engineering Pipeline. For this project, I recreated the below pipeline in iCloud9 (reference: https://github.com/noahgift/awslambda):

drawing

Below are the steps of how to build this pipeline in AWS:

1️⃣ Create a new iCloud9 environment dedicated to this project.

🤔 Need a refresher? Please check this repo.

⚠️ Make sure to use name as your unique id for your items in the fang table.

2️⃣ Create a fang table in DynamoDB and SQS queue.

You can check how to do it here.

3️⃣ Build producer Lambda Function

  1. In iCloud9, initialize a serverless application with SAM template:

    sam init 

Inputs: 1, 2, 4, "producer"

  1. Set virtual environment and source it:

    # I called my virtual environment "comprehendProducer"
    python3 -m venv ~/.comprehendProducer
    source ~/.comprehendProducer/bin/activate
  2. Add the code for your application to app.py

  3. Add relevant packages used in your app to requirements.txt file

  4. Install requirements

     cd hello_world/
     pip install -r requirements.txt 
     cd .. 
  5. Create a repository (producer) in Elastic Container Registry (ECR) and copy its URI

  6. Build and deploy your serverless application:

    sam build 
    sam deploy --guided

    When prompted to input URI, paste the URI for the producer repository that you've just created.

  7. Create IAM Role granting Administrator Access to the Producer Lambda function.

    🤔 Not sure how to create IAM Role? Check out this video (17 min ).

  8. Add the execution role that you created to the Producer Lambda function.

    In case you forgot how to do it:

    In AWS console: Lambda ➡️ click on producer function ➡️ configuration ➡️ permissions ➡️ Edit ➡️ Select the role under Existing role.

  9. You are all set with the producer function! Now deactivate virtual environment:

    deactivate 
    cd .. 
    

4️⃣ Create an S3 bucket and note its name

5️⃣ Build consumer Lambda Function

Repeat steps in 3️⃣ .

⚠️ In #3 when you add the code for a consumer app to app.py, make sure to replace bucket="fangsentiment" with the name of your S3 bucket.

6️⃣ Add triggers to Lambda Functions

🤔 Not sure how to do it? Check out this video (start times are noted below):

Producer Lambda Function: CloudWatchEvent(30 min)

Consumer Lambda Function: SQS (42 min)

7️⃣ If all goes well, you will see sentiment results in your S3 bucket:

s3

💡 Tip: If you've already deployed your Lambda function but need to edit your application, you can make the necessary edits to your app and build and deploy the app again:

sam build && sam deploy 

💡 Tip: If you don't have space left on disk, you may want to remove a few docker containers that you don't use.

#list containers 
docker image ls 
# remove a container 
docker image rm <containerId>
Grape - A webbrowser with its own Search Engine

Grape 🔎 A Web Browser made entirely in python. Search Engine 🔎 Installation: F

Grape 2 Sep 06, 2022
A pyrogram simple bot for Educational purpose.

A pyrogram simple bot for Educational purpose. To Learn More check at @PyrogramBot or on Documentation Mandatory variables API_ID - Get It From my.tel

SpamShield 10 Dec 06, 2022
Track live sentiment for stocks from Reddit and Twitter and identify growing stocks

Market Sentiment About This repository can mainly be used for two things. a. Tracking the live sentiment of stocks from Reddit and Twitter b. Tracking

Market Sentiment 345 Dec 17, 2022
Twitch Points Miner for multiple accounts with Discord logging

Twitch Points Miner for multiple accounts with Discord logging Creator of the Twitch Miner -- PLEASE NOTE THIS IS PROBABLY BANNABLE -- Made on python

8 Apr 27, 2022
Code to help me strengthen my bot army

discord-bot-manager an api to help you manage your other bots auth lazy: using the browser dev tools, capture a post call and view the Authorization h

Riley Snyder 2 Mar 18, 2022
A simple bot discord in PY with moderation controls

Voila un bot discord en py avec les commandes simples de modération tout simplement faut changer les lignes 70 vous mettez votre token de votre bot 53

Ethan 1 Nov 20, 2021
An almost dependency-less, synchronous Discord gateway library meant for my personal use

An almost dependency-less, synchronous Discord gateway library meant for my personal use.

h0nda 4 Feb 05, 2022
The Python SDK for the BattleshAPI game

BattleshAPy The Python SDK for the BattleshAPI game Installation This library will be eventually going on PyPI, but until then, simply clone or downlo

Christopher 0 Apr 18, 2022
A collection of tools for managing Jira issues for the RHODS project

RHODS-Jira-Tools A collection of tools for managing Jira issues for the RHODS project move_to_qa.py This script handles transitioning a given Jira iss

Alex Corvin 1 Sep 20, 2022
It's My Bot, For my group in telegram :)

Get Start USage This robot is written in Python language for devdood Group in Telegram ... You can easily edit and use this source Edit and Run You ne

Mohsen farzadmanesh 7 Sep 24, 2022
AWS-serverless-starter - AWS Lambda serverless stack via Serverless framework

Serverless app via AWS Lambda, ApiGateway and Serverless framework Configuration

Bəxtiyar 3 Feb 02, 2022
Deleting someone else's Instagram account, repeat until the target account is blocked.

Program Features 📌 Instagram report V4. 📌 Coded with the latest version of Python. 📌 Has automatic scheduling. 📌 Full account report. 📌 Report a

hack4lx 16 Oct 25, 2022
Python script for download course from platzi.com

Platzi Downloader Tool Esta es una pequeña herramienta que hace mucho y que te ahorra una gran cantidad de trabajo a la hora de descargar cursos de Pl

Devil64-Dev 21 Sep 22, 2022
Projeto de teste para acesso a API SWAPI.

SwapiTest Projeto de teste para acesso a API Swapi com informações sobre Star Wars. Como rodar o programa Foi utilizado o pipenv, então basta clonar o

Gabriel de Souza Alves 1 Nov 23, 2021
Código para trabalho com o dataset Wine em Python

Um perceptron multicamadas (MLP) é uma rede neural artificial feedforward que gera um conjunto de saídas a partir de um conjunto de entradas. Um MLP é

Hemili Beatriz 1 Jan 08, 2022
pylunasvg - Python bindings for lunasvg

pylunasvg - Python bindings for lunasvg Pylunasvg is a simple wrapper around lunasvg that uses pybind11 to create python bindings. All public API of t

Eren 6 Jan 05, 2023
A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.

Light Gradient Boosting Machine LightGBM is a gradient boosting framework that uses tree based learning algorithms. It is designed to be distributed a

Microsoft 14.5k Jan 08, 2023
Simple Discord bot which logs several events in your server

logging-bot Simple Discord bot which logs several events in your server, including: Message Edits Message Deletes Role Adds Role Removes Member joins

1 Feb 14, 2022
A Simple Telegram Bot That Can Generate Strong Password With Many Features Written In Python Using Pyrogram

Password-Generator-Bot A Simple Telegram Bot That Can Generate Strong Password With Many Features Written In Python Using Pyrogram Features Random Pas

Muhammed Fazin 17 Dec 23, 2022
The scope of this project will be to build a data ware house on Google Cloud Platform that will help answer common business questions as well as powering dashboards

The scope of this project will be to build a data ware house on Google Cloud Platform that will help answer common business questions as well as powering dashboards.

Shweta_kumawat 2 Jan 20, 2022