This is a practice on Airflow, which is building virtual env, installing Airflow and constructing data pipeline (DAGs)

Overview

airflow-test

This is a practice on Airflow, which is

First of all, build virtual env using virtualbox

This is optional but highly recommended

  • It is very important to make sure that development env is independent and has no conflict with other env setting

Next, make python virtual env on proper directory

python3 -m venv sandbox
source ~/sandbox/bin/activate

Let start to install Airflow and initial setting

Install Airflow

pip3 install apache-airflow==2.1.0 \ ## airflow version depends on when you install and constraint 
  --constraint 'git address' ## constraint is ???. It is optional. refer to https://gist.github.com/marclamberti

Airflow Inital Setting

airflow db init ## airflow metadata initializing
airflow users create -u admin -p admin -f jaeyoung -l jang -r Admin -e [email protected] ## create admin user

airflow webserver ## launch web server. after that we can access airflow web server through localhost:8080
airflow scheduler ## activate airfow scheduler to run DAGs

Construct data pipeline with Airflow

One example of data pipeline (one DAGs) - let follow below steps and make data pipeline

  • creating table -> is-api-available -> extending -> processing-use ->
  • what is it? when new user comes in, sense that and process and store user info to destination (?)
  1. create table
  • make python script user_processing.py in dags folder
  • let create table using sqlite operator in user_processing.py
from airflow.models import DAG
from airflow.providers.sqlite.operators.sqlite import SqliteOperator

from datetime import datetime

default_args = {
    'start_date': datetime(2020, 1, 1)
}

with DAG('user_processing', schedule_interval='@daily', default_args=default_args, catchup=False) as dag:
    # Define tasks/operators

    creating_talbe = SqliteOperator(
        task_id='creating_table',
        sqlite_conn_id='db_sqlite',
        sql='''
            CREATE TABLE users (
                firstname TEXT NOT NULL,
                lastname TEXT NOT NULL,
                country TEXT NOT NULL,
                username TEXT NOT NULL,
                password TEXT NOT NULL,
                email TEXT NOT NULL PRIMARY KEY
            );
            '''
    )
  • set sqlite connection on Airflow Web UI
pip install apache-airflow-providers-sqlite ## install airflow provider of sqlite. refer airflow doc. may already be installed from intial installing Airflow..
# next, make a new sqltie connection named 'db_sqlite' in Airflow
  • after that, do test airflow tasks!

airflow tasks test user_processing creating_table 2020-01-01

  • one more things to do just for check again
sqlite3 airflow.db ## connect to sqlite3 in airflow.db
sqlite>.tables; ## see the tables in airflow.dbb
sqlite>select * form users; ## test query
Owner
Jaeyoung
Jaeyoung
Minutaria is a basic educational Python timer used to learn python and software testing libraries.

minutaria minutaria is a basic educational Python timer. The project is educational, it aims to teach myself programming, python programming, python's

1 Jul 16, 2021
Python calculator made with tkinter package

Python-Calculator Python calculator made with tkinter package. works both on Visual Studio Code Or Any Other Ide Or You Just Copy paste The Same Thing

Pro_Gamer_711 1 Nov 11, 2021
Entitlement AND Hardened Runtime Check

Python3 script for macOS to recursively check /Applications and also check /usr/local/bin, /usr/bin, and /usr/sbin for binaries with problematic/interesting entitlements. Also checks for hardened run

Cedric Owens 79 Nov 16, 2022
In the works, creating a new Chess Board and way to Play...

sWJz4Chess date started on github.com 11-13-2021 In the works, creating a new Chess Board and way to Play... starting to write this in Pygame, any ind

Shawn 2 Nov 18, 2021
Test for using pyIIIFpres for rara magnetica project

raramagnetica_pyIIIFpres Test for using pyIIIFpres for rara magnetica project. This test show how to use pyIIIFpres for creating mannifest compliant t

Giacomo Marchioro 1 Dec 03, 2021
A collection of online resources to help you on your Tech journey.

Everything Tech Resources & Projects About The Project Coming from an engineering background and looking to up skill yourself on a new field can be di

Mohamed A 396 Dec 31, 2022
Code needed for hybrid land cover change analysis for NASA IDS project

Documentation for the NASA IDS change analysis Poley 10/21/2021 Required python packages: whitebox numpy rasterio rasterio.mask os glob math itertools

Andrew Poley 2 Nov 12, 2021
Script to use SysWhispers2 direct system calls from Cobalt Strike BOFs

SysWhispers2BOF Script to use SysWhispers2 direct system calls from Cobalt Strike BOFs. Introduction This script was initially created to fix specific

FalconForce 101 Dec 20, 2022
This is the key combo trainer for League of Legends and Dota 2 players.

This is the key combo trainer for League of Legends and Dota 2 players. Place the mouse cursor on the blue point and press the key combo from the upper-left side of the screen.

Ilya Shpigor 1 Jan 31, 2022
oracle arm registration script.

oracle_arm oracle arm registration script. 乌龟壳刷ARM脚本 本脚本优点 简单,主机配置好oci,然后下载main.tf即可,不用自己获取各种参数。 运行环境配置 本简单脚本使用python3编写,请自行配置好python3环境和requests库。(高版

test1234455 419 Jan 01, 2023
Project for viewing the cheapest flight deals from Netherlands to other countries.

Flight_Deals_AMS Project for viewing the cheapest flight deals from Netherlands to other countries.

2 Dec 17, 2022
PythonKafkaCompose is an upgrade of the amazing work done in liveMaps

PythonKafkaCompose is an upgrade of the amazing work done in liveMaps It is a simple project composed by: an instance of Kafka a Py

5 Jun 19, 2022
API moment - LussovAPI

LussovAPI TL;DR: py API container, pip install -r requirements.txt, example, main configuration Long version: Install Dependancies Download file requi

William Pedersen 1 Nov 30, 2021
Make dbt docs and Apache Superset talk to one another

dbt-superset-lineage Make dbt docs and Apache Superset talk to one another Why do I need something like this? Odds are rather high that you use dbt to

Slido 81 Jan 06, 2023
Delayed iteration for polling and retries.

Does Python need yet another retry / poll library? It needs at least one that isn't coupled to decorators and functions. Decorators prevent the caller

A. Coady 22 Dec 29, 2022
Drop-down terminal for GNOME

Guake 3 README Introduction Guake is a python based dropdown terminal made for the GNOME desktop environment. Guake's style of window is based on an F

Guake 4.1k Dec 25, 2022
Get a link to the web version of a git-tracked file or directory

githyperlink Get a link to the web version of a git-tracked file or directory. Applies to GitHub and GitLab remotes (and maybe others but those are no

Tomas Fiers 2 Nov 08, 2022
Make your functions return something meaningful, typed, and safe!

Make your functions return something meaningful, typed, and safe! Features Brings functional programming to Python land Provides a bunch of primitives

dry-python 2.5k Jan 03, 2023
PyToQlik is a library that allows you to integrate Qlik Desktop with Jupyter notebooks

PyToQlik is a library that allows you to integrate Qlik Desktop with Jupyter notebooks. With it you can: Open and edit a Qlik app inside a Ju

BIX Tecnologia 16 Sep 09, 2022
Python module for creating the circuit simulation definitions for Elmer FEM

elmer_circuitbuilder Python module for creating the circuit simulation definitions for Elmer FEM. The circuit definitions enable easy setup of coils (

5 Oct 03, 2022