A tutorial presents several practical examples of how to build DAGs in Apache Airflow

Overview

Apache Airflow - Python Brasil 2021

Este tutorial apresenta vários exemplos práticos de como construir DAGs no Apache Airflow.

Background

Apache Airflow é uma das principais ferramentas de orquestração de workflows, onde você define as tarefas como Directed Acyclic Graphs (DAGs). O Airflow permite que você construa pipelines de dados escrevendo apenas códigos Python. Quando os workflows são definidos como código, eles se tornam manuteníveis, versionáveis, testáveis e colaborativos.

Rodando localmente com Pyenv

Você vai precisar de um ambiente virtual com python 3.6+ (recomendamos o 3.9).

Pyenv

Caso não tenha instalado na maquina, você pode usar o pyenv para ter multiplas versoes do python e criar seu ambiente virtual com ele. Siga a documentação oficial para instalar o pyenv na sua máquina:

Instale o Pyhton 3.9:

$ pip install 3.9.7
$ pyenv virtualenv 3.9.7 pybr-airflow
$ pyenv local pybr-airflow

Caso você não tenha o pip instalado, instale ele na sua máquina seguindo o tutorial abaixo:

Instalando o Airflow

Depois do ambiente virtual instalado, você vai precisar do apache-airflow e do apache-airflow-providers-docker instalados. Você pode fazer assim:

$ pip install apache-airflow apache-airflow-providers-docker

Depois você precisa configurar o airflow; para isso siga estes passos:

$ airflow db init
$ airflow users create --username=admin --firstname test --lastname test --role Admin --email [email protected]

Agora você pode rodar o airflow; para isso execute o seguinte comando:

$ airflow webserver -p 8081

Agora acesse a seguinte URL: http://localhost:8081.

Troubleshooting: Airflow não sendo reconhecido

Caso o comando do airflow não tiver sendo reconhecido, verifique se o ~/.local/bin na sua variável de ambiente PATH está configurada corretamente:

PATH=$PATH:~/.local/bin

Você também pode iniciar o Airflow com:

$ python -m airflow

Rodando localmente com Docker Compose

Pré-requisitos

Para rodar localmente é necessário, você atender aos seguintes pré-requisitos:

  • Instalar o Docker Community Edition (CE) na sua máquina (link de instalação aqui). É recomendável que sua máquina tenha ao menos 4GB de RAM livres.
  • Instalar o Docker Compose v1.29.1 ou alguma versão mais nova na sua máquina (link de instalação aqui).

Iniciar o ambiente

Para iniciar o ambiente, basta executar o comando abaixo:

make start-airflow

Destruir o ambiente

Para limpar o ambiente, basta executar o seguinte comando:

make reset-airflow

Owner
Jusbrasil
Jusbrasil
This is Gaurav's IP Project Completed in the year session of 2021-2022.

The Analyser by Gaurav Rayat Why this Project? Today we are continuously hearing about growth in Crime rates and the number of murders executed day by

1 Dec 30, 2021
1. 네이버 카페 댓글을 빨리 다는 기능

naver_autoprogram 기능 설명 네이버 카페 댓글을 빨리 다는 기능 네이버 카페 자동 출석 체크 기능 동작 방식 카페 댓글 기능 기본 동작은 주기적인 스케쥴 동작으로 해당 카페 ID 와 특정 API 주소로 대상이 새글을 작성했는지 체크. 해당 대상이 새글 등

1 Dec 22, 2021
Paprika is a python library that reduces boilerplate. Heavily inspired by Project Lombok.

Image courtesy of Anna Quaglia (Photographer) Paprika Paprika is a python library that reduces boilerplate. It is heavily inspired by Project Lombok.

Rayan Hatout 55 Dec 26, 2022
Python library for converting Python calculations into rendered latex.

Covert art by Joshua Hoiberg handcalcs: Python calculations in Jupyter, as though you wrote them by hand. handcalcs is a library to render Python calc

Connor Ferster 5.1k Jan 07, 2023
pyRTOS is a real-time operating system (RTOS), written in Python.

pyRTOS Introduction pyRTOS is a real-time operating system (RTOS), written in Python. The primary goal of pyRTOS is to provide a pure Python RTOS that

Ben Williams 96 Dec 30, 2022
The parser of a timetable of tennis matches for Flashscore website

FlashscoreParser The parser of a timetable of tennis matches for Flashscore website. The program collects the schedule of tennis matches for two days

Valendovsky 1 Jul 15, 2022
Async Python Circuit Breaker implementation

aiocircuitbreaker This is an async Python implementation of the circuitbreaker library. Installation The project is available on PyPI. Simply run: $ p

5 Sep 05, 2022
A fancy and practical functional tools

Funcy A collection of fancy functional tools focused on practicality. Inspired by clojure, underscore and my own abstractions. Keep reading to get an

Alexander Schepanovski 2.9k Dec 29, 2022
Chess bot can play automatically as white or black on lichess.com, chess.com and any website using drag and drop to move pieces

Chessbot "Why create another chessbot ?" The explanation is simple : I did not find a free bot I liked online : all the bots I saw on internet are par

Dhimas Bagus Prayoga 2 Nov 11, 2021
Users can read others' travel journeys in addition to being able to upload and delete posts detailing their own experiences

Users can read others' travel journeys in addition to being able to upload and delete posts detailing their own experiences! Posts are organized by country and destination within that country.

Christopher Zeas 1 Feb 03, 2022
Improve current data preprocessing for FTM's WOB data to analyze Shell and Dutch Governmental contacts.

We're the hackathon leftovers, but we are Too Good To Go ;-). A repo by Lukas Schubotz and Raymon van Dinter. We aim to improve current data preprocessing for FTM's WOB data to analyze Shell and Dutc

ASReview hackathon for Follow the Money 5 Dec 09, 2021
An assistant to guess your pip dependencies from your code, without using a requirements file.

Pip Sala Bim is an assistant to guess your pip dependencies from your code, without using a requirements file. Pip Sala Bim will tell you which packag

Collage Labs 15 Nov 19, 2022
GDSC UIET KUK 📍 , welcomes you all to this amazing event where you will be introduced to the world of coding 💻 .

GDSC UIET KUK 📍 , welcomes you all to this amazing event where you will be introduced to the world of coding 💻 .

Google Developer Student Club UIET KUK 9 Mar 24, 2022
A programming language that for tech savvy graphic designers

Microsoft Hackathon - PhoTex Idea A programming language that allows tech savvy graphic designers develop scalable vector graphics using plain text co

Joe Furfaro 5 Nov 14, 2021
Pymon is like nodemon but it is for python,

Pymon is like nodemon but it is for python,

Swaraj Puppalwar 2 Jun 11, 2022
An end-to-end Python-based Infrastructure as Code framework for network automation and orchestration.

Nectl An end-to-end Python-based Infrastructure as Code framework for network automation and orchestration. Features Data modelling and validation. Da

Adam Kirchberger 15 Oct 14, 2022
pydock - Docker-based environment manager for Python

pydock - Docker-based environment manager for Python ⚠️ pydock is still in beta mode, and very unstable. It is not recommended for anything serious. p

Alejandro Piad 16 Sep 18, 2021
Old versions of Deadcord that are problematic or used as reference.

⚠️ Unmaintained and broken. We have decided to release the old version of Deadcord before our v1.0 rewrite. (which will be equiped with much more feat

Galaxzy 1 Feb 10, 2022
Domoticz-hyundai-kia - Domoticz Hyundai-Kia plugin for Domoticz home automation system

Domoticz Hyundai-Kia plugin Author: Creasol https://www.creasol.it/domotics For

Creasol 7 Aug 03, 2022
Python script for diving image data to train test and val

dataset-division-to-train-val-test-python python script for dividing image data to train test and val If you have an image dataset in the following st

Muhammad Zeeshan 1 Nov 14, 2022