public-datasets-pipelines

Overview

Cloud-native, data pipeline architecture for onboarding datasets to the Google Cloud Public Datasets Program.

Requirements

To work with this repository, you need Python 3 and Pipenv (see Environment Setup below). Actuating the generated GCP resources with --tf-apply additionally requires Terraform, and deploying DAGs requires access to a Cloud Composer environment in your Google Cloud project.

Environment Setup

We use Pipenv to make environment setup more deterministic and uniform across different machines.

If you haven't done so, install Pipenv by following the official installation instructions. With Pipenv installed, run the following command:

pipenv install --ignore-pipfile --dev

This uses the Pipfile.lock found in the project root and installs all the development dependencies.

Finally, initialize the Airflow database:

pipenv run airflow initdb

Building Data Pipelines

Configuring, generating, and deploying data pipelines in a programmatic, standardized, and scalable way is the main purpose of this repository.

Follow the steps below to build a data pipeline for your dataset:

1. Create a folder hierarchy for your pipeline

mkdir -p datasets/DATASET/PIPELINE

[example]
datasets/covid19_tracking/national_testing_and_outcomes

where DATASET is the dataset name or category that your pipeline belongs to, and PIPELINE is your pipeline's name.

For examples of pipeline names, see the existing pipeline folders in the repo.

Use only underscores and alpha-numeric characters for the names.

2. Write your config (YAML) files

If you created a new dataset directory above, you need to create a datasets/DATASET/dataset.yaml config file. See the YAML Config Reference section below for the dataset.yaml reference.

Create a datasets/DATASET/PIPELINE/pipeline.yaml config file for your pipeline. See the YAML Config Reference section below for the pipeline.yaml reference.

If you'd like to get started faster, you can inspect config files that already exist in the repository and infer the patterns from there.

Every YAML file supports a resources block. To use this, identify which Google Cloud resources need to be provisioned for your pipelines. Some examples are listed below; a sketch of such a block follows the list.

  • BigQuery datasets and tables to store final, customer-facing data
  • GCS bucket to store intermediate, midstream data
  • GCS bucket to store final, downstream, customer-facing data
  • Sometimes, for very large datasets, you might need to provision a Dataflow job
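
For illustration only, a resources block in dataset.yaml covering the first two examples might look roughly like the sketch below. The exact keys (type, dataset_id, name, description) are assumptions inferred from the resource types; treat the existing configs in the repository as the authoritative reference.

resources:
  # BigQuery dataset that will hold the final, customer-facing tables
  - type: bigquery_dataset
    dataset_id: covid19_tracking
    description: "Data from the COVID-19 Tracking Project"
  # GCS bucket for intermediate, midstream files shared by the dataset's pipelines
  - type: storage_bucket
    name: covid19-tracking-project-intermediate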

3. Generate Terraform files and actuate GCP resources

Run the following command from the project root:

$ python scripts/generate_terraform.py \
    --dataset DATASET_DIR_NAME \
    --gcp-project-id GCP_PROJECT_ID \
    --region REGION \
    --bucket-name-prefix UNIQUE_BUCKET_PREFIX \
    [--env] dev \
    [--tf-apply] \
    [--impersonating-acct] IMPERSONATING_SERVICE_ACCT

This generates Terraform files (*.tf) in a _terraform directory inside that dataset. The files contain the infrastructure-as-code that specifies which GCP resources need to be actuated for use by the pipelines. If you passed in the --tf-apply parameter, the command will also run terraform apply to actuate those resources.

The --bucket-name-prefix is used to ensure that the buckets created by different environments and contributors are kept unique, since bucket names must be globally unique across all of GCS. Use hyphenated names (some-prefix-123) instead of names with underscores (some_prefix_123).

In addition, the command above creates a "dot" directory in the project root. The directory name is the value you pass to the --env parameter of the command. If no --env argument was passed, the value defaults to dev (which generates the .dev folder).

Consider this "dot" directory as your own dedicated space for prototyping. The files and variables created in that directory will use an isolated environment. All such directories are gitignored.

As a concrete example, the unit tests use a temporary .test directory as their environment.

4. Generate DAGs and container images

Run the following command from the project root:

$ python scripts/generate_dag.py \
    --dataset DATASET_DIR \
    --pipeline PIPELINE_DIR \
    [--skip-builds] \
    [--env] dev

This generates a Python file that represents the DAG (directed acyclic graph) for the pipeline (the dot directory also gets a copy). To standardize DAG files, the resulting Python code is generated entirely from the contents of the pipeline.yaml config file.

Using KubernetesPodOperator requires having a container image available for use. The command above builds the image and pushes it to Google Container Registry on your behalf. Follow the steps below to prepare your container image; a sketch of how a pipeline might reference the image follows at the end of this step:

  1. Create an _images folder under your dataset folder if it doesn't exist.

  2. Inside the _images folder, create another folder and name it after what the image is expected to do, e.g. process_shapefiles, read_cdf_metadata.

  3. In that subfolder, create a Dockerfile and any scripts you need to process the data. See the samples/container folder for an example. Use the COPY command in your Dockerfile to include your scripts in the image.

The resulting file tree for a dataset that uses two container images may look like

datasets
└── DATASET
    ├── _images
    │   ├── container_a
    │   │   ├── Dockerfile
    │   │   ├── requirements.txt
    │   │   └── script.py
    │   └── container_b
    │       ├── Dockerfile
    │       ├── requirements.txt
    │       └── script.py
    ├── _terraform/
    ├── PIPELINE_A
    ├── PIPELINE_B
    ├── ...
    └── dataset.yaml

Docker images will be built and pushed to GCR by default whenever the command above is run. To skip building and pushing images, use the optional --skip-builds flag.
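
As a rough illustration of how a pipeline refers back to its container image, a KubernetesPodOperator task entry in pipeline.yaml might look like the sketch below. The field names (operator, description, args, image) and the GCR image path are assumptions for illustration only; check the existing pipeline.yaml files in the repository for the exact schema.

dag:
  tasks:
    - operator: "KubernetesPodOperator"
      description: "Run the process_shapefiles image to transform the raw data"
      args:
        task_id: "process_shapefiles"
        name: "process_shapefiles"
        namespace: "default"
        # Image built from datasets/DATASET/_images/process_shapefiles/Dockerfile
        # and pushed to Google Container Registry by scripts/generate_dag.py
        image: "gcr.io/GCP_PROJECT_ID/process_shapefiles"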

5. Declare and set your pipeline variables

Running the command in the previous step will parse your pipeline config and inform you about the templated variables that need to be set for your pipeline to run.

All variables used by a dataset must have their values set in

  [.dev|.test]/datasets/{DATASET}/{DATASET}_variables.json

Nested values in Airflow variables are accessed using JSON dot notation. For example, if you're using the following variables in your pipeline config:

  • {{ var.json.shared.composer_bucket }}
  • {{ var.json.parent.nested }}
  • {{ var.json.parent.another_nested }}

then your variables JSON file should look like this:

{
  "shared": {
    "composer_bucket": "us-east4-test-pipelines-abcde1234-bucket"
  },
  "parent": {
    "nested": "some value",
    "another_nested": "another value"
  }
}

6. Deploy the DAGs and variables

Deploy the DAG and the variables to your own Cloud Composer environment using the command below:

$ python scripts/deploy_dag.py \
  --dataset DATASET \
  --composer-env CLOUD_COMPOSER_ENVIRONMENT_NAME \
  --composer-bucket CLOUD_COMPOSER_BUCKET \
  --composer-region CLOUD_COMPOSER_REGION \
  --env ENV

Testing

Run the unit tests from the project root as follows:

$ pipenv run python -m pytest -v

YAML Config Reference

Every dataset and pipeline folder must contain a dataset.yaml and a pipeline.yaml configuration file, respectively. Use the existing configs in the repository as references for the supported fields.
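
As a hedged outline only (the top-level keys below are inferred from the steps above and from existing configs, not a formal spec), the two files are typically organized along these lines:

# datasets/DATASET/dataset.yaml
dataset:
  name: DATASET                     # matches the dataset folder name
  friendly_name: "Human-readable dataset name"
  description: "What the data contains and where it comes from"
resources:
  # GCP resources shared by all pipelines in the dataset (see step 2)
  - type: bigquery_dataset
    dataset_id: DATASET
---
# datasets/DATASET/PIPELINE/pipeline.yaml
resources:
  # Resources specific to this pipeline, e.g. the destination BigQuery table
  - type: bigquery_table
    table_id: PIPELINE
dag:
  initialize:
    dag_id: PIPELINE                # DAG-level settings such as schedule and default args
  tasks:
    - operator: "KubernetesPodOperator"   # one entry per Airflow task (see step 4)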

Best Practices

  • When running scripts/generate_terraform.py, the --bucket-name-prefix argument helps prevent GCS bucket name collisions, since bucket names must be globally unique. Use hyphens rather than underscores for the prefix, and make it as unique as possible and specific to your own environment or use case.

  • When naming BigQuery columns, always use snake_case and lowercase.

  • When specifying BigQuery schemas, be explicit and always include name, type, and mode for every column. Derive column descriptions from the data source's definitions when available (see the schema sketch at the end of this section).

  • When provisioning resources for pipelines, a good rule-of-thumb is one bucket per dataset, where intermediate data used by various pipelines (under that dataset) are stored in distinct paths under the same bucket. For example:

    gs://covid19-tracking-project-intermediate
        /dev
            /preprocessed_tests_and_outcomes
            /preprocessed_vaccinations
        /staging
            /national_tests_and_outcomes
            /state_tests_and_outcomes
            /state_vaccinations
        /prod
            /national_tests_and_outcomes
            /state_tests_and_outcomes
            /state_vaccinations
    
    

    The "one bucket per dataset" rule prevents us from creating too many buckets for too many purposes. This also helps in discoverability and organization as we scale to thousands of datasets and pipelines.

    Quick note: If the data conveniently fits in memory and the transforms are close-to-trivial and computationally cheap, you may skip storing midstream data. Just apply the transformations in one go and store the resulting data in its final destination.
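
To make the schema best practice concrete, here is a sketch of explicit column definitions with name, type, mode, and description. The column names and the surrounding schema_fields key are hypothetical, shown only to illustrate the convention; follow whatever key the existing pipeline.yaml files use.

schema_fields:
  - name: "state"
    type: "STRING"
    mode: "REQUIRED"
    description: "Two-letter state abbreviation, as defined by the data source"
  - name: "date"
    type: "DATE"
    mode: "NULLABLE"
    description: "Date the values were reported"
  - name: "tests_total"
    type: "INTEGER"
    mode: "NULLABLE"
    description: "Cumulative number of tests reported by the source"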

Issues and Pull Requests
  • Feat: Onboard New york taxi trips dataset

    Description

    dataset: new_york_taxi_trips
    pipelines: tlc_green_trips, tlc_yellow_trips

    Checklist

    Note: If an item applies to you, all of its sub-items must be fulfilled

    • [x] (Required) This pull request is appropriately labeled
    • [x] Please merge this pull request after it's approved
    • [x] I'm adding or editing a dataset
      • [ ] The Google Cloud Datasets team is aware of the proposed dataset
      • [ ] I put all my code inside datasets/new_york_taxi_trips and nothing outside of that directory
    opened by nlarge-google 11
  • feature: Initial implementation for austin_311.311_service_requests

    "Pipeline for austin_311.311_Service_Requests"

    Description

    v2 architecture implementation of 311_service_requests in Austin, TX. This implements the first version of the CSV transform Python script.

    Based on #

    Note: It's recommended to open an issue first for context and discussion.

    Checklist

    Note: Delete items below that aren't applicable to your pull request.

    • [ ] Please merge this PR for me once it is approved.
    • [ ] If this PR adds or edits a feature, I have updated the README accordingly.
    • [ ] If this PR adds or edits a dataset or pipeline, it was reviewed and approved by the Google Cloud Public Datasets team beforehand.
    • [ ] If this PR adds or edits a dataset or pipeline, I put all my code inside datasets/<YOUR-DATASET> and nothing outside of that directory.
    • [ ] If this PR adds or edits a dataset or pipeline that I'm responsible for maintaining, my GitHub username is in the CONTRIBUTORS file.
    • [ ] This PR is appropriately labeled.
    cla: yes 
    opened by nlarge-google 11
  • feat: Onboard NOAA

    Checklist

    Note: Delete items below that aren't applicable to your pull request.

    • [x] Please merge this PR for me once it is approved.
    • [x] If this PR adds or edits a dataset or pipeline, it was reviewed and approved by the Google Cloud Public Datasets team beforehand.
    • [x] If this PR adds or edits a dataset or pipeline, I put all my code inside datasets/noaa and nothing outside of that directory.
    • [x] This PR is appropriately labeled.
    data onboarding cla: yes 
    opened by nlarge-google 7
  • feat: Onboard EPA historical air quality dataset

    Description

    Included: Annual summaries, CO Daily Summary, CO Hourly Summary, HAP Daily Summary, HAP Hourly Summary, Lead Daily Summary, NO2 Daily Summary, NO2 Hourly Summary, NONOxNOy Daily Summary, NONOxNOy Hourly Summary, Ozone Daily Summary, Ozone Hourly Summary, PM 10 Daily Summary, PM10 Hourly Summary, PM25 Frm Hourly Summary, PM25 NonFrm Daily Summary, PM25 NonFrm Hourly Summary, PM25 Speciation Daily Summary, PM25 Speciation Hourly Summary, Pressure Daily Summary, Pressure Hourly Summary, RH and DP Daily Summary, RH and DP Hourly Summary, SO2 Daily Summary, SO2 Hourly Summary, Temperature Daily Summary, Temperature Hourly Summary, VOC Daily Summary, VOC Hourly Summary, Wind Daily Summary, Wind Hourly Summary

    Checklist

    Note: Delete items below that aren't applicable to your pull request.

    • [x] Please merge this PR for me once it is approved.
    • [x] If this PR adds or edits a dataset or pipeline, it was reviewed and approved by the Google Cloud Public Datasets team beforehand.
    • [x] If this PR adds or edits a dataset or pipeline, I put all my code inside datasets/epa_historical_air_quality and nothing outside of that directory.
    • [x] This PR is appropriately labeled.
    cla: yes 
    opened by nlarge-google 6
  • feat: Onboard San Francisco Bikeshare Trips

    Checklist

    Note: Delete items below that aren't applicable to your pull request.

    • [x] Please merge this PR for me once it is approved.
    • [x] If this PR adds or edits a dataset or pipeline, it was reviewed and approved by the Google Cloud Public Datasets team beforehand.
    • [x] If this PR adds or edits a dataset or pipeline, I put all my code inside san_francisco_bikeshare_trips/bikeshare_trips and nothing outside of that directory.
    • [x] This PR is appropriately labeled.
    cla: yes 
    opened by nlarge-google 6
  • feat: Onboard Census opportunity atlas tract outcomes

    Description

    Tract Outcomes

    Checklist

    Note: Delete items below that aren't applicable to your pull request.

    • [x] Please merge this PR for me once it is approved.
    • [x] If this PR adds or edits a dataset or pipeline, it was reviewed and approved by the Google Cloud Public Datasets team beforehand.
    • [x] If this PR adds or edits a dataset or pipeline, I put all my code inside datasets/census_opportunity_atlas and nothing outside of that directory.
    • [x] This PR is appropriately labeled.
    opened by nlarge-google 5
  • feat: Onboard Census Bureau International Dataset

    Description

    Based on #

    Note: It's recommended to open an issue first for context and discussion.

    Checklist

    Note: Delete items below that aren't applicable to your pull request.

    • [x] Please merge this PR for me once it is approved.

    • [x] If this PR adds or edits a dataset or pipeline, it was reviewed and approved by the Google Cloud Public Datasets team beforehand.

    • [x] If this PR adds or edits a dataset or pipeline, I put all my code inside datasets/census_bureau_international and nothing outside of that directory.

    • [x] This PR is appropriately labeled.

    data onboarding cla: yes 
    opened by vasuc-google 5
  • Containerize custom tasks

    Note: The following is taken from @tswast's recommendation on a separate thread.

    What are you trying to accomplish?

    One of the Airflow "gotchas" is that workers share resources with the scheduler, so any "real work" that uses CPU and/or memory can cause slowdowns in the scheduler or even instability if memory is used up.

    The recommendation is to do any "real work" outside the Airflow workers themselves, for example in a KubernetesPodOperator that schedules the work on a separate node pool.

    What challenges are you running into?

    In the generated DAG, I see the following operator:

        # Run the custom/csv_transform.py script to process the raw CSV contents into a BigQuery friendly format
        process_raw_csv_file = bash_operator.BashOperator(
            task_id="process_raw_csv_file",
            bash_command="SOURCE_CSV=$airflow_home/data/$dataset/$pipeline/{{ ds }}/raw-data.csv TARGET_CSV=$airflow_home/data/$dataset/$pipeline/{{ ds }}/data.csv python $airflow_home/dags/$dataset/$pipeline/custom/csv_transform.py\n",
            env={'airflow_home': '{{ var.json.shared.airflow_home }}', 'dataset': 'covid19_tracking', 'pipeline': 'city_level_cases_and_deaths'},
        )
    

    I haven't looked closely at the csv_transform.py script yet, but I'd expect it to use non-trivial CPU / memory resources.

    For custom Python scripts such as this, I'd expect us to use the KubernetesPodOperator, where the work is scheduled on a separate node pool.

    Checklist

    • [x] I created this issue in accordance with the Code of Conduct.
    • [x] This issue is appropriately labeled.
    feature request 
    opened by adlersantos 5
  • Feat: Onboard Mimiciii dataset

    Description

    This is to onboard the mimiciii dataset with 25 pipelines, using Airflow v2 operators only.

    Checklist

    • [x] (Required) This pull request is appropriately labeled
    • [x] Please merge this pull request after it's approved

    Use the sections below based on what's applicable to your PR and delete the rest:

    Feature

    • [ ] I'm adding or editing a feature
    • [ ] I have updated the README accordingly
    • [ ] I have added/revised tests for the feature

    Data Onboarding

    • [x] I'm adding or editing a dataset
    • [x] The Google Cloud Datasets team is aware of the proposed dataset
    • [x] I put all my code inside datasets/mimiciii and nothing outside of that directory

    Code cleanup or refactoring

    • [x] I'm refactoring or cleaning up some code
    data onboarding 
    opened by Naveen130 4
  • Refactor: Combine New York pipelines into one

    Description

    These are changes and clean-up to the existing dataset pipelines for new-york:

    311_service_requests, citibike_stations, nypd_mv_collisions

    Checklist

    Note: If an item applies to you, all of its sub-items must be fulfilled

    • [x] (Required) This pull request is appropriately labeled
    • [x] Please merge this pull request after it's approved
    • [x] I'm adding or editing a dataset
      • [x] The Google Cloud Datasets team is aware of the proposed dataset
      • [x] I put all my code inside datasets/new_york and nothing outside of that directory
    • [x] I'm refactoring or cleaning up some code
    opened by nlarge-google 4
  • Feat: Onboard SEC Failure to Deliver dataset

    Checklist

    Note: If an item applies to you, all of its sub-items must be fulfilled

    • [x] (Required) This pull request is appropriately labeled
    • [x] Please merge this pull request after it's approved
    • [x] I'm adding or editing a dataset
      • [x] The Google Cloud Datasets team is aware of the proposed dataset
      • [x] I put all my code inside datasets/sec_failure_to_deliver and nothing outside of that directory
    • [x] I'm refactoring or cleaning up some code
    opened by nlarge-google 4
  • chore(deps): update dependency black to v22.12.0

    Mend Renovate

    This PR contains the following updates:

    Package: black (changelog)
    Change: ==22.10.0 -> ==22.12.0


    Release Notes

    psf/black

    v22.12.0

    Compare Source

    Preview style
    • Enforce empty lines before classes and functions with sticky leading comments (#​3302)
    • Reformat empty and whitespace-only files as either an empty file (if no newline is present) or as a single newline character (if a newline is present) (#​3348)
    • Implicitly concatenated strings used as function args are now wrapped inside parentheses (#​3307)
    • Correctly handle trailing commas that are inside a line's leading non-nested parens (#​3370)
    Configuration
    • Fix incorrectly applied .gitignore rules by considering the .gitignore location and the relative path to the target file (#​3338)
    • Fix incorrectly ignoring .gitignore presence when more than one source directory is specified (#​3336)
    Parser
    • Parsing support has been added for walruses inside generator expression that are passed as function args (for example, any(match := my_re.match(text) for text in texts)) (#​3327).
    Integrations
     • Vim plugin: Optionally allow using the system installation of Black via let g:black_use_virtualenv = 0 (#3309)

    Configuration

    📅 Schedule: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

    🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.

    Rebasing: Never, or you tick the rebase/retry checkbox.

    🔕 Ignore: Close this PR and you won't be reminded about these updates again.


    • [ ] If you want to rebase/retry this PR, check this box

    This PR has been generated by Mend Renovate. View repository job log here.

    dependencies 
    opened by renovate-bot 0
  • Fix: Onboard HRRR processes in NOAA ETL

    Description

    Notes:

    • If you are adding or editing a dataset, please specify the dataset folder involved, e.g. datasets/google_trends.
    • If you are an external contributor, please contact the Google Cloud Datasets team for your proposed dataset or feature.
    • If you are adding or editing a dataset, please do it one dataset at a time. Have all the code changes inside a single datasets/noaa folder.
    opened by nlarge-google 0
  • chore(deps): update dependency pandas-gbq to v0.18.0

    Mend Renovate

    This PR contains the following updates:

    Package: pandas-gbq
    Change: ==0.17.9 -> ==0.18.0


    Release Notes

    googleapis/python-bigquery-pandas

    v0.18.0

    Compare Source

    Features
    • Map "if_exists" value to LoadJobConfig.WriteDisposition (#​583) (7389cd2)

    Configuration

    📅 Schedule: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

    🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.

    Rebasing: Never, or you tick the rebase/retry checkbox.

    🔕 Ignore: Close this PR and you won't be reminded about these updates again.


    • [ ] If you want to rebase/retry this PR, check this box

    This PR has been generated by Mend Renovate. View repository job log here.

    dependencies 
    opened by renovate-bot 1
  • chore(deps): update dependency flake8 to v6

    Mend Renovate

    This PR contains the following updates:

    Package: flake8 (changelog)
    Change: ==5.0.4 -> ==6.0.0


    Release Notes

    pycqa/flake8

    v6.0.0

    Compare Source


    Configuration

    📅 Schedule: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

    🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.

    Rebasing: Never, or you tick the rebase/retry checkbox.

    🔕 Ignore: Close this PR and you won't be reminded about these updates again.


    • [ ] If you want to rebase/retry this PR, check this box

    This PR has been generated by Mend Renovate. View repository job log here.

    dependencies 
    opened by renovate-bot 0
  • Feat: Onboard Af dag notifications

    Description

    Notes:

    • If you are adding or editing a dataset, please specify the dataset folder involved, e.g. datasets/google_trends.
    • If you are an external contributor, please contact the Google Cloud Datasets team for your proposed dataset or feature.
    • If you are adding or editing a dataset, please do it one dataset at a time. Have all the code changes inside a single datasets/af_dag_notifications folder.
    opened by nlarge-google 0
Releases (v5.2.0)