A minimal caching proxy to GitHub's REST & GraphQL APIs

Overview

github-proxy

CircleCI Maintainability Test Coverage

A caching forward proxy to GitHub's REST and GraphQL APIs. GitHub-Proxy is a thin, highly extensible, highly configurable python framework based on Werkzeug. It comes with out-of-the-box support for Flask and Redis, but can be extended to integrate with other application frameworks, databases, and monitoring tools.

Features:

  • Caching of GitHub responses based on conditional requests.
  • Improved and granular monitoring of client usage and rate-limit consumption.
  • Provides a central and extensible pool of GitHub credentials (either GitHub Apps or user PATs) enabling rotation of rate-limited tokens. Negates the need of managing a GitHub App or bot user account per client.
  • Coarse-grained and highly configurable authorization of clients based on API resource scopes.
  • 100% compatible with GitHub's REST and GraphQL interfaces as well as the Enterprise API.

Install

Core framework:

$ pip install github-proxy

With support for Redis as a cache backend:

$ pip install github-proxy[redis]

With Flask support:

$ pip install github-proxy[flask]

Docker image: (Coming Soon)

$ docker pull babylonhealth/github-proxy

Usage

Flask integration (see example):

from github_proxy import blueprint

app = Flask(__name__)

app.register_blueprint(blueprint)
app.register_blueprint(
    blueprint, name="github_enterprise_proxy", url_prefix="/api/v3"
)  # enterprise server

if __name__ == "__main__":
    app.run()

Core framework (only needed for non-Flask applications):

# Explicitly construct a proxy instance
from github_proxy import Proxy, Config, CacheBackend, TelemetryCollector

config = Config()
proxy = Proxy(
    github_api_url=config.github_api_url,
    github_token_config=config,
    cache=CacheBackend.factory(config),
    rate_limited={},
    clients=config.clients,
    tel_collector=TelemetryCollector.from_type(config.tel_collector_type),
)

# Or inject an instance loaded from the environment
from github_proxy import inject_proxy
from werkzeug import Request, Response

@inject_proxy
def request_handler(proxy: Proxy, request: Request) -> Response:
    return proxy.request(request.path, request, "foo-client")

Registered clients (see client registry file) can then integrate with the proxy using their proxy client token. A proxy client token may be used as a regular GitHub PAT in SAML SSO Authentication:

$ curl -H "Authorization: token ${CLIENT_TOKEN}" http://localhost:5000/zen
Keep it logically awesome.

Architecture

The need for such a solution stemmed from Babylon's reliance on GitOps as an operational and change release framework. This led to a high (and at times abusive) usage of the GitHub API through a limited number of GitHub bot users. Frequent rate-limiting and lack of observability in terms of which client/workflow/team is abusing the API resulted in suboptimal developer experience.

High usage clients of the GitHub API are usually CI/CD pipelines and automated tests. These workflows are traditionally implemented as a collection of job processes executing independently to each other. This setup does not allow hot resources (and their Etags) to be shared across different workflows, or even jobs of the same workflow.

The GitHub-Proxy provides a centralised store of Etags that can be shared and re-used amongst its client base, letting workflows take full advantage of conditional requests which do not count against the rate limit.

graph LR
  subgraph clients
  A[CI job foo]
  C[CD job bar]
  D[Synthetic test baz]
  end
  A -- Auth via `Authorization: token REDACTED` HTTP header --> B[GitHub Proxy]
  C --> B
  D --> B
  B -- Auth via GitHub App or user PAT --> E[GitHub]
  B -- Read/Write responses of GitHub --> F[(Cache)]
  B -- Reporting of usage --> G[(Telemetry Collector)]

Sequence diagrams:

Configuring the proxy

By default, the proxy loads its configuration using the github_proxy.Config class from the following 2 sources:

  1. Environment variables
  2. Client registry file

Environment variables

Variable Description Default
GITHUB_API_URL Base url of the GitHub API server. https://api.github.com
CACHE_TTL The TTL (in seconds) of the cache that stores GitHub responses. 3600
CACHE_BACKEND_URL URI of the cache backend that stores GitHub responses. The scheme of the URI infers the cache backend type. inmemory://
GITHUB_CREDS_CACHE_MAXSIZE The max size of the inmemory cache used for storing rate limited GitHub credentials. 256
GITHUB_CREDS_CACHE_TTL_PADDING The TTL padding (in minutes) of the inmemory cache used for storing rate limited GitHub credentials. This padding accounts for potential clock drift between the proxy and the GitHub servers. 10
TELEMETRY_COLLECTOR_TYPE The type of telemetry collector to be used. noop
CLIENT_REGISTRY_FILE_PATH (Required) Path to the client registry file. See here for more. n/a
GITHUB_PAT_* Variable pattern to specify GitHub user PATs that the proxy can use when integrating with the GitHub API. Example variable name: GITHUB_PAT_FOO. n/a
GITHUB_APP_*_ID Variable pattern to specify GitHub App IDs that the proxy can use when integrating with the GitHub API. Example variable name: GITHUB_APP_BAR_ID. n/a
GITHUB_APP_*_INSTALLATION_ID Variable pattern to specify the GitHub App installation IDs that correspond to each of the GitHub App IDs. Example variable name: GITHUB_APP_BAR_INSTALLATION_ID. n/a
GITHUB_APP_*_PEM Variable pattern to specify the GitHub App private keys that correspond to each of the GitHub App IDs. Example variable name: GITHUB_APP_BAR_PEM. n/a

Client registry file

This file specifies the set of clients that are authorized to integrate with the proxy:

---
version: 1
clients:
  - name: test
    token: H+hYxlecgRq7yfmhq2COlJk7tpSwDmdsp8thdPsnbnQ=
  - name: read_only
    token: oed4+Uo4s4mgwstjSAY/N+HSOsGwfbX91QxqSOjsVlU=
    scopes:
    - method: GET
      path: .*
...

The tokens included in this file are the authorization tokens that clients need to pass to the proxy (instead of GitHub tokens). The name of each client should be unique and is to be used for telemetry purposes. The scopes define the resources and methods of the REST API that each of the clients is authorized to access (default to full access).

Tokens within this file must be treated as secrets. Since secrets cannot be commited to VCS, the registry file can also be provided as a Jinja2 template, enabling the injection of secrets at runtime through env variables:

version: 1
clients:
  - name: test
    token: {{ env.TOKEN_TEST }}

Extending the proxy

Adding a new type of cache backend:

from github_proxy import CacheBackend, CacheBackendConfig

class PostgresCacheBackend(CacheBackend, scheme="postgres"):
    # Implement the __init__, _get, _set, and _make_key methods
    # of the CacheBackend interface
    pass

Adding a new type of telemetry collector:

from github_proxy import TelemetryCollector

class JaegerTelemetryCollector(TelemetryCollector, type_="jaeger"):
    # Implement the collect_gh_response_metrics and collect_proxy_request_metrics 
    # methods of the TelemetryCollector interface
    pass

Once imported, the above extensions can be selected using the respective CACHE_BACKEND_URL and TELEMETRY_COLLECTOR_TYPE env variables.

Relevant references

  1. Google's magic GitHub proxy: Proxy that enables IAM for GitHub API tokens.
  2. Sourcegraph's GitHub proxy: Provides enhanced observability.
You might also like...
 Pancakeswap Sniper BOT - TORNADO CASH Proxy (MAC WINDOWS ANDROID LINUX) A fully decentralized protocol for private transactions
Pancakeswap Sniper BOT - TORNADO CASH Proxy (MAC WINDOWS ANDROID LINUX) A fully decentralized protocol for private transactions

TORNADO CASH Proxy Pancakeswap Sniper BOT 2022-V1 (MAC WINDOWS ANDROID LINUX) ⭐️ A fully decentralized protocol for private transactions ⭐️ AUTO DOWNL

TORNADO CASH Proxy Pancakeswap Sniper BOT 2022-V1 (MAC WINDOWS ANDROID LINUX)
TORNADO CASH Proxy Pancakeswap Sniper BOT 2022-V1 (MAC WINDOWS ANDROID LINUX)

TORNADO CASH Pancakeswap Sniper BOT 2022-V1 (MAC WINDOWS ANDROID LINUX) ⭐️ A ful

Simple Webhook Spammer with Optional Proxy Support
Simple Webhook Spammer with Optional Proxy Support

😎 �Simple Webhook Spammer with Optional Proxy Support:- [+] git clone https://g

a public repository helping ML/DL engineers and DS to beautify the notebook with minimal coding.

ml-helper-functions a public repository helping ML/DL engineers and DS to beautify the notebook with minimal coding.

Minimal telegram voice chat music bot, in pyrogram.

VCBOT Fully working VC (user)Bot, based on py-tgcalls and py-tgcalls-wrapper with minimal features. Deploying To heroku: Local machine/VPS: git clone

Send Informative, Concise Slack Notifications With Minimal Effort

slack-templates Send Informative, Concise Slack Notifications With Minimal Effort slack-templates Slack Integration Available Templates Usage Report t

Minimal API for the COVID Booking System of the Offices at the UniPD Math Dep

Simple and easy to use python BOT for the COVID registration booking system of the math department @ unipd (torre archimede). This API creates an interface with the official website, with more useful functionalities.

A free, minimal, lightweight, cross-platform, easily expandable  Twitch IRC/API bot.
A free, minimal, lightweight, cross-platform, easily expandable Twitch IRC/API bot.

parky's twitch bot A free, minimal, lightweight, cross-platform, easily expandable Twitch IRC/API bot. Features 🔌 Connect to Twitch IRC chat! 🔌 Conn

Lamblayer: a minimal deployment tool for AWS Lambda layers

lamblayer lamblayer is a minimal deployment tool for AWS Lambda layers. lamblayer does, Create a Layers of built pip-installable python packages. Crea

Comments
  • [CLOUD-5072] Porting the codebase to a new repo

    [CLOUD-5072] Porting the codebase to a new repo

    • Spike application
    • Spike
    • Add some todos
    • Remove injectors
    • Deployable service
    • Last modified headers (#3)
    • Introducing HTTP token auth (#4)
    • Enterprise server support (#5)
    • Log incoming cache headers (#6)
    • Using a lazy pool of GitHub credentials (#7)
    • Proxy should not fail if cache is down (#8)
    • Support TLS redis (#10)
    • Telemetry (#13)
    • Change custom prom metric names (#14)
    • Change metric labels (#15)
    • Integration tests + refactoring (#17)
    • Media type of a resource should be part of the cache key (#18)
    • Add healthcheck (#19)
    • Better separation between cache misses and uncacheable (#20)
    • Noop renaming of credentials to tokens (#21)
    • Query params to be used when indexing cached responses (#23)
    • Re-using TCP connections (#26)
    • Filter the whole set of hop-by-hop headers (#28)
    • Filter host and content headers (#29)
    • Authorize clients based on scopes (#30)
    • Improve docs (#31)
    • Remove promnight dependency from the github_proxy package
    • abstractmethods
    • Remove hard dependencies on redis and flask
    • Using a lazy pool of GitHub credentials (#7)
    • Telemetry (#13)
    • Integration tests + refactoring (#17)
    • Media type of a resource should be part of the cache key (#18)
    • Add healthcheck (#19)
    • Better separation between cache misses and uncacheable (#20)
    • Noop renaming of credentials to tokens (#21)
    • Query params to be used when indexing cached responses (#23)
    • Re-using TCP connections (#26)
    • Authorize clients based on scopes (#30)
    • Porting the codebase over to a new repo
    opened by dedoussis 0
Releases(v0.4.5)
  • v0.4.5(May 9, 2022)

    What's Changed

    • [CLOUD-5072] Add cryptography dependency by @dedoussis in https://github.com/babylonhealth/github-proxy/pull/7

    Full Changelog: https://github.com/babylonhealth/github-proxy/compare/v0.4.4...v0.4.5

    Source code(tar.gz)
    Source code(zip)
  • v0.4.4(May 9, 2022)

    What's Changed

    • [CLOUD-5072] Retry poetry by @dedoussis in https://github.com/babylonhealth/github-proxy/pull/6

    Full Changelog: https://github.com/babylonhealth/github-proxy/compare/v0.4.3...v0.4.4

    Source code(tar.gz)
    Source code(zip)
  • v0.4.3(May 9, 2022)

    What's Changed

    • [CLOUD-5072] Retry poetry by @dedoussis in https://github.com/babylonhealth/github-proxy/pull/5

    Full Changelog: https://github.com/babylonhealth/github-proxy/compare/v0.4.2...v0.4.3

    Source code(tar.gz)
    Source code(zip)
  • v0.4.2(May 9, 2022)

    What's Changed

    • [CLOUD-5072] Retry poetry by @dedoussis in https://github.com/babylonhealth/github-proxy/pull/4

    Full Changelog: https://github.com/babylonhealth/github-proxy/compare/v0.4.0...v0.4.2

    Source code(tar.gz)
    Source code(zip)
  • v0.4.1(May 9, 2022)

    What's Changed

    • [CLOUD-5072] Fix poetry dynamic versioning by @dedoussis in https://github.com/babylonhealth/github-proxy/pull/3

    Full Changelog: https://github.com/babylonhealth/github-proxy/compare/v0.3.0...v0.4.1

    Source code(tar.gz)
    Source code(zip)
  • v0.4.0(May 9, 2022)

    What's Changed

    • [CLOUD-5072] Fix poetry dynamic versioning by @dedoussis in https://github.com/babylonhealth/github-proxy/pull/3

    Full Changelog: https://github.com/babylonhealth/github-proxy/compare/v0.3.0...v0.4.0

    Source code(tar.gz)
    Source code(zip)
  • v0.3.0(May 9, 2022)

    What's Changed

    • [CLOUD-5072] Fix circleci tag filtering by @dedoussis in https://github.com/babylonhealth/github-proxy/pull/2

    Full Changelog: https://github.com/babylonhealth/github-proxy/compare/v0.2.0...v0.3.0

    Source code(tar.gz)
    Source code(zip)
  • v0.2.0(May 9, 2022)

    What's Changed

    • [CLOUD-5072] Porting the codebase to a new repo by @dedoussis in https://github.com/babylonhealth/github-proxy/pull/1

    New Contributors

    • @dedoussis made their first contribution in https://github.com/babylonhealth/github-proxy/pull/1

    Full Changelog: https://github.com/babylonhealth/github-proxy/commits/v0.2.0

    Source code(tar.gz)
    Source code(zip)
Owner
Babylon Health
Putting an accessible and affordable health service in the hands of every person on earth.
Babylon Health
Add Me To Your Group Enjoy With Me. Pyrogram bot. https://t.me/TamilSupport

SongPlayRoBot 3X Fast Telethon Based Bot ⚜ Open Source Bot 👨🏻‍💻 Demo : SongPlayRoBot 💃🏻 Easy To Deploy 🤗 Click Below Image to Deploy DEPLOY Grou

IMVETRI 850 Dec 30, 2022
GitNews: Github webhooks for Telegram

GitNews - Github webhooks for Telegram Setup: server: clone repo git clone https

Druv Jagdish 1 Feb 14, 2022
A Telegram Video Merge Bot by @AbirHasan2005

VideoMerge-Bot This is very simple Telegram Videos Merge Bot by @AbirHasan2005. Using FFmpeg for Merging Videos. Features: Merge Multiple Videos. User

Abir Hasan 57 Nov 12, 2022
Modular Telegram bot running on Python

Modular Telegram bot running on Python

Jefanya Efandchris 1 Dec 26, 2021
ByDiego Token Grabber is a Discord Stealer

ByDiego Token Grabber is a Discord Stealer. This way you can get too much information from x person if you pass it on and open it

zByDiegoM.T 4 Mar 11, 2022
This is a Discord script that will provide a QR Code to your scholars for Axie Infinity.

DiscordQRCodeBot This is a Discord script that will provide a QR Code to your Axie Infinity scholars. Setup Run Ubuntu on AWS ec2 instance Dowloads al

ZracheSs | xyZ 24 Oct 05, 2022
A simpler way to make forms, surveys, and reaction input using discord.py.

discord-ext-forms An easier way to make forms and surveys in discord.py. This module is a very simple way to ask questions and create complete forms i

thrizzle 16 Nov 06, 2022
The worst but simplest webhook bot for GitHub and Matrix.

gh-bot gh-bot is maybe the worst (but simplest) Matrix webhook bot for Github. Example of commits: Example of workflow finished: Setting up Server You

Jae Lo Presti 4 Aug 18, 2022
🛰️ Scripts démontrant l'utilisation de l'imagerie RADARSAT-1 à partir d'un seau AWS | 🛰️ Scripts demonstrating the use of RADARSAT-1 imagery from an AWS bucket

🛰️ Scripts démontrant l'utilisation de l'imagerie RADARSAT-1 à partir d'un seau AWS | 🛰️ Scripts demonstrating the use of RADARSAT-1 imagery from an AWS bucket

Agence spatiale canadienne - Canadian Space Agency 4 May 18, 2022
Python client for the iNaturalist APIs

pyinaturalist Introduction iNaturalist is a community science platform that helps people get involved in the natural world by observing and identifyin

Nicolas Noé 79 Dec 22, 2022
Python bindings for LibreTranslate

Python bindings for LibreTranslate

Argos Open Tech 42 Jan 03, 2023
Mega.nz to GDrive uploader

Mega.nz to GDrive uploader With this telegram bot you can download files from mega.nz and upload those files or telegram uploaded files to GDrive. You

30 Nov 13, 2022
Python API for working with RESQML models

resqpy: Python API for working with RESQML models Introduction resqpy is a pure python package which provides a programming interface (API) for readin

BP 44 Dec 14, 2022
Fun telegram bot =)

Recolor Bot About Fun telegram bot, that can change your hair color. Preparations Update package lists sudo apt-get update; Make sure Git and docker-c

Just Koala 4 Jul 09, 2022
Super simple anti-spam Discord bot

AutoAntiRaidBot Super simple anti-spam Discord bot. Will automatically kick any member with an account made under 1 day ago, and will ban any member w

Kainoa Kanter 6 Jun 27, 2022
This is a repository for the Duke University Cloud Computing course project on Serveless Data Engineering Pipeline. For this project, I recreated the below pipeline.

AWS Data Engineering Pipeline This is a repository for the Duke University Cloud Computing course project on Serverless Data Engineering Pipeline. For

15 Jul 28, 2021
Shiny Wechat Pay SDK for Python

WeChat third-party Python SDK master: Read the Documentation Features Common public platforms passively respond and actively call APIs WeChat Pay API

Obrisk 18 Sep 05, 2022
A Tool to scrape URLs for a given domain from wayback machine, Commoncrawl and OTX Alienvault

Mr_URL Mr.URL fetches known URLs for a given domain from Wayback Machine, Commoncrawl and OTX Alienvault. It also finds old versions of any given URL

Stinger 9 Sep 05, 2022
A Telegram Bot written in Python for mirroring files on the Internet to your Google Drive

No support is going to be provided of any kind, only maintaining this for vps user on request. This is a Telegram Bot written in Python for mirroring

Sunil Kumar 42 Oct 28, 2022
It's a discord.py simulator.

DiscordPySimulator It's a discord.py simulator. ⚠️ Things to fix Context As you may know, discord py commands provide the context as the first paramet

Juan Sebastián 11 Oct 24, 2022