Centralized whale instance using github actions, sourcing metadata from bigquery-public-data.

Overview

Whale Demo Instance: Bigquery Public Data

This is a fully-functioning demo instance of the whale data catalog, actively scraping data from Bigquery's public project bigquery-public-data using github actions.

To test out this repo with your own local installation of whale (i.e. to emulate what it'd be like to set up whale on github for your own team), clone the repo to your ~/.whale directory (if you already have a ~/.whale directory, move it or delete it with rm -rf ~/.whale or the clone won't work):

git clone https://github.com/dataframehq/whale-bigquery-public-data ~/.whale

Then install whale and run the following commands, following the prompts:

wh git-enable
wh schedule

At this point, if you run wh pull, it should run a git pull --autostash --rebase against this repo (meaning any locally scheduled cron jobs will simply pull down fresh metadata from this repo, rather than scraping directly from Bigquery).

For more information on how to set this up for your own warehouse, see the docs.

FAQ

Why doesn't wh.pull() work locally?

While wh pull (the CLI hook) will check for a flag in wh config and act appropriately (sourcing from github if is_git_etl_enabled is True, and from the connections in wh connections if not), wh.pull() (the python hook) performs no such check. This is by design, to ensure the remote repository's associated CI/CD pipelines to pull down data directly from the metadata source, by default (we suspect most people do not want to refresh metadata using the python client, but feel free to open an issue if you disagree).

Locally, this will fail, unless you modify the key_path value specified in wh connections to some credentials you have stored locally. If you choose to do this, ensure you have the following permissions enabled in the associated service account: BigQuery Data Viewer, BigQuery Job User, BigQuery Metadata Viewer.

Owner
Hyperquery
All-in-one workspace for data analytics
Hyperquery
A code to match you with the perfect Taylor Swift song for your mood and relationship status.

taylorswift A package for matching your current mood and relationship status to a suitable Taylor Swift song. Requirements: Python 2 or 3, and the num

Megan Mansfield 82 Dec 09, 2022
Discord rich-presence implementation for VALORANT

not working on v1 anymore in favor of v2, but if there's any big bugs i'll try to fix them valorant-rich-presence-client Discord rich presence extensi

colinh 278 Jan 08, 2023
Discord Rich Presence implementation for Plex.

Perplex Perplex is a Discord Rich Presence implementation for Plex. Features Modern and beautiful Rich Presence for both movies and TV shows The Movie

Ethan 52 Dec 19, 2022
Media Replay Engine (MRE) is a framework to build automated video clipping and replay (highlight) generation pipelines for live and video-on-demand content.

Media Replay Engine (MRE) is a framework for building automated video clipping and replay (highlight) generation pipelines using AWS services for live

Amazon Web Services - Labs 30 Nov 29, 2022
Diablo II Resurrected Diablo Clone Running Room Mgr

d2rdc Diablo II Resurrected Diablo Clone Running Room Mgr Install Dependencies pip install fastapi pip install uvicorn Running uvicorn init:app INFO:

1 Dec 03, 2021
AnyAPI is a library that helps you to write any API wrapper with ease and in pythonic way.

AnyAPI AnyAPI is a library that helps you to write any API wrappers with ease and in pythonic way. Features Have better looking code using dynamic met

Fatih Kilic 129 Sep 20, 2022
A python script to download twitter space, only works on running spaces (for now).

A python script to download twitter space, only works on running spaces (for now).

279 Jan 02, 2023
Linky bot, A open-source discord bot that allows you to add links to ur website, youtube url, etc for the people all around discord to see!

LinkyBot Linky bot, An open-source discord bot that allows you to add links to ur website, youtube url, etc for the people all around discord to see!

AlexyDaCoder 1 Sep 20, 2022
Twitter-bot - A Simple Twitter bot with python

twitterbot To use this bot, You will require API Key and Access Key. Signup at h

Bentil Shadrack 8 Nov 18, 2022
RChecker - Checker for minecraft servers

🔎 RChecker v1.0 Checker for Minecraft Servers 💻 Supported operating systems: ✅

Pedro Vega 1 Aug 30, 2022
Fully Automated Omegle Chatbot

omegle-bot tutorial features fast runs in background can run multiple instances at once Requirement Run this command in cmd, terminal or PowerShell (i

6 Aug 07, 2021
A course on getting started with the Twitter API v2 for academic research

Getting started with the Twitter API v2 for academic research Welcome to this '101 course' on getting started with academic research using the Twitter

@TwitterDev 426 Jan 04, 2023
Generate and Visualize Data Lineage from query history

Tokern Lineage Engine Tokern Lineage Engine is fast and easy to use application to collect, visualize and analyze column-level data lineage in databas

Tokern 237 Dec 29, 2022
Automatically compile an AWS Service Control Policy that ONLY allows AWS services that are compliant with your preferred compliance frameworks.

aws-allowlister Automatically compile an AWS Service Control Policy that ONLY allows AWS services that are compliant with your preferred compliance fr

Salesforce 189 Dec 08, 2022
Save data from Instagram takeout to a SQLite database

instagram-to-sqlite Save data from a Instagram takeout to a SQLite database. Mise En Place git clone https://github.com/gavindsouza/instagram-to-sqlit

gavin 8 Dec 13, 2022
Music bot for playing music on telegram voice chat group.

Somali X Music 🎵 Music bot for playing music on telegram voice chat group. Requirements FFmpeg NodeJS nodesource.com Python 3.8+ or Higher PyTgCalls

Abdisamad Omar Mohamed 4 Dec 01, 2021
Herramienta para transferir eventos de Sucuri WAF hacia Azure Monitor Log Analytics.

Transfiere eventos de Sucuri hacia Azure LogAnalytics Script para transferir eventos del Sucuri Web Application Firewall (WAF) hacia Azure LogAnalytic

CSIRT-RD 1 Dec 22, 2021
This Bot Can Upload Video from Link Of Pdisk to Pdisk using its API. @PredatorHackerzZ

𝐏𝐝𝐢𝐬𝐤 𝐂𝐨𝐧𝐯𝐞𝐫𝐭𝐞𝐫 𝐁𝐨𝐭 Make short link by using 𝐏𝐝𝐢𝐬𝐤 API key Installation 𝐓𝐡𝐞 𝐄𝐚𝐬𝐲 𝐖𝐚𝐲 𝐑𝐞𝐪𝐮𝐢𝐫𝐞𝐝 𝐕𝐚𝐫𝐢𝐚𝐛𝐥𝐞

ρяє∂αтσя 25 Dec 02, 2022
Davide Gallitelli 3 Dec 21, 2021
Prabashwara's Pm Bot repository. You can deploy and edit this repository.

Tᴇʟᴇɢʀᴀᴍ Pᴍ Bᴏᴛ | Prabashwara's PM Bot Unmaintained. The new repo of @Pm-Bot is private. (It is no longer based on this source code. The completely re

Rivibibu Prabshwara Ⓒ 2 Jul 05, 2022