GitHub Actions + Python code to extract URLs of code repositories and put them into a standard form, starting with GitHub

Overview

repo_link_extractor

---- NOTE: JUST STARTED; ONLY AN IDEA TO COME BACK TO ----

Summary

First minimum viable product goal

The first minimum viable product goal is to harvest every GitHub repository URL from https://github.com/softwareunderground/awesome-open-geoscience/blob/main/README.md and return each one in the form https://github.com/ + "username" + "/" + "repository name". The "username/repository name" part is then added to the "repos" key of an existing JSON file such as https://github.com/softwareunderground/open_geosciene_code_projects_viz/blob/main/_explore/input_lists.json , which is summarized below:

{
    "memberOrgs": [
        "softwareunderground"
    ],
    "orgs": [
        "agile-geoscience",
        "softwareunderground"
    ],
    "repos": [
        "ahotovec/redpy",
        "whamlyn/auralib"
    ]
}
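
A minimal sketch of the normalization this goal describes, using only the standard library. The function names `to_canonical` and `to_slug` are hypothetical and not part of this repo. It assumes the first two path segments are always owner and repository, so non-repository pages on github.com would need extra filtering.

```python
from typing import Optional
from urllib.parse import urlparse

GITHUB_PREFIX = "https://github.com/"


def to_canonical(url: str) -> Optional[str]:
    """Return 'https://github.com/<owner>/<repo>', or None if url is not a GitHub repo URL."""
    parsed = urlparse(url)
    if parsed.netloc.lower() not in ("github.com", "www.github.com"):
        return None
    # Keep only non-empty path segments, e.g. ['owner', 'repo', 'blob', 'main', ...]
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) < 2:
        return None  # an org or user page, not a repository
    owner, repo = parts[0], parts[1]
    if repo.endswith(".git"):
        repo = repo[: -len(".git")]
    return f"{GITHUB_PREFIX}{owner}/{repo}"


def to_slug(canonical: str) -> str:
    """Turn a canonical URL into the 'owner/repo' form used in the "repos" key."""
    return canonical[len(GITHUB_PREFIX):]
```

For example, `to_slug(to_canonical("https://github.com/ahotovec/redpy.git"))` gives `"ahotovec/redpy"`, the form shown in the "repos" list above.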

2nd intermediate product goal

  • Runs from a GitHub Action

3rd intermediate product goal

Eventual product goal

  • Works for public, internal, and private GitHub URLs
  • Works for GitHub, GitLab, Bitbucket, and other code repository URLs & APIs
  • Keeps track of harvest date, source file name, source file URL & code platform & domain in an intermediate file.
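
One possible shape for an entry in that intermediate file, as a rough sketch. The `HarvestRecord` class, the `make_record` helper, and the `PLATFORMS` table are all hypothetical. The domain-to-platform mapping covers only the public hosts, so internal or self-hosted instances would come back as "unknown" until added.

```python
from dataclasses import dataclass, asdict
from datetime import date
from typing import Optional
from urllib.parse import urlparse

# Assumed mapping of public domains to platform names; extend as needed.
PLATFORMS = {
    "github.com": "GitHub",
    "gitlab.com": "GitLab",
    "bitbucket.org": "Bitbucket",
}


@dataclass
class HarvestRecord:
    url: str
    domain: str
    platform: str
    source_file_name: str
    source_file_url: str
    harvest_date: str  # ISO 8601 date, e.g. "2022-01-01"


def make_record(url: str, source_file_url: str,
                harvested_on: Optional[date] = None) -> HarvestRecord:
    """Build one record for a harvested repository URL."""
    domain = urlparse(url).netloc.lower()
    # The source file name is the last path segment of the source file URL.
    source_file_name = urlparse(source_file_url).path.rsplit("/", 1)[-1]
    return HarvestRecord(
        url=url,
        domain=domain,
        platform=PLATFORMS.get(domain, "unknown"),
        source_file_name=source_file_name,
        source_file_url=source_file_url,
        harvest_date=(harvested_on or date.today()).isoformat(),
    )
```

`asdict(record)` turns a record into a plain dict that `json.dump` can write out.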

Related Projects

This is referenced on an issue here: https://github.com/softwareunderground/open_geosciene_code_projects_viz/issues/23

Potential Useful Bits

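The regular expression (https:\/\/github.com\/)\w+(\/)\w+ seems like a good starting point for extracting GitHub URLs. Note that \w does not match hyphens, so names such as agile-geoscience would need a broader character class.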
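
A small sketch comparing the starting-point pattern with a slightly broader one. `BROADER_PATTERN` and `find_github_urls` are assumptions for illustration, not code from this repo, and the broader pattern can still pick up a trailing period when a URL ends a sentence.

```python
import re

# The pattern from the note above, unchanged.
SOURCE_PATTERN = re.compile(r"(https:\/\/github.com\/)\w+(\/)\w+")

# Assumed variant: also allows '-' and '.' in owner and repository names.
BROADER_PATTERN = re.compile(r"https://github\.com/[\w.-]+/[\w.-]+")


def find_github_urls(text, pattern=BROADER_PATTERN):
    """Return every substring of text matched by pattern, in order of appearance."""
    return [m.group(0) for m in pattern.finditer(text)]


sample = (
    "See [redpy](https://github.com/ahotovec/redpy) and "
    "[bruges](https://github.com/agile-geoscience/bruges)"
)
print(find_github_urls(sample, SOURCE_PATTERN))
print(find_github_urls(sample))
```

With the original pattern the second link is skipped entirely because of the hyphen in the owner name; the broader pattern returns both.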

Tentative GitHub Actions structure:

  • download the README file
  • replace the old README file with the new one
  • extract all links matching a regular expression
  • sort & remove duplicates
  • make into JSON with domain, URL, org or username, repository name, source file name, source file link, and date of harvest
  • pull out the org or username & repository name from the above and put them into the appropriate key of the JSON file, if not already present in either the orgs or repos keys.
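
The last step above could look roughly like the sketch below. `merge_slugs` is a hypothetical helper, and it takes one reading of "not already there": a repository is skipped if its owner is already listed under "orgs" or "memberOrgs", or if its "owner/repo" slug is already under "repos". Comparison is case-sensitive, which is a simplification.

```python
def merge_slugs(config: dict, slugs) -> dict:
    """Return a copy of config with new 'owner/repo' slugs merged into "repos"."""
    known_orgs = set(config.get("orgs", [])) | set(config.get("memberOrgs", []))
    repos = set(config.get("repos", []))  # a set removes duplicates
    for slug in slugs:
        owner = slug.split("/", 1)[0]
        if owner in known_orgs:
            continue  # assumed to be covered already by the org-level entry
        repos.add(slug)
    merged = dict(config)  # shallow copy; the input dict is left unmodified
    merged["repos"] = sorted(repos)
    return merged


config = {
    "memberOrgs": ["softwareunderground"],
    "orgs": ["agile-geoscience", "softwareunderground"],
    "repos": ["ahotovec/redpy", "whamlyn/auralib"],
}
# "example-user/example-repo" is made-up input for illustration only.
merged = merge_slugs(
    config,
    ["agile-geoscience/bruges", "whamlyn/auralib", "example-user/example-repo"],
)
```

The result can be written back with `json.dump(merged, f, indent=4)` to match the layout of the existing file.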

How to integrate into https://github.com/softwareunderground/open_geosciene_code_projects_viz ?

Options:

  1. Put all of the code here into the repository: https://github.com/softwareunderground/open_geosciene_code_projects_viz
  2. Call the code here from https://github.com/softwareunderground/open_geosciene_code_projects_viz

If calling the code:
  • (1) add the script to read the README to MASTER.sh as the first step
  • (2) set master.sh to be called by GitHub Actions
  • (3) when triggered, the GitHub Action runs the entirety of the GitHub Actions in this repo, including calling the Python scripts as its first step.
  • (4) later steps include setting up the environment and calling all the Python scripts that the master.sh bash script calls. The code would need to be called either by a GitHub Action (on pull request, push, manual trigger, or cron job) or by a trigger after the call to refresh the

Owner
Justin Gosses