Haphazard scripts for scraping bitcoin/bitcoin data from GitHub

Last update: Oct 12, 2022

Related tags

Web Crawling bitcoin-github-scrape

Overview

This is a quick-and-dirty tool for scraping pull request and comment data from the bitcoin/bitcoin repository on GitHub.

Each output/<pr number> folder contains:

comments.json: an aggregated list of both issue and review comments, in GitHub's original format
commits.json: a list of commit objects corresponding to the PR, in GitHub's original format
pr.json: the pull request object, in GitHub's original format
comments_abbrev.csv: an abbreviated representation of each comment, in CSV format
pr_abbrev.csv: an abbreviated representation of the PR, in CSV format
done: a sentinel file recording the datetime at which the PR data was retrieved
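
As a rough illustration of how this layout might be consumed, the sketch below loads the three JSON files for one PR. The function name load_pr and its parameters are hypothetical and not part of the tool. It assumes only the file names listed above and that the JSON files parse as standard JSON.

```python
import json
from pathlib import Path


def load_pr(output_root: Path, pr_number: int) -> dict:
    """Load the JSON files for one scraped PR (hypothetical helper)."""
    pr_dir = output_root / str(pr_number)

    # The done sentinel marks a completed scrape; refuse partial folders.
    if not (pr_dir / "done").exists():
        raise FileNotFoundError(f"no done sentinel in {pr_dir}")

    data = {}
    for name in ("pr", "commits", "comments"):
        with open(pr_dir / f"{name}.json", encoding="utf-8") as fh:
            data[name] = json.load(fh)
    return data
```

A call such as load_pr(Path("output"), 12345) would return a dict with the keys pr, commits and comments. The PR number here is only a placeholder.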

Limitations

Right now this does not properly handle open PRs (or any PRs that are expected to be updated): once the done sentinel is created, the data is never refreshed. This could be fixed by comparing various timestamps against the done sentinel and overwriting the stored data when it is stale.
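
One possible shape for that fix is sketched below. It is not the tool's current behavior. It assumes the sentinel file's modification time is a good enough record of when the data was fetched (the content format of done is not specified here), and that the caller supplies the PR's updated_at value from GitHub's pull request object. The helper names are hypothetical.

```python
from datetime import datetime, timezone
from pathlib import Path


def parse_github_time(value: str) -> datetime:
    """Parse a GitHub-style UTC timestamp such as '2022-10-12T08:30:00Z'."""
    # fromisoformat() rejects a trailing 'Z' before Python 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def needs_refresh(pr_dir: Path, remote_updated_at: str) -> bool:
    """Return True if the PR folder is missing or older than the remote PR."""
    done = pr_dir / "done"
    if not done.exists():
        return True  # never scraped, or the scrape did not finish

    # Assumption: the sentinel's mtime approximates the retrieval time.
    fetched_at = datetime.fromtimestamp(done.stat().st_mtime, tz=timezone.utc)
    return parse_github_time(remote_updated_at) > fetched_at
```

When needs_refresh returns True, the scraper could fetch the PR again and overwrite the folder, recreating the sentinel last so that an interrupted run is retried.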


Owner

James O'Beirne
