jamesob/bitcoin-github-scrape

This is a quick-and-dirty tool used to scrape bitcoin/bitcoin pull request and commentary data.

Each output/<pr number> folder contains:

  • comments.json: an aggregated list of both issue and review comments, in GitHub's original format
  • commits.json: a list of commit objects corresponding to the PR, in GitHub's original format
  • pr.json: the pull request object, in GitHub's original format
  • comments_abbrev.csv: abbreviated representation of each comment in CSV format
  • pr_abbrev.csv: abbreviated representation of the PR in CSV format
  • done: a sentinel file recording the datetime at which the PR data was retrieved
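
As a rough illustration of how the files listed above might be consumed, here is a minimal Python sketch that reads the three JSON files from one PR folder. The helper name `load_pr_folder` is hypothetical and not part of this repository; the sketch assumes only the file names given above and that each file holds valid JSON.

```python
import json
from pathlib import Path


def load_pr_folder(folder):
    """Read pr.json, commits.json and comments.json from one PR folder.

    `load_pr_folder` is a hypothetical helper for illustration only;
    it is not part of the scraper itself.
    """
    folder = Path(folder)
    data = {}
    for name in ("pr", "commits", "comments"):
        # Each file is assumed to contain JSON in GitHub's original format.
        with open(folder / f"{name}.json", encoding="utf-8") as f:
            data[name] = json.load(f)
    return data
```

For example, `load_pr_folder("output/<pr number>")["comments"]` would return the aggregated comment list for that PR as Python objects.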

Limitations

Right now this doesn't properly handle open PRs (or any PR that is expected to be updated): once the done sentinel is created, the data for that PR is never refreshed. This could be fixed by comparing GitHub's timestamps for the PR against the done sentinel and overwriting the folder when the PR is newer.
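
The timestamp comparison suggested above could look something like the following Python sketch. It is not implemented in this repository. It assumes that the done file holds a single ISO 8601 datetime (its actual format is not documented here) and that the caller passes in the `updated_at` value of a freshly fetched pull request object. The names `parse_timestamp` and `needs_refresh` are hypothetical.

```python
from datetime import datetime, timezone
from pathlib import Path


def parse_timestamp(text):
    """Parse an ISO 8601 timestamp, treating a missing UTC offset as UTC."""
    # GitHub timestamps end in "Z"; rewrite it so older Pythons can parse it.
    parsed = datetime.fromisoformat(text.strip().replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def needs_refresh(folder, remote_updated_at):
    """Return True if the PR folder should be scraped (again).

    Hypothetical helper: assumes the done sentinel contains one
    ISO 8601 datetime, which may not match the scraper's real format.
    """
    done = Path(folder) / "done"
    if not done.exists():
        # Never scraped to completion, so there is nothing to compare against.
        return True
    scraped_at = parse_timestamp(done.read_text(encoding="utf-8"))
    return parse_timestamp(remote_updated_at) > scraped_at
```

A scraper loop could call `needs_refresh` before skipping a folder and overwrite its contents whenever the function returns True.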
