ProPublica's collaborative tip-gathering framework. Import and manage CSV, Google Sheets and Screendoor data with ease.

Overview

Collaborate

ProPublica Google News Initiative

This is a web application for managing and building stories based on tips solicited from the public. This project is meant to be easy to setup for non-programmer, intuitive to use and highly extendable.

Here are a few use cases:

  • Collection of data from various sources (Google Form via Google Sheets, Screendoor, Private Google Spreadsheets)
  • An easy to setup data entry system
  • Organizing data from multiple sources and allowing many users to view and annotate it

The project is broken up into several components:

  • A system for transforming CSV files into managed database records
  • A default and automatic Django admin panel built for rapid and easy editing, managing and browsing of data
  • Customizable fields for tagging, querying, annotating and tracking tips

This is a project of ProPublica, supported by the Google News Initiative.

Documentation

We have a GitBook with a full user guide that covers running Collaborate, importing and refining data, and setting up Google services. You can read the documentation here.

Deploy it

Collaborate has builtin support for one-click installs in both Google Cloud and Heroku. During the setup process for both deployments, make sure to fill in the email, username and password fields so you can log in.

Heroku

Deploy

The Heroku deploy button will create a small, "free-tier" Collaborate system. This consists of a small web server, a database which supports between 10k-10M records (depending on data size) and automatically configures scheduled data re-importing.

Google Cloud

Run on Google Cloud

The Google Cloud Run button launches Collaborate into the Google Cloud environment. This deploy requires you to setup a Google Project, enable Google Cloud billing and enable the Cloud Run API. Full set up instructions are here.

This deploy does not automatically configure scheduled re-importing, but you can add it via Cloud Scheduler by following these instructions.

Once you've deployed your Cloud Run instance, you can manage your running instance from the Google Developer's Console.

Getting Started (Local Testing/Development)

Getting the system set up and running locally begins with cloning this repository and installing the Python dependencies. Python 3.6 or 3.7 and Django 2.2 are assumed here.

# virtual environment is recommended
mkvirtualenv -p /path/to/python3.7 collaborative
# install python dependencies
pip install -r requirements.txt

Assuming everything worked, let's bootstrap and then start the local server:

# get the database ready
python manage.py migrate

# create a default admin account
python manage.py createsuperuser

# gather up django and collaborate assets
python manage.py collectstatic --noinput

# start the local application
python manage.py runserver

You can then access the application http://localhost:8000 and log in with the credentials you selected in the createsuperuser step (above). Logging in will bring you to a configuration wizard where you will import your first Google Sheet and import its contents.

Production Deploy (Nginx/Docker)

If you want to deploy this to a production environment, we've included configuration templates and scripts for Docker and Nginx.

A Collaborate Dockerfile (the same one used by the Google Cloud Run deploy) can be found here:

deploy/google-cloud/Dockerfile

This creates a basic production environment with nginx and gunicorn. By default, it uses SQLite3, but you can configure the database by adding a DATABASE_URL environment variable. You can read more about the format for this variable here.

We also included a configuration script for plain Nginx deploys here:

deploy/google-cloud/django_nginx.conf

This can be copied to your main Nginx sites configuration directory (e.g., /etc/nginx/sites-available/).

In order to get auto-updating data sources, make sure to add a cron job that runs the following manage.py command:

manage.py refresh_data_sources

There's an example cron file that, when added to your /etc/crontab, will update data every 15 minutes:

./deploy/cron/refresh_data_sources

Note that if you use the above example, you probably want to add logrotate for the logfile the above cron config adds. You can find the logrotate script here (add it to /etc/logrotate.d/refresh_data_sources):

./deploy/logrotate/refresh_data_sources
Owner
ProPublica
Journalism in the Public Interest
ProPublica
Fava - web interface for Beancount

Fava is a web interface for the double-entry bookkeeping software Beancount with a focus on features and usability. Check out the online demo and lear

1.5k Dec 30, 2022
ProPublica's collaborative tip-gathering framework. Import and manage CSV, Google Sheets and Screendoor data with ease.

Collaborate This is a web application for managing and building stories based on tips solicited from the public. This project is meant to be easy to s

ProPublica 86 Oct 18, 2022
Automatic music downloader for SABnzbd

Headphones Headphones is an automated music downloader for NZB and Torrent, written in Python. It supports SABnzbd, NZBget, Transmission, µTorrent, De

3.2k Dec 31, 2022
One webpage for every book ever published!

Open Library Open Library is an open, editable library catalog, building towards a web page for every book ever published. Are you looking to get star

Internet Archive 4k Jan 08, 2023
Plugin-based, unopinionated membership administration software

byro is a membership administration tool for small and medium sized clubs/NGOs/associations of all kinds, with a focus on the DACH region. While it is

123 Nov 16, 2022
Automatic Movie Downloading via NZBs & Torrents

CouchPotato CouchPotato (CP) is an automatic NZB and torrent downloader. You can keep a "movies I want"-list and it will search for NZBs/torrents of t

CouchPotato 3.9k Jan 04, 2023
cherrytree

CherryTree A hierarchical note taking application, featuring rich text and syntax highlighting, storing data in a single XML or SQLite file. The proje

Giuseppe Penone 2.7k Jan 08, 2023
WikidPad is a single user desktop wiki

What is WikidPad? WikidPad is a Wiki-like notebook for storing your thoughts, ideas, todo lists, contacts, or anything else you can think of to write

WikidPad 176 Dec 14, 2022
115原码播放服务Kodi插件

115proxy-for-kodi 115原码播放服务Kodi插件,需要kodi 18以上版本,需配合 https://github.com/feelfar/115-for-kodi 使用 安装 由于release包尚未释出,可直接下载源代码zip包安装。 20210202:由于正调试kodi19兼

92 Jan 01, 2023
A time tracking application

GTimeLog GTimeLog is a simple app for keeping track of time. Contents Installing Documentation Resources Credits Installing GTimeLog is packaged for D

GTimeLog developers 224 Nov 28, 2022
This is your launchpad that comes with a variety of applications waiting to run on your kubernetes cluster with a single click

This is your launchpad that comes with a variety of applications waiting to run on your kubernetes cluster with a single click.

M. Rehan 2 Jun 26, 2022
The official source code repository for the calibre ebook manager

calibre calibre is an e-book manager. It can view, convert, edit and catalog e-books in all of the major e-book formats. It can also talk to e-book re

Kovid Goyal 14.1k Dec 27, 2022
Invenio digital library framework

Invenio Framework v3 Open Source framework for large-scale digital repositories. Invenio Framework is like a Swiss Army knife of battle-tested, safe a

Invenio digital repository framework 562 Jan 07, 2023
SENAITE Meta Package

SENAITE LIMS Meta Installation Package What does SENAITE mean? SENAITE is a beautiful trigonal, oil-green to greenish black crystal, with almost the h

SENAITE 135 Dec 14, 2022
A Python library to manage ACBF ebooks.

libacbf A Python library to read and edit ACBF formatted comic book files and archives. XML Specifications here: https://acbf.fandom.com/wiki/Advanced

Grafcube 0 Nov 09, 2021
:mag: Ambar: Document Search Engine

🔍 Ambar: Document Search Engine Ambar is an open-source document search engine with automated crawling, OCR, tagging and instant full-text search. Am

RD17 1.9k Jan 09, 2023
A :baby: buddy to help caregivers track sleep, feedings, diaper changes, and tummy time to learn about and predict baby's needs without (as much) guess work.

Baby Buddy A buddy for babies! Helps caregivers track sleep, feedings, diaper changes, tummy time and more to learn about and predict baby's needs wit

Baby Buddy 1.5k Jan 02, 2023
A collection of self-contained and well-documented issues for newcomers to start contributing with

fedora-easyfix A collection of self-contained and well-documented issues for newcomers to start contributing with How to setup the local development e

Akashdeep Dhar 8 Oct 16, 2021
The open-source core of Pinry, a tiling image board system for people who want to save, tag, and share images, videos and webpages in an easy to skim through format.

The open-source core of Pinry, a tiling image board system for people who want to save, tag, and share images, videos and webpages in an easy to skim

Pinry 2.7k Jan 08, 2023
Automatic Video Library Manager for TV Shows. It watches for new episodes of your favorite shows, and when they are posted it does its magic.

Automatic Video Library Manager for TV Shows. It watches for new episodes of your favorite shows, and when they are posted it does its magic. Exclusiv

pyMedusa 1.5k Dec 30, 2022