Tools for collecting social media data around focal events

Overview

Social Media Focal Events

The focalevents codebase provides tools for organizing data collected around focal events on social media.

It is often difficult to organize data from multiple API queries. For example, we may collect tweets when a hashtag starts trending by using Twitter’s filter stream. Later, we may make a separate query to the search endpoint to backfill our stream with what we missed before we started it, or update it with tweets that occurred since we stopped it. We may also want to get reply threads, quote tweets, or user timelines based on the tweets we collected. All of these queries relate to a common focal event (the hashtag), but they require several separate calls to the API. These queries easily produce many disjoint files, which makes it hard to organize, merge, update, backfill, and preprocess the data quickly and reliably.

To address these issues, focalevents can be used to organize social media focal event data collected from Twitter’s v2 API using academic credentials and PostgreSQL. It is easy to do any of the following with the tools here:

  • Query Twitter’s full archive or filter stream for focal event data
  • Backfill and update those queries with additional data
  • Collect conversation threads and quote tweets of focal event tweets
  • Retrieve full user timelines for any user tweeting during a focal event

Each of these tasks is a single-line command, rather than the long multi-line script typically needed to read IDs, query the API, output data, and merge it with existing data. This allows researchers to design more complex studies of social media data and spend more time on data analysis than on data storage and maintenance.
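To give a sense of the bookkeeping involved when this is done by hand, here is a minimal Python sketch of merging a filter-stream batch with a search backfill. It is not part of focalevents and does not use its commands or its PostgreSQL storage. The function name `merge_tweets` and the sample data are hypothetical, and the sketch assumes each tweet is a dict with the `id` and `created_at` fields found in Twitter v2 payloads.

```python
def merge_tweets(*batches):
    """Merge batches of tweet dicts, keeping one copy per tweet ID."""
    merged = {}
    for batch in batches:
        for tweet in batch:
            # Later batches overwrite earlier ones, so a tweet collected
            # twice keeps its most recently supplied copy.
            merged[tweet["id"]] = tweet
    # ISO 8601 timestamps in the same format and time zone sort
    # chronologically when compared as strings.
    return sorted(merged.values(), key=lambda t: t["created_at"])


# Hypothetical sample data: the stream started late, and the backfill
# overlaps with it on tweet "3".
stream = [
    {"id": "3", "created_at": "2021-06-01T12:02:00.000Z", "text": "c"},
    {"id": "4", "created_at": "2021-06-01T12:03:00.000Z", "text": "d"},
]
backfill = [
    {"id": "1", "created_at": "2021-06-01T12:00:00.000Z", "text": "a"},
    {"id": "3", "created_at": "2021-06-01T12:02:00.000Z", "text": "c"},
]

merged = merge_tweets(stream, backfill)
print([t["id"] for t in merged])
```

Even this small case needs deduplication and ordering logic. Reply threads, quote tweets, and user timelines each add further passes of the same kind.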

Installation and Documentation

The repository's code can be downloaded directly from GitHub, or cloned using git:

git clone https://github.com/ryanjgallagher/focalevents

See the full documentation for more information about installing, configuring, and using the focalevents tools.

A Note

The code here is written and maintained by a single person. First and foremost, it has been designed to help them manage their own data and create replicable pipelines. They are sharing it in the hope that it may help others who have similar workflows and are interested in organizing their Twitter data according to focal events using PostgreSQL.

Requests for enhancements or additions to the code will likely be declined if the author does not anticipate using them in their own research. It is highly unlikely that the code will ever be adapted to work with databases other than PostgreSQL. Further, general problems with database setup or conflicts with pre-existing database structures are beyond the scope of this project and will not be addressed.

Owner
Ryan Gallagher
Network science PhD student merging networks and NLP for computational social science