Improve current data preprocessing for FTM's WOB data to analyze Shell and Dutch Governmental contacts.

Overview

data-preprocessing_toogoodtogo_threatlines

We're the hackathon leftovers, but we are Too Good To Go ;-). A repo by Lukas Schubotz, Stef van Buuren, and Raymon van Dinter. We aim to improve current data preprocessing for FTM's WOB data to analyze Shell and Dutch Governmental contacts.

Synchronous visualisation of email threads

Publications from the FTM "Dossier SHELL papers" https://www.ftm.nl/dossier/shell-papers suggest that timing of events is critical in the interactions between actors. It would therefore be useful if we could visualise the mail exchanges in time.

The idea is to visualise threads of mail exchanges between actors over time. When this is done for multiple threads, the display would give rapid insight into the structure and timing of exchanges between actors. For example, suppose we are able to construct a single thread from "RE:" and "FW:" mails in the data. A simple visualisation would be

See https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.88.9825&rep=rep1&type=pdf for variations on this display, for example by adding the interactions between the actors by fancy arcs and resorting the mails according to actor pairs.

A generalisation to multiple simulataneous threads would stack multiple lines, similar to a dot plot. Such a design calls for relatively simple thread displays that are synchronised in time. Therefore we will concentrate on using a simple thread line that plots mail chronology against calender time.

A somewhat grander idea would be to create a "film of events". The user would place a cursor on the time axis, and scroll through time. The new information per mail is displayed as the cursor passes the send time of the email.

Issues to resolve

We need complex/advanced text processing. Some of the issues include:

  1. How can we split multiple emails in a RE/FW into a set of elementary mails, each corresponding to just one sender?
  2. How well can we form threads by matching on subject lines?
  3. Do duplicates extracted from RE/FW serve any useful purpose?
  4. What is the percentage of threads for which we can find the parent mail (the mail that started the thread)?

Experiment 1

The first design plots all thread lines between 2016 and 2020 on one chart.

Experiment 2

The second design uses trelliscopejs to plot the same information in smaller pieces.

The user can switch between 27 panes, each containing about 20 threads.

Try out the interactive version

Experiment 3

Back to figure 1, but now plotted with rbokeh, so that we may zoom and use tooltips (interaction not supported by GitHub markdown)

Owner
ASReview hackathon for Follow the Money
ASReview hackathon for Follow the Money
Gives you more advanced math in python.

AdvancedPythonMath Gives you more advanced math in python. Functions .simplex(args: {number}) .circ(args: {raidus}) .pytha(args: {leg_a + leg_2}) .slo

Voidy Devleoper 1 Dec 25, 2021
A program for calculating the divisor function

DivisorsFunctionCalculator A program for calculating the divisor function A script to find the "Sigma" (divisors function) of any number. To find the

1 Oct 31, 2021
Korg Volca Sample uploader for linux.

GnuVolca Korg Volca Sample uploader for linux. GnuVolca Usage Installation Via virtualenv Usage Store all the samples you want to upload on an empty d

Gonzalo Rafuls 12 Oct 11, 2022
This is an API to get user details for competitive coding platforms - Codeforces, Codechef, SPOJ, Interviewbit. More Platform will be Added Soon.

Competitive-Programming-Score-API An API to get user details for competitive coding platforms - Codeforces, Codechef, SPOJ, Interviewbit Platforms Ava

Aaditya Prakash 3 Jan 17, 2022
Never miss a deadline again

Hack the Opportunities Never miss a deadline again! Link to the excel sheet Contribution This list is not complete and I alone cannot make it whole. T

Vibali Joshi 391 Dec 28, 2022
A minimal configuration for a dockerized kafka project.

Docker Kafka Quickstart A minimal configuration for a dockerized kafka project. Usage: Run this command to build kafka and zookeeper containers, and c

Nouamane Tazi 5 Jan 12, 2022
This repo holds custom callback plugin, so your Ansible could write everything in the PostgreSQL database.

English What is it? This is callback plugin that dumps most of the Ansible internal state to the external PostgreSQL database. What is this for? If yo

Sergey Pechenko 19 Oct 21, 2022
A dead-simple service that notifies you when something goes down.

Totmannschalter Totmannschalter (German for dead man's switch) is a simple service that notifies you when it has not received any message from a servi

1 Dec 20, 2021
A simple and efficient computing package for Genshin Impact gacha analysis

GGanalysisLite计算包 这个版本的计算包追求计算速度,而GGanalysis包有着更多计算功能。 GGanalysisLite包通过卷积计算分布列,通过FFT和快速幂加速卷积计算。 测试玩家得到的排名值rank的数学意义是:与抽了同样数量五星的其他玩家相比,测试玩家花费的抽数大于等于比例

一棵平衡树 34 Nov 26, 2022
Programmatic interface to Synapse services for Python

A Python client for Sage Bionetworks' Synapse, a collaborative, open-source research platform that allows teams to share data, track analyses, and collaborate

Sage Bionetworks 54 Dec 23, 2022
Notebooks for computing approximations to the prime counting function using Riemann's formula.

Notebooks for computing approximations to the prime counting function using Riemann's formula.

Tom White 2 Aug 02, 2022
Bible-App : Simple Tool To Show Bible Books

Bible App Simple Tool To Show Bible Books Socials: Language:

ميخائيل 5 Jan 18, 2022
"Hacking" the (Telekom) Zyxel GPON SFP module (PMG3000-D20B)

"Hacking" the (Telekom) Zyxel GPON SFP module (PMG3000-D20B) The SFP can be sour

Matthias Riegler 52 Jan 03, 2023
You'll learn about Iterators, Generators, Closure, Decorators, Property, and RegEx in detail with examples.

07_Python_Advanced_Topics Introduction 👋 In this tutorial, you will learn about: Python Iterators: They are objects that can be iterated upon. In thi

Milaan Parmar / Милан пармар / _米兰 帕尔马 252 Dec 23, 2022
Excel cell checker with python

excel-cell-checker Description This tool checks a given .xlsx file has the struc

Paul Aumann 1 Jan 04, 2022
Google Scholar App Using Python

Google Scholar App Watch the tutorial video How to build a Google Scholar App | Streamlit #30 Demo Launch the web app: Reproducing this web app To rec

Chanin Nantasenamat 4 Jun 05, 2022
GWCelery is a simple and reliable package for annotating and orchestrating LIGO/Virgo alerts

GWCelery is a simple and reliable package for annotating and orchestrating LIGO/Virgo alerts, built from widely used open source components.

Min-A Cho Zeno 1 Nov 02, 2021
Python-Course-V1 - This Repo contains a series of Python Jupyter Notebooks and assignments

This Repo contains a series of Python Jupyter Notebooks and assignments. The assignments are taken from Python Crash Course book by Eric Matthes.

2 Nov 15, 2022
A fancy and practical functional tools

Funcy A collection of fancy functional tools focused on practicality. Inspired by clojure, underscore and my own abstractions. Keep reading to get an

Alexander Schepanovski 2.9k Dec 29, 2022
Imports an object based on a string import_string('package.module:function_name')() - Based on werkzeug.utils

DEPRECATED don't use it. Please do: import importlib foopath = 'src.apis.foo.Foo' module_name = '.'.join(foopath.split('.')[:-1]) # to get src.apis.f

Bruno Rocha Archived Projects 11 Nov 12, 2022