Improve current data preprocessing for FTM's WOB data to analyze Shell and Dutch Governmental contacts.

Overview

data-preprocessing_toogoodtogo_threatlines

We're the hackathon leftovers, but we are Too Good To Go ;-). A repo by Lukas Schubotz, Stef van Buuren, and Raymon van Dinter. We aim to improve current data preprocessing for FTM's WOB data to analyze Shell and Dutch Governmental contacts.

Synchronous visualisation of email threads

Publications from the FTM "Dossier SHELL papers" https://www.ftm.nl/dossier/shell-papers suggest that timing of events is critical in the interactions between actors. It would therefore be useful if we could visualise the mail exchanges in time.

The idea is to visualise threads of mail exchanges between actors over time. When this is done for multiple threads, the display would give rapid insight into the structure and timing of exchanges between actors. For example, suppose we are able to construct a single thread from "RE:" and "FW:" mails in the data. A simple visualisation would be

See https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.88.9825&rep=rep1&type=pdf for variations on this display, for example by adding the interactions between the actors by fancy arcs and resorting the mails according to actor pairs.

A generalisation to multiple simulataneous threads would stack multiple lines, similar to a dot plot. Such a design calls for relatively simple thread displays that are synchronised in time. Therefore we will concentrate on using a simple thread line that plots mail chronology against calender time.

A somewhat grander idea would be to create a "film of events". The user would place a cursor on the time axis, and scroll through time. The new information per mail is displayed as the cursor passes the send time of the email.

Issues to resolve

We need complex/advanced text processing. Some of the issues include:

  1. How can we split multiple emails in a RE/FW into a set of elementary mails, each corresponding to just one sender?
  2. How well can we form threads by matching on subject lines?
  3. Do duplicates extracted from RE/FW serve any useful purpose?
  4. What is the percentage of threads for which we can find the parent mail (the mail that started the thread)?

Experiment 1

The first design plots all thread lines between 2016 and 2020 on one chart.

Experiment 2

The second design uses trelliscopejs to plot the same information in smaller pieces.

The user can switch between 27 panes, each containing about 20 threads.

Try out the interactive version

Experiment 3

Back to figure 1, but now plotted with rbokeh, so that we may zoom and use tooltips (interaction not supported by GitHub markdown)

Owner
ASReview hackathon for Follow the Money
ASReview hackathon for Follow the Money
Wagtail + Lottie is a Wagtail package for playing Adobe After Effects animations exported as json with Bodymovin.

Wagtail Lottie Wagtail + Lottie is a Wagtail package for playing Adobe After Effects animations exported as json with Bodymovin. Usage Export your ani

Alexis Le Baron 7 Aug 18, 2022
A reproduction repo for a Scheduling bug in AirFlow 2.2.3

A reproduction repo for a Scheduling bug in AirFlow 2.2.3

Ilya Strelnikov 1 Feb 09, 2022
On this repo, you'll find every codes I made during my NSI classes (informatical courses)

👨‍💻 👩‍💻 school-codes On this repo, you'll find every codes I made during my NSI classes (informatical courses) French for now since this repo is d

EDM 1.15 3 Dec 17, 2022
A webdav demo using a virtual filesystem that serves a random status of whether a cat in a box is dead or alive.

A webdav demo using a virtual filesystem that serves a random status of whether a cat in a box is dead or alive.

Marshall Conover 2 Jan 12, 2022
Cairo hooks for pre-commit

pre-commit-cairo Cairo hooks for pre-commit. See pre-commit for more details Using pre-commit-cairo with pre-commit Add this to your .pre-commit-confi

Fran Algaba 16 Sep 21, 2022
Utility functions for working with data from Nix in Python

Pynixutil - Utility functions for working with data from Nix in Python Examples Base32 encoding/decoding import pynixutil input = "v5sv61sszx301i0x6x

Tweag 11 Dec 16, 2022
ChronoRace is a tool to accurately perform timed race conditions to circumvent application business logic.

ChronoRace is a tool to accurately perform timed race conditions to circumvent application business logic. I've found in my research that w

Tanner 64 Aug 04, 2022
Goal: Enable awesome tooling for Bazel users of the C language family.

Hedron's Compile Commands Extractor for Bazel — User Interface What is this project trying to do for me? First, provide Bazel users cross-platform aut

Hedron Vision 290 Dec 26, 2022
Just some information about this nerd.

Greetings, mates, I am ErrorDIM - aka ErrorDimension 👋 🧬 Programming Languages I Can Use: 🥇 Top Starred Repositories: # Name Stars Size Major Langu

ErrorDIM 3 Jan 11, 2022
Here is my Senior Design Project that I implemented to graduate from Computer Engineering.

Here is my Senior Design Project that I implemented to graduate from Computer Engineering. It is a chatbot made in RASA and helps the user to plan their vacation in the Turkish language. In order to

Ezgi Subaşı 25 May 31, 2022
Code repo for the book "Feature Engineering for Machine Learning," by Alice Zheng and Amanda Casari, O'Reilly 2018

feature-engineering-book This repo accompanies "Feature Engineering for Machine Learning," by Alice Zheng and Amanda Casari. O'Reilly, 2018. The repo

Alice Zheng 1.3k Dec 30, 2022
This is a Python program I wrote to simulate the solar system with 79 lines of code.

Solar System With Python This is a Python program I wrote to simulate the solar system with 79 lines of code. Required modules tkinter, math, time Why

Mehmet Aydoğmuş 1 Oct 26, 2021
Python version of RocketLeague-Dropshot-Calculated-shot

Python version of RocketLeague-Dropshot-Calculated-shot. This is just to demo around and a tool I used to develop the actual plugin.

JareBear 1 Jan 14, 2022
Dot Browser is a privacy-conscious web browser with smarts built-in for protection against trackers and advertisments online.

🌍 Take back your privacy with Dot Browser, the privacy-conscious web browser that protects you from being tracked and monitored online.

Dot HQ 1k Jan 07, 2023
Really bad lisp implementation. Fun with pattern matching.

Lisp-py This is a horrible, ugly interpreter for a trivial lisp. Don't use it. It was written as an excuse to mess around with the new pattern matchin

Erik Derohanian 1 Nov 23, 2021
NUM Alert - A work focus aid created for the Hack the Job hackathon

Contributors: Uladzislau Kaparykha, Amanda Hahn, Nicholas Waller Hackathon Team Name: N.U.M General Purpose: The general purpose of this program is to

Amanda Hahn 1 Jan 10, 2022
Pardus-flatpak-gui - A Flatpak GUI for Pardus

Pardus Flatpak GUI A GUI for Flatpak. You can run, install (from FlatHub and fro

Erdem Ersoy 2 Feb 17, 2022
A simple calculator that can add, subtract, multiply or divide depending upon the input from the user

Calculator A simple calculator that can add, subtract, multiply or divide depending upon the input from the user. In this example, we should have the

Jayesh Mali 1 Dec 27, 2021
Convex Optimisation MVA course - Assignment

Convex Optimisation MVA course - Assignment This repository contains the coding files of the third assignment in the MVA Convex Optimisation course. U

1 Nov 27, 2021
A Trace Explorer for Reverse Engineers

Tenet - A Trace Explorer for Reverse Engineers Overview Tenet is an IDA Pro plugin for exploring execution traces. The goal of this plugin is to provi

1k Jan 02, 2023