Improve current data preprocessing for FTM's WOB data to analyze Shell and Dutch Governmental contacts.

Overview

data-preprocessing_toogoodtogo_threatlines

We're the hackathon leftovers, but we are Too Good To Go ;-). A repo by Lukas Schubotz, Stef van Buuren, and Raymon van Dinter. We aim to improve current data preprocessing for FTM's WOB data to analyze Shell and Dutch Governmental contacts.

Synchronous visualisation of email threads

Publications from the FTM "Dossier SHELL papers" https://www.ftm.nl/dossier/shell-papers suggest that timing of events is critical in the interactions between actors. It would therefore be useful if we could visualise the mail exchanges in time.

The idea is to visualise threads of mail exchanges between actors over time. When this is done for multiple threads, the display would give rapid insight into the structure and timing of exchanges between actors. For example, suppose we are able to construct a single thread from "RE:" and "FW:" mails in the data. A simple visualisation would be

See https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.88.9825&rep=rep1&type=pdf for variations on this display, for example by adding the interactions between the actors by fancy arcs and resorting the mails according to actor pairs.

A generalisation to multiple simulataneous threads would stack multiple lines, similar to a dot plot. Such a design calls for relatively simple thread displays that are synchronised in time. Therefore we will concentrate on using a simple thread line that plots mail chronology against calender time.

A somewhat grander idea would be to create a "film of events". The user would place a cursor on the time axis, and scroll through time. The new information per mail is displayed as the cursor passes the send time of the email.

Issues to resolve

We need complex/advanced text processing. Some of the issues include:

  1. How can we split multiple emails in a RE/FW into a set of elementary mails, each corresponding to just one sender?
  2. How well can we form threads by matching on subject lines?
  3. Do duplicates extracted from RE/FW serve any useful purpose?
  4. What is the percentage of threads for which we can find the parent mail (the mail that started the thread)?

Experiment 1

The first design plots all thread lines between 2016 and 2020 on one chart.

Experiment 2

The second design uses trelliscopejs to plot the same information in smaller pieces.

The user can switch between 27 panes, each containing about 20 threads.

Try out the interactive version

Experiment 3

Back to figure 1, but now plotted with rbokeh, so that we may zoom and use tooltips (interaction not supported by GitHub markdown)

Owner
ASReview hackathon for Follow the Money
ASReview hackathon for Follow the Money
Bad Apple printed out on the console with Python!

bad-apple Bad Apple printed out on the console with Python! Preface A word of disclaimer, while the final code is somewhat original, this project is a

CalvinLoke 186 Dec 01, 2022
Perform oocyst segmentation in mercurochrome stained mosquito midgut

Midgut_oocyst_segmentation Perform oocyst segmentation in mercurochrome stained mosquito midguts This oocyst segmentation model also powers the webtoo

Duo Peng 3 Oct 27, 2021
Python library and cli util for https://www.zerochan.net/

Zerochan Library for Zerochan.net with pics parsing and downloader included! Features CLI utility for pics downloading from zerochan.net Library for c

kiriharu 10 Oct 11, 2022
A basic animation modding workflow for FFXIV

AnimAssist Provides a quick and easy way to mod animations in FFXIV. You will need: Before anything, the VC++2012 32-bit Redist from here. Havok will

liam 37 Dec 16, 2022
Run Python code right in your Telegram messages

Run Python code right in your Telegram messages Made with Telethon library, TGPy is a tool for evaluating expressions and Telegram API scripts. Instal

29 Nov 22, 2022
An advanced pencil sketch generator

Pencilate An advanced pencil sketch generator About : An advanced pencil sketch maker made in just 12 lines of code. Yes you read it right, JUST 12 LI

MAINAK CHAUDHURI 23 Dec 17, 2022
Github dorking tool

gh-dork Supply a list of dorks and, optionally, one of the following: a user (-u) a file with a list of users (-uf) an organization (-org) a file with

Molly White 119 Dec 21, 2022
Tindicators is a Python library to calculate the values of various technical indicators

Tindicators is a Python library to calculate the values of various technical indicators

omar 3 Mar 03, 2022
Fluxos de captura e subida de dados no datalake da Prefeitura do Rio de Janeiro.

Pipelines Este repositório contém fluxos de captura e subida de dados no datalake da Prefeitura do Rio de Janeiro. O repositório é gerido pelo Escritó

Prefeitura do Rio de Janeiro 19 Dec 15, 2022
Find all social media accounts with a username!

Aliens_eye FIND ALL SOCIAL MEDIA ACCOUNTS WITH A USERNAME! OSINT To install: Open terminal and type: git clone https://github.com/BLINKING-IDIOT/Alien

Aaron Thomas 84 Dec 28, 2022
Tools Elit Adalah Sebuah Script Crack Yang Wajib Tap Yes...

Tools Elit Adalah Sebuah Script Crack Yang Wajib Tap Yes...

Risky [ Zero Tow ] 10 Apr 07, 2022
This is a multi-app executor that it used when we have some different task in a our applications and want to run them at the same time

This is a multi-app executor that it used when we have some different task in a our applications and want to run them at the same time. It uses SQLAlchemy for ORM and Alembic for database migrations.

Majid Iranpour 5 Apr 16, 2022
Python Freecell Solver

freecell Python Freecell Solver Very early version right now. You can pick a board by changing the file path in freecell.py If you want to play a game

Ben Kaufman 1 Nov 26, 2021
Uma moeda simples e segura!

SecCoin - Documentação A SecCoin foi criada com intuito de ser uma moeda segura, de fácil investimento e mineração. A Criptomoeda está na sua primeira

Sec-Coin Team 5 Dec 09, 2022
Introduction to Databases Coursework 2 (SQL) - dataset generator

Introduction to Databases Coursework 2 (SQL) - dataset generator This is python script generates a text file with insert queries for the schema.sql fi

Javier Bosch 1 Nov 08, 2021
Distribute PySPI jobs across a PBS cluster

Distribute PySPI jobs across a PBS cluster This repository contains scripts for distributing PySPI jobs across a PBS-type cluster. Each job will conta

Oliver Cliff 1 Feb 10, 2022
An implementation of Ray Tracing in One Weekend using Taichi

又一个Taichi语言的Ray Tracer 背景简介 这个Ray Tracer基本上是照搬了Peter Shirley的第一本小书Ray Tracing in One Weekend,在我写的时候参考的是Version 3.2.3这个版本。应该比其他中文博客删改了不少内容。果然Peter Shir

张皓 30 Nov 21, 2022
Personal Chat Assistance

Python-Programming Personal Chat Assistance {% import "bootstrap/wtf.html" as wtf %} titleEVT/title script src="https://code.jquery.com/jquery-3.

PRASH_SMVIT 2 Nov 14, 2021
Lenovo Yoga Ideapad Autocharge

Description This program uses the conservation_mode of Lonovo Ideapad / Yoga not

1 Jan 09, 2022
fast_bss_eval is a fast implementation of the bss_eval metrics for the evaluation of blind source separation.

fast_bss_eval Do you have a zillion BSS audio files to process and it is taking days ? Is your simulation never ending ? Fear no more! fast_bss_eval i

Robin Scheibler 99 Dec 13, 2022