This repository contains the implementation of the HealthGen model, a generative model to synthesize realistic EHR time series data with missingness

Overview

HealthGen: Conditional EHR Time Series Generation

Code Coverage

This repository contains the implementation of the HealthGen model, a generative model to synthesize realistic EHR time series data with missingness.

Installation

  1. Clone the repo with: git clone --recurse-submodules [email protected]:simonbing/HealthGen.git.

  2. Navigate to the /healthgen directory and install the dependencies by running: pip install requirements.txt.

  3. Add the HealthGen module to your PYTHONPATH by running export PYTHONPATH=$PYTHONPATH:/path/to/HealthGen/healthgen.

  4. Optionally, setup wandb, a useful tool for experiment tracking, which is integrated into our pipeline. After setting up a free account, add your credentials and the desired project name for the placeholders wandb_user and wandb_project in the code.

Data Access

We utilize the MIMIC-III data set for the training and evaluation of our generative model, which is publicly available to credentialed users.

To extract an intermediate representation of the EHR time series data, we utilize a slightly modified version of MIMIC-Extract, which is automatically cloned if you followed the instructions for installation. To extract the intermediate tables of the data required for our pipeline, follow the steps 1-4 in the instructions of MIMIC-Extract. In addition to the standard flags, you can set the sampling frequency (e.g. to 15 minutes) by calling: python mimic_direct_extract.py --time_step 15 ...

After the extraction has finished (extracting all patients can take several hours on a machine with around 50 GB of memory), you should obtain four tables with the extracted patient data. This is the input data for our experimental pipeline.

Use

The main components of the pipeline can be run independently: data querying and processing from the database, training a generative model, and evaluation.

To run the entire experimental pipeline, i.e. extract the time series from the intermediate tables, train a generative model and run the resulting evaluation, run:

main.py 
--input_vitals /path/to/vitals/table 
--input_outcomes /path/to/outcomes/table
--input_static /path/to/static/table
--gen_model healthgen
--evaluation grud
--out_path /path/to/save/results

For more information on all available flags, run main.py --helpfull, and see the comments in the code for additional information.

License

MIT License

Authors

Simon Bing, Andrea Dittadi, Stefan Bauer, Patrick Schwab

DuBE: Duple-balanced Ensemble Learning from Skewed Data

DuBE: Duple-balanced Ensemble Learning from Skewed Data "Towards Inter-class and Intra-class Imbalance in Class-imbalanced Learning" (IEEE ICDE 2022 S

6 Nov 12, 2022
Pytorch Implementation of Various Point Transformers

Pytorch Implementation of Various Point Transformers Recently, various methods applied transformers to point clouds: PCT: Point Cloud Transformer (Men

Neil You 434 Dec 30, 2022
4th place solution to datafactory challenge by Intermarché.

Solution to Datafactory challenge by Intermarché. 4th place solution to datafactory challenge by Intermarché. The objective of the challenge is to pre

Raphael Sourty 11 Mar 19, 2022
GCC: Graph Contrastive Coding for Graph Neural Network Pre-Training @ KDD 2020

GCC: Graph Contrastive Coding for Graph Neural Network Pre-Training Original implementation for paper GCC: Graph Contrastive Coding for Graph Neural N

THUDM 274 Dec 27, 2022
This is a Pytorch implementation of the paper: Self-Supervised Graph Transformer on Large-Scale Molecular Data.

This is a Pytorch implementation of the paper: Self-Supervised Graph Transformer on Large-Scale Molecular Data.

212 Dec 25, 2022
Some bravo or inspiring research works on the topic of curriculum learning.

Towards Scalable Unpaired Virtual Try-On via Patch-Routed Spatially-Adaptive GAN Official code for NeurIPS 2021 paper "Towards Scalable Unpaired Virtu

131 Jan 07, 2023
Azua - build AI algorithms to aid efficient decision-making with minimum data requirements.

Project Azua 0. Overview Many modern AI algorithms are known to be data-hungry, whereas human decision-making is much more efficient. The human can re

Microsoft 197 Jan 06, 2023
Devkit for 3D -- Some utils for 3D object detection based on Numpy and Pytorch

D3D Devkit for 3D: Some utils for 3D object detection and tracking based on Numpy and Pytorch Please consider siting my work if you find this library

Jacob Zhong 27 Jul 07, 2022
MINIROCKET: A Very Fast (Almost) Deterministic Transform for Time Series Classification

MINIROCKET: A Very Fast (Almost) Deterministic Transform for Time Series Classification

187 Dec 26, 2022
Non-Homogeneous Poisson Process Intensity Modeling and Estimation using Measure Transport

Non-Homogeneous Poisson Process Intensity Modeling and Estimation using Measure Transport This GitHub page provides code for reproducing the results i

Andrew Zammit Mangion 1 Nov 08, 2021
PyTorch implementations of algorithms for density estimation

pytorch-flows A PyTorch implementations of Masked Autoregressive Flow and some other invertible transformations from Glow: Generative Flow with Invert

Ilya Kostrikov 546 Dec 05, 2022
Pytorch Implementation of Auto-Compressing Subset Pruning for Semantic Image Segmentation

Pytorch Implementation of Auto-Compressing Subset Pruning for Semantic Image Segmentation Introduction ACoSP is an online pruning algorithm that compr

Merantix 8 Dec 07, 2022
Prototypical Networks for Few shot Learning in PyTorch

Prototypical Networks for Few shot Learning in PyTorch Simple alternative Implementation of Prototypical Networks for Few Shot Learning (paper, code)

Orobix 835 Jan 08, 2023
Modified prey-predator system - Modified prey–predator model describes the rate of change for each species by adding coupling terms.

Modified prey-predator system We aim to study the behaviors of the modified prey–predator model and establish the effects of several parameters that p

Seoyoung Oh 1 Jan 02, 2022
This repository for project that can Automate Number Plate Recognition (ANPR) in Morocco Licensed Vehicles. 💻 + 🚙 + 🇲🇦 = 🤖 🕵🏻‍♂️

MoroccoAI Data Challenge (Edition #001) This Reposotory is result of our work in the comepetiton organized by MoroccoAI in the context of the first Mo

SAFOINE EL KHABICH 14 Oct 31, 2022
Ian Covert 130 Jan 01, 2023
这个开源项目主要是对经典的时间序列预测算法论文进行复现,模型主要参考自GluonTS,框架主要参考自Informer

Time Series Research with Torch 这个开源项目主要是对经典的时间序列预测算法论文进行复现,模型主要参考自GluonTS,框架主要参考自Informer。 建立原因 相较于mxnet和TF,Torch框架中的神经网络层需要提前指定输入维度: # 建立线性层 TensorF

Chi Zhang 85 Dec 29, 2022
DR-GAN: Automatic Radial Distortion Rectification Using Conditional GAN in Real-Time

DR-GAN: Automatic Radial Distortion Rectification Using Conditional GAN in Real-Time Introduction This is official implementation for DR-GAN (IEEE TCS

Kang Liao 18 Dec 23, 2022
Structured Data Gradient Pruning (SDGP)

Structured Data Gradient Pruning (SDGP) Weight pruning is a technique to make Deep Neural Network (DNN) inference more computationally efficient by re

Bradley McDanel 10 Nov 11, 2022
Easy to use Python camera interface for NVIDIA Jetson

JetCam JetCam is an easy to use Python camera interface for NVIDIA Jetson. Works with various USB and CSI cameras using Jetson's Accelerated GStreamer

NVIDIA AI IOT 358 Jan 02, 2023