Continual Learning of Electronic Health Records (EHR).

Overview

arXiv License: MIT

Continual Learning of Longitudinal Health Records

Repo for reproducing the experiments in Continual Learning of Longitudinal Health Records (2021). Release v0.1 of the project corresponds to published results.

Experiments evaluate various continual learning strategies on standard ICU predictive tasks exhibiting covariate shift. Task outcomes are binary, and input data are multi-modal time-series from patient ICU admissions.

Setup

  1. Clone this repo to your local machine.
  2. Request access to MIMIC-III and eICU-CRD.1
  3. Download the preprocessed datasets to the /data subfolder.
  4. (Recommended) Create and activate a new virtual environment:
    python3 -m venv .venv --upgrade-deps
  5. Install dependencies:
    pip install -U wheel buildtools
    pip install -r requirements.txt

Results

To reproduce main results:

python3 main.py --train

Figures will be saved to /results/figs. Instructions to reproduce supplementary experiments can be found here. Bespoke experiments can be specified with appropriate flags e.g:

python3 main.py --domain_shift hospital --outcome mortality_48h --models CNN --strategies EWC Replay --validate --train

A complete list of available options can be found here or with python3 main.py --help.

Citation

If you use any of this code in your work, please reference us:

@misc{armstrong2021continual,
      title={Continual learning of longitudinal health records}, 
      author={J. Armstrong and D. Clifton},
      year={2021},
      eprint={2112.11944},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

Python versions

Notes

Note that Temporal Domain Incremental learning experiments require linkage with original MIMIC-III dataset. Requires downloading ADMISSIONS.csv from MIMIC-III to the /data/mimic3/ folder.

Stack

For standardisation of ICU predictive task definitions, feature pre-processing, and Continual Learning method implementations, we use the following tools:

Tool Source
ICU Data MIMIC-III
eICU-CRD
Data preprocessing / task definition FIDDLE
Continual Learning strategies Avalanche
Comments
  • Change experience to class balanced replay

    Change experience to class balanced replay

    Have manually edited the replay definition for now. Will need to update avalanche and do change based on training.storage_policy.

    May also need to change memory buffer to n_tasks * buffer (since GEM etc use this number for experience-wise buffer sizes).

    opened by iacobo 1
  • Bump numpy from 1.20.3 to 1.22.0

    Bump numpy from 1.20.3 to 1.22.0

    Bumps numpy from 1.20.3 to 1.22.0.

    Release notes

    Sourced from numpy's releases.

    v1.22.0

    NumPy 1.22.0 Release Notes

    NumPy 1.22.0 is a big release featuring the work of 153 contributors spread over 609 pull requests. There have been many improvements, highlights are:

    • Annotations of the main namespace are essentially complete. Upstream is a moving target, so there will likely be further improvements, but the major work is done. This is probably the most user visible enhancement in this release.
    • A preliminary version of the proposed Array-API is provided. This is a step in creating a standard collection of functions that can be used across application such as CuPy and JAX.
    • NumPy now has a DLPack backend. DLPack provides a common interchange format for array (tensor) data.
    • New methods for quantile, percentile, and related functions. The new methods provide a complete set of the methods commonly found in the literature.
    • A new configurable allocator for use by downstream projects.

    These are in addition to the ongoing work to provide SIMD support for commonly used functions, improvements to F2PY, and better documentation.

    The Python versions supported in this release are 3.8-3.10, Python 3.7 has been dropped. Note that 32 bit wheels are only provided for Python 3.8 and 3.9 on Windows, all other wheels are 64 bits on account of Ubuntu, Fedora, and other Linux distributions dropping 32 bit support. All 64 bit wheels are also linked with 64 bit integer OpenBLAS, which should fix the occasional problems encountered by folks using truly huge arrays.

    Expired deprecations

    Deprecated numeric style dtype strings have been removed

    Using the strings "Bytes0", "Datetime64", "Str0", "Uint32", and "Uint64" as a dtype will now raise a TypeError.

    (gh-19539)

    Expired deprecations for loads, ndfromtxt, and mafromtxt in npyio

    numpy.loads was deprecated in v1.15, with the recommendation that users use pickle.loads instead. ndfromtxt and mafromtxt were both deprecated in v1.17 - users should use numpy.genfromtxt instead with the appropriate value for the usemask parameter.

    (gh-19615)

    ... (truncated)

    Commits

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies 
    opened by dependabot[bot] 0
  • Add Naive with no regularization?

    Add Naive with no regularization?

    Maybe add naive with no regularization? I.e. no dropout etc, to enable clearer ablation testing of naive fine tuning and inherent regularization mechanisms vs explicit CL strategy.

    opened by iacobo 0
  • CNN fails with kernel_size 5 or 7

    CNN fails with kernel_size 5 or 7

    Getting the following error (on GPU) with CNN runs with kernel_size in [5,7]:

    RuntimeError: CUDA error: CUBLAS_STATUS_INVALID_VALUE when calling `cublasSgemm( handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc)`
    

    https://stackoverflow.com/questions/66600362/runtimeerror-cuda-error-cublas-status-execution-failed-when-calling-cublassge?answertab=votes#tab-top

    opened by iacobo 0
  • Add early stopping to avoid over-large number of epochs for diff models

    Add early stopping to avoid over-large number of epochs for diff models

    MLP / LSTM take shorter time to train than CNN / Transformer. Add early stopping to avoid overtraining, saturating.

    Change strategy to base strategy inheriting from strat and earlystopping plugin.

    opened by iacobo 0
  • Correct code for ROC AUC and AUPRC

    Correct code for ROC AUC and AUPRC

    Cannot average metrics over minibatches as is done for other metrics, since they depend on threshold. Need to calculate over all. Check e.g. MeanScore for inspiration on metric definition.

    opened by iacobo 0
  • Need to add code for further experiments

    Need to add code for further experiments

    plotting.plot_demographics()
    
    # Secondary experiments:
    ########################
    # Sensitivity to sequence length (4hr vs 12hr)
    # Sensitivity to replay size Naive -> replay -> Cumulative
    # Sensitivity to hyperparams of reg methods (Tune hyperparams over increasing number of tasks?)
    # Sensitivity to number of variables (full vs Vitals only e.g.)
    # Sensitivity to size of domains - e.g. white ethnicity much larger than all other groups, affect of order of sequence
    
    opened by iacobo 1
  • Ray Tune warnings

    Ray Tune warnings

    Ray Tune produces the following warnings:

    INFO registry.py:66 -- Detected unknown callable for trainable. Converting to class.
    WARNING experiment.py:295 -- No name detected on trainable. Using DEFAULT.
    

    Non-fatal, but it's annoying to have these messages bloating the console output.

    raytune 
    opened by iacobo 2
Releases(v0.1)
Owner
Jacob
Data Scientist @publichealthengland
Jacob
Python Multi-Agent Reinforcement Learning framework

- Please pay attention to the version of SC2 you are using for your experiments. - Performance is *not* always comparable between versions. - The re

whirl 1.3k Jan 05, 2023
Python scripts for performing stereo depth estimation using the MobileStereoNet model in Tensorflow Lite.

TFLite-MobileStereoNet Python scripts for performing stereo depth estimation using the MobileStereoNet model in Tensorflow Lite. Stereo depth estimati

Ibai Gorordo 4 Feb 14, 2022
L-Verse: Bidirectional Generation Between Image and Text

Far beyond learning long-range interactions of natural language, transformers are becoming the de-facto standard for many vision tasks with their power and scalabilty

Kim, Taehoon 102 Dec 21, 2022
Efficient face emotion recognition in photos and videos

This repository contains code of face emotion recognition that was developed in the RSF (Russian Science Foundation) project no. 20-71-10010 (Efficien

Andrey Savchenko 239 Jan 04, 2023
Simple reimplemetation experiments about FcaNet

FcaNet-CIFAR An implementation of the paper FcaNet: Frequency Channel Attention Networks on CIFAR10/CIFAR100 dataset. how to run Code: python Cifar.py

76 Feb 04, 2021
A script written in Python that returns a consensus string and profile matrix of a given DNA string(s) in FASTA format.

A script written in Python that returns a consensus string and profile matrix of a given DNA string(s) in FASTA format.

Zain 1 Feb 01, 2022
Official repository for Fourier model that can generate periodic signals

Conditional Generation of Periodic Signals with Fourier-Based Decoder Jiyoung Lee, Wonjae Kim, Daehoon Gwak, Edward Choi This repository provides offi

8 May 25, 2022
Exploring Versatile Prior for Human Motion via Motion Frequency Guidance (3DV2021)

Exploring Versatile Prior for Human Motion via Motion Frequency Guidance [Video Demo] [Paper] Installation Requirements Python 3.6 PyTorch 1.1.0 Pleas

Jiachen Xu 19 Oct 28, 2022
Improving Object Detection by Label Assignment Distillation

Improving Object Detection by Label Assignment Distillation This is the official implementation of the WACV 2022 paper Improving Object Detection by L

Cybercore Co. Ltd 51 Dec 08, 2022
An official implementation of the paper Exploring Sequence Feature Alignment for Domain Adaptive Detection Transformers

Sequence Feature Alignment (SFA) By Wen Wang, Yang Cao, Jing Zhang, Fengxiang He, Zheng-jun Zha, Yonggang Wen, and Dacheng Tao This repository is an o

WangWen 79 Dec 24, 2022
We will release the code of "ConTNet: Why not use convolution and transformer at the same time?" in this repo

ConTNet Introduction ConTNet (Convlution-Tranformer Network) is proposed mainly in response to the following two issues: (1) ConvNets lack a large rec

93 Nov 08, 2022
This repository contains FEDOT - an open-source framework for automated modeling and machine learning (AutoML)

package tests docs license stats support This repository contains FEDOT - an open-source framework for automated modeling and machine learning (AutoML

National Center for Cognitive Research of ITMO University 482 Dec 26, 2022
This is the pytorch implementation of the paper - Axiomatic Attribution for Deep Networks.

Integrated Gradients This is the pytorch implementation of "Axiomatic Attribution for Deep Networks". The original tensorflow version could be found h

Tianhong Dai 150 Dec 23, 2022
Causal Imitative Model for Autonomous Driving

Causal Imitative Model for Autonomous Driving Mohammad Reza Samsami, Mohammadhossein Bahari, Saber Salehkaleybar, Alexandre Alahi. arXiv 2021. [Projec

VITA lab at EPFL 8 Oct 04, 2022
Streaming over lightweight data transformations

Description Data augmentation libarary for Deep Learning, which supports images, segmentation masks, labels and keypoints. Furthermore, SOLT is fast a

Research Unit of Medical Imaging, Physics and Technology 256 Jan 08, 2023
transfer attack; adversarial examples; black-box attack; unrestricted Adversarial Attacks on ImageNet; CVPR2021 天池黑盒竞赛

transfer_adv CVPR-2021 AIC-VI: unrestricted Adversarial Attacks on ImageNet CVPR2021 安全AI挑战者计划第六期赛道2:ImageNet无限制对抗攻击 介绍 : 深度神经网络已经在各种视觉识别问题上取得了最先进的性能。

25 Dec 08, 2022
The official implementation of NeMo: Neural Mesh Models of Contrastive Features for Robust 3D Pose Estimation [ICLR-2021]. https://arxiv.org/pdf/2101.12378.pdf

NeMo: Neural Mesh Models of Contrastive Features for Robust 3D Pose Estimation [ICLR-2021] Release Notes The offical PyTorch implementation of NeMo, p

Angtian Wang 76 Nov 23, 2022
The Power of Scale for Parameter-Efficient Prompt Tuning

The Power of Scale for Parameter-Efficient Prompt Tuning Implementation of soft embeddings from https://arxiv.org/abs/2104.08691v1 using Pytorch and H

Kip Parker 208 Dec 30, 2022
Microscopy Image Cytometry Toolkit

Cytokit Cytokit is a collection of tools for quantifying and analyzing properties of individual cells in large fluorescent microscopy datasets with a

Hammer Lab 106 Jan 06, 2023
(SIGIR2020) “Asymmetric Tri-training for Debiasing Missing-Not-At-Random Explicit Feedback’’

Asymmetric Tri-training for Debiasing Missing-Not-At-Random Explicit Feedback About This repository accompanies the real-world experiments conducted i

yuta-saito 19 Dec 01, 2022