Code for "On Memorization in Probabilistic Deep Generative Models"

Overview

On Memorization in Probabilistic Deep Generative Models

This repository contains the code necessary to reproduce the experiments in On Memorization in Probabilistic Deep Generative Models. You can also use this code to measure memorization in other types of probabilistic deep generative models. If you use our code in your own work please cite the paper using, for instance, the following BibTeX entry:

@article{van2021memorization,
  title={On Memorization in Probabilistic Deep Generative Models},
  author={{Van den Burg}, G. J. J. and Williams, C. K. I.},
  journal={arXiv preprint arXiv:2106.03216},
  year={2021}
}

If you have any questions or encounter an issue when using this code, please send an email to gertjanvandenburg at gmail dot com.

Introduction

The files in the scripts directory are needed to reproduce the experiments and generate the figures in the paper. The experiments are organized using the Makefile provided. To reproduce the experiments or recreate the figures from the analysis, you'll have to install a number of dependencies. We use PyTorch to implement the deep learning algorithms. If you don't wish to re-run all the models, you can download the result files used in the paper (see below).

The scripts are all written in Python, and the necessary external dependencies can be found in the requirements.txt file. These can be installed using:

$ pip install -r requirements.txt

To recreate the figures the following system dependencies are also needed: pdflatex, latexmk, lualatex, and make. These programs are available for all major platforms.

Reproducing the results

To train the models on the different data sets, you can run:

$ make memorization

Note that depending on your machine this may take some time, so it might be easier to simply download the result files instead. It is also worth mentioning that while we have made an effort to ensure reproducibility by setting the random seed in PyTorch, platform or package version differences may result in slightly different output files (see also PyTorch Reproducibility).

All figures in the paper are generated from the raw result files using Python scripts. First, the summarize.py script takes the raw result files and creates summary files for each data set. Next, the analysis scripts are used to generate the figures, most of which are LaTeX files that require compilation using PDFLaTeX or LuaLaTeX. Simply run:

$ make analysis

to create the summaries and the output files. When using the result files linked below this will give the exact same figures as shown in the paper.

Result files

Due to their size, the raw result files are not contained in this repository, but can be downloaded separately from this link (about 2.6GB). After downloading the results.zip file, unpack it and move the results directory to where you've cloned this repository (so adjacent to the scripts directory). Below is a concise overview of the necessary commands:

$ git clone https://github.com/alan-turing-institute/memorization
$ cd memorization
$ wget https://gertjanvandenburg.com/projects/memorization/results.zip # or download the file in some other way
$ unzip results.zip
$ touch results/*/*/*          # update modification time of the result files
$ make analysis                # optionally, run ``make -n analysis`` first to see what will happen

After unpacking the zip file, you can optionally verify the integrity of the results using the SHA-256 checksums provided:

$ sha256sum --check results.sha256

License

The code in this repository is licensed under the MIT license. See the LICENSE file for further details. Reuse of the code in this repository is allowed, but should cite our paper.

Notes

If you find any problems or have a suggestion for improvement of this repository, please let me know as it will help make this resource better for everyone.

Owner
The Alan Turing Institute
The UK's national institute for data science and artificial intelligence.
The Alan Turing Institute
Human pose estimation from video plays a critical role in various applications such as quantifying physical exercises, sign language recognition, and full-body gesture control.

Pose Detection Project Description: Human pose estimation from video plays a critical role in various applications such as quantifying physical exerci

Hassan Shahzad 2 Jan 17, 2022
Differentiable Optimizers with Perturbations in Pytorch

Differentiable Optimizers with Perturbations in PyTorch This contains a PyTorch implementation of Differentiable Optimizers with Perturbations in Tens

Jake Tuero 54 Jun 22, 2022
Differentiable rasterization applied to 3D model simplification tasks

nvdiffmodeling Differentiable rasterization applied to 3D model simplification tasks, as described in the paper: Appearance-Driven Automatic 3D Model

NVIDIA Research Projects 336 Dec 30, 2022
1st-in-MICCAI2020-CPM - Combined Radiology and Pathology Classification

Combined Radiology and Pathology Classification MICCAI 2020 Combined Radiology a

22 Dec 08, 2022
StyleTransfer - Open source style transfer project, based on VGG19

StyleTransfer - Open source style transfer project, based on VGG19

Patrick martins de lima 9 Dec 13, 2021
Pytorch implementation of COIN, a framework for compression with implicit neural representations 🌸

COIN 🌟 This repo contains a Pytorch implementation of COIN: COmpression with Implicit Neural representations, including code to reproduce all experim

Emilien Dupont 104 Dec 14, 2022
Official implementation of deep Gaussian process (DGP)-based multi-speaker speech synthesis with PyTorch.

Multi-speaker DGP This repository provides official implementation of deep Gaussian process (DGP)-based multi-speaker speech synthesis with PyTorch. O

sarulab-speech 24 Sep 07, 2022
[CVPR'21] Multi-Modal Fusion Transformer for End-to-End Autonomous Driving

TransFuser This repository contains the code for the CVPR 2021 paper Multi-Modal Fusion Transformer for End-to-End Autonomous Driving. If you find our

695 Jan 05, 2023
Session-based Recommendation, CoHHN, price preferences, interest preferences, Heterogeneous Hypergraph, Co-guided Learning, SIGIR2022

This is our implementation for the paper: Price DOES Matter! Modeling Price and Interest Preferences in Session-based Recommendation Xiaokun Zhang, Bo

Xiaokun Zhang 27 Dec 02, 2022
Run object detection model on the Raspberry Pi

Using TensorFlow Lite with Python is great for embedded devices based on Linux, such as Raspberry Pi.

Dimitri Yanovsky 6 Oct 08, 2022
DeepCAD: A Deep Generative Network for Computer-Aided Design Models

DeepCAD This repository provides source code for our paper: DeepCAD: A Deep Generative Network for Computer-Aided Design Models Rundi Wu, Chang Xiao,

Rundi Wu 85 Dec 31, 2022
CL-Gym: Full-Featured PyTorch Library for Continual Learning

CL-Gym: Full-Featured PyTorch Library for Continual Learning CL-Gym is a small yet very flexible library for continual learning research and developme

Iman Mirzadeh 36 Dec 25, 2022
Python Interview Questions

Python Interview Questions Clone the code to your computer. You need to understand the code in main.py and modify the content in if __name__ =='__main

ClassmateLin 575 Dec 28, 2022
Public repository of the 3DV 2021 paper "Generative Zero-Shot Learning for Semantic Segmentation of 3D Point Clouds"

Generative Zero-Shot Learning for Semantic Segmentation of 3D Point Clouds Björn Michele1), Alexandre Boulch1), Gilles Puy1), Maxime Bucher1) and Rena

valeo.ai 15 Dec 22, 2022
Code for the paper: On Pathologies in KL-Regularized Reinforcement Learning from Expert Demonstrations

Non-Parametric Prior Actor-Critic (N-PPAC) This repository contains the code for On Pathologies in KL-Regularized Reinforcement Learning from Expert D

Cong Lu 5 May 13, 2022
An implementation of the Contrast Predictive Coding (CPC) method to train audio features in an unsupervised fashion.

CPC_audio This code implements the Contrast Predictive Coding algorithm on audio data, as described in the paper Unsupervised Pretraining Transfers we

8 Nov 14, 2022
xitorch: differentiable scientific computing library

xitorch is a PyTorch-based library of differentiable functions and functionals that can be widely used in scientific computing applications as well as deep learning.

24 Apr 15, 2021
Pytorch implementation of "M-LSD: Towards Light-weight and Real-time Line Segment Detection"

M-LSD: Towards Light-weight and Real-time Line Segment Detection Pytorch implementation of "M-LSD: Towards Light-weight and Real-time Line Segment Det

123 Jan 04, 2023
Pytorch implementation of Straight Sampling Network For Point Cloud Learning (ICIP2021).

Pytorch code for SS-Net This is a pytorch implementation of Straight Sampling Network For Point Cloud Learning (ICIP2021). Environment Code is tested

Sun Ran 1 May 18, 2022
How Effective is Incongruity? Implications for Code-mix Sarcasm Detection.

Code for the paper: How Effective is Incongruity? Implications for Code-mix Sarcasm Detection - ICON ACL 2021

2 Jun 05, 2022