Universal Probability Distributions with Optimal Transport and Convex Optimization

Overview

Sylvester normalizing flows for variational inference

Pytorch implementation of Sylvester normalizing flows, based on our paper:

Sylvester normalizing flows for variational inference (UAI 2018)
Rianne van den Berg*, Leonard Hasenclever*, Jakub Tomczak, Max Welling

*Equal contribution

Requirements

The latest release of the code is compatible with:

  • pytorch 1.0.0

  • python 3.7

Thanks to Martin Engelcke for adapting the code to provide this compatibility.

Version v0.3.0_2.7 is compatible with:

  • pytorch 0.3.0 WARNING: More recent versions of pytorch have different default flags for the binary cross entropy loss module: nn.BCELoss(). You have to adapt the appropriate flags if you want to port this code to a later vers
    ion.

  • python 2.7

Data

The experiments can be run on the following datasets:

  • static MNIST: dataset is in data folder;
  • OMNIGLOT: the dataset can be downloaded from link;
  • Caltech 101 Silhouettes: the dataset can be downloaded from link.
  • Frey Faces: the dataset can be downloaded from link.

Usage

Below, example commands are given for running experiments on static MNIST with different types of Sylvester normalizing flows, for 4 flows:

Orthogonal Sylvester flows
This example uses a bottleneck of size 8 (Q has 8 columns containing orthonormal vectors).

python main_experiment.py -d mnist -nf 4 --flow orthogonal --num_ortho_vecs 8 

Householder Sylvester flows
This example uses 8 Householder reflections per orthogonal matrix Q.

python main_experiment.py -d mnist -nf 4 --flow householder --num_householder 8

Triangular Sylvester flows

python main_experiment.py -d mnist -nf 4 --flow triangular 

To run an experiment with other types of normalizing flows or just with a factorized Gaussian posterior, see below.


Factorized Gaussian posterior

python main_experiment.py -d mnist --flow no_flow

Planar flows

python main_experiment.py -d mnist -nf 4 --flow planar

Inverse Autoregressive flows
This examples uses MADEs with 320 hidden units.

python main_experiment.py -d mnist -nf 4 --flow iaf --made_h_size 320

More information about additional argument options can be found by running ```python main_experiment.py -h```

Cite

Please cite our paper if you use this code in your own work:

@inproceedings{vdberg2018sylvester,
  title={Sylvester normalizing flows for variational inference},
  author={van den Berg, Rianne and Hasenclever, Leonard and Tomczak, Jakub and Welling, Max},
  booktitle={proceedings of the Conference on Uncertainty in Artificial Intelligence (UAI)},
  year={2018}
}
Comments
  • about log_p_zk

    about log_p_zk

    Hi Rianne, This is a great code, and I have a little question about logp(zk), we hope p(zk) in VAE can be a distribution whose form is no fixed, but it seems that the calculate of logp(zk) in line81 of loss.py imply that p(zk) is a standard Gaussion. Are there some mistakes about my understanding?
    Thank your for this code

    opened by Archer666 10
  • loss = bce + beta * kl

    loss = bce + beta * kl

    hello Rianne: Thanks very much. I am a bit confused with line 44 in loss.py : loss = bce + beta * kl. Based on equation 3 in Tomczak's paper (Improving Variational Auto-Encoder Using Householder Flows), shouldn't "loss = bce - beta * kl "? Also, why use -ELBO instead of ELBO when reporting your metrics? Thanks

    opened by tumis1946 4
  • PyTorch_v1 and Python3 compatibility

    PyTorch_v1 and Python3 compatibility

    Hi Rianne,

    This PR contains a 'minimal' set of changes to run the code with the latest PyTorch versions and Python 3 ( #1 #2 )

    It is 'minimal' in the sense that I only made changes that affect functionality. There are additional cosmetic changes that could be made; e.g. Variable(), the volatile flag, and F.sigmoid() have been deprecated but they should not affect functionality.

    I tested the changes with PyTorch 1.0.0 and Python 3.7 on MNIST and Freyfaces, giving me similar results for the baseline VAE without any flows.

    I am not sure if more rigorous test should be done and if you want to merge this into master or keep a separate branch.

    Best, Martin

    opened by martinengelcke 1
  • PR for PyTorch 1.+ and Python 3 support

    PR for PyTorch 1.+ and Python 3 support

    Hi Rianne,

    Thank you for this really nice code release :)

    I cloned the repo and made some changes so that it runs with PyTorch 1.+ and Python 3. Also solved the issue mentioned in #1 . I tested the changes on MNIST (binary input) and Freyfaces (multinomial input), giving similar results to the original code.

    If you are interested in reviewing and potentially adding this to the repo, I would be happy to clean things up and make a PR.

    Best, Martin

    opened by martinengelcke 1
  • RuntimeError in default main experiment

    RuntimeError in default main experiment

    Hi Rianne,

    I'm trying to run the default experiment on cpu with a small latent space dimension (z=5):

    python main_experiment.py -d mnist --flow no_flow -nc --z_size 5

    Which unfortunately gives the following error:

    Traceback (most recent call last):
      File "main_experiment.py", line 278, in <module>
        run(args, kwargs)
      File "main_experiment.py", line 189, in run
        tr_loss = train(epoch, train_loader, model, optimizer, args)
      File ".../sylvester-flows/optimization/training.py", line 39, in train
        loss.backward()
      File "//anaconda/envs/dl/lib/python3.6/site-packages/torch/tensor.py", line 102, in backward
        torch.autograd.backward(self, gradient, retain_graph, create_graph)
      File "//anaconda/envs/dl/lib/python3.6/site-packages/torch/autograd/__init__.py", line 90, in backward
        allow_unreachable=True)  # allow_unreachable flag
    RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation
    

    I am using PyTorch version 1.0.0 and did not modify the code.

    opened by trdavidson 1
  • How to sample from latent distribution

    How to sample from latent distribution

    Hello,

    I was wondering how I can generate samples using the decoder network after training. In a VAE, I would just sample from the prior distribution z~N(0,1) and generate a data point using the decoder. In TriangularSylvesterVAE, however, I also have to provide hyperparameters lambda(x) that depend on the input. How can I sample from my latent distribution and generate samples from it?

    I am new to normalizing flows in general and would appreciate any help.

    opened by crlz182 2
Releases(v1.0.0_3.7)
  • v1.0.0_3.7(Jul 5, 2019)

    Sylvester Normalizing Flow repository compatible with Pytorch 1.0.0 and Python 3.7. Thanks to martinengelcke for taking care of this compatibility.

    Source code(tar.gz)
    Source code(zip)
  • v0.3.0_2.7(Jul 5, 2019)

Owner
Rianne van den Berg
Senior researcher @Microsoft research Amsterdam. Formerly at Google Brain and University of Amsterdam
Rianne van den Berg
Code and model benchmarks for "SEVIR : A Storm Event Imagery Dataset for Deep Learning Applications in Radar and Satellite Meteorology"

NeurIPS 2020 SEVIR Code for paper: SEVIR : A Storm Event Imagery Dataset for Deep Learning Applications in Radar and Satellite Meteorology Requirement

USAF - MIT Artificial Intelligence Accelerator 46 Dec 15, 2022
Neural style in TensorFlow! 🎨

neural-style An implementation of neural style in TensorFlow. This implementation is a lot simpler than a lot of the other ones out there, thanks to T

Anish Athalye 5.5k Dec 29, 2022
Pose estimation for iOS and android using TensorFlow 2.0

💃 Mobile 2D Single Person (Or Your Own Object) Pose Estimation for TensorFlow 2.0 This repository is forked from edvardHua/PoseEstimationForMobile wh

tucan9389 165 Nov 16, 2022
WRENCH: Weak supeRvision bENCHmark

🔧 What is it? Wrench is a benchmark platform containing diverse weak supervision tasks. It also provides a common and easy framework for development

Jieyu Zhang 176 Dec 28, 2022
Udacity's CS101: Intro to Computer Science - Building a Search Engine

Udacity's CS101: Intro to Computer Science - Building a Search Engine All soluti

Phillip 0 Feb 26, 2022
[CoRL 21'] TANDEM: Tracking and Dense Mapping in Real-time using Deep Multi-view Stereo

TANDEM: Tracking and Dense Mapping in Real-time using Deep Multi-view Stereo Lukas Koestler1*    Nan Yang1,2*,†    Niclas Zeller2,3    Daniel Cremers1

TUM Computer Vision Group 744 Jan 04, 2023
Convex optimization for fun and profit.

CFMM Optimal Routing This repository contains the code needed to generate the figures used in the paper Optimal Routing for Constant Function Market M

Guillermo Angeris 183 Dec 29, 2022
The Dual Memory is build from a simple CNN for the deep memory and Linear Regression fro the fast Memory

Simple-DMA a simple Dual Memory Architecture for classifications. based on the paper Dual-Memory Deep Learning Architectures for Lifelong Learning of

1 Jan 27, 2022
Code for Blind Image Decomposition (BID) and Blind Image Decomposition network (BIDeN).

arXiv, porject page, paper Blind Image Decomposition (BID) Blind Image Decomposition is a novel task. The task requires separating a superimposed imag

64 Dec 20, 2022
a Lightweight library for sequential learning agents, including reinforcement learning

SaLinA: SaLinA - A Flexible and Simple Library for Learning Sequential Agents (including Reinforcement Learning) TL;DR salina is a lightweight library

Facebook Research 405 Dec 17, 2022
Certis - Certis, A High-Quality Backtesting Engine

Certis - Backtesting For y'all Certis is a powerful, lightweight, simple backtes

Yeachan-Heo 46 Oct 30, 2022
Parameterized Explainer for Graph Neural Network

PGExplainer This is a Tensorflow implementation of the paper: Parameterized Explainer for Graph Neural Network https://arxiv.org/abs/2011.04573 NeurIP

Dongsheng Luo 89 Dec 12, 2022
Deep Learning for Time Series Classification

Deep Learning for Time Series Classification This is the companion repository for our paper titled "Deep learning for time series classification: a re

Hassan ISMAIL FAWAZ 1.2k Jan 02, 2023
Code for "Learning Graph Cellular Automata"

Learning Graph Cellular Automata This code implements the experiments from the NeurIPS 2021 paper: "Learning Graph Cellular Automata" Daniele Grattaro

Daniele Grattarola 37 Oct 26, 2022
Codes for building and training the neural network model described in Domain-informed neural networks for interaction localization within astroparticle experiments.

Domain-informed Neural Networks Codes for building and training the neural network model described in Domain-informed neural networks for interaction

DIDACTS 0 Dec 13, 2021
Code base for the paper "Scalable One-Pass Optimisation of High-Dimensional Weight-Update Hyperparameters by Implicit Differentiation"

This repository contains code for the paper Scalable One-Pass Optimisation of High-Dimensional Weight-Update Hyperparameters by Implicit Differentiati

8 Aug 28, 2022
Visual Adversarial Imitation Learning using Variational Models (VMAIL)

Visual Adversarial Imitation Learning using Variational Models (VMAIL) This is the official implementation of the NeurIPS 2021 paper. Project website

14 Nov 18, 2022
This repository is an official implementation of the paper MOTR: End-to-End Multiple-Object Tracking with TRansformer.

MOTR: End-to-End Multiple-Object Tracking with TRansformer This repository is an official implementation of the paper MOTR: End-to-End Multiple-Object

348 Jan 07, 2023
Spatial Attentive Single-Image Deraining with a High Quality Real Rain Dataset (CVPR'19)

Spatial Attentive Single-Image Deraining with a High Quality Real Rain Dataset (CVPR'19) Tianyu Wang*, Xin Yang*, Ke Xu, Shaozhe Chen, Qiang Zhang, Ry

Steve Wong 177 Dec 01, 2022
YOLOv5 🚀 is a family of object detection architectures and models pretrained on the COCO dataset

YOLOv5 🚀 is a family of object detection architectures and models pretrained on the COCO dataset, and represents Ultralytics open-source research int

阿才 73 Dec 16, 2022