Bayesian-Torch is a library of neural network layers and utilities extending the core of PyTorch to enable the user to perform stochastic variational inference in Bayesian deep neural networks

Overview

Bayesian-Torch: Bayesian neural network layers for uncertainty estimation

Get started | Example usage | Documentation | License | Citing

Bayesian layers and utilities to perform stochastic variational inference in PyTorch

Bayesian-Torch is a library of neural network layers and utilities extending the core of PyTorch to enable the user to perform stochastic variational inference in Bayesian deep neural networks. Bayesian-Torch is designed to be flexible and seamless in extending a deterministic deep neural network architecture to corresponding Bayesian form by simply replacing the deterministic layers with Bayesian layers.

The repository has implementations for the following Bayesian layers:

  • Variational layers with Gaussian mixture model (GMM) posteriors using reparameterized Monte Carlo estimators (in pre-alpha)

    LinearMixture
    Conv1dMixture, Conv2dMixture, Conv3dMixture, ConvTranspose1dMixture, ConvTranspose2dMixture, ConvTranspose3dMixture
    LSTMMixture
    

Please refer to documentation of Bayesian layers for details.

Other features include:

Installation

Install from source:

git clone https://github.com/IntelLabs/bayesian-torch
cd bayesian-torch
pip install .

This code has been tested on PyTorch v1.6.0 and torchvision v0.7.0 with python 3.7.7.

Dependencies:

  • Create conda environment with python=3.7
  • Install PyTorch and torchvision packages within conda environment following instructions from PyTorch install guide
  • conda install -c conda-forge accimage
  • pip install tensorboard
  • pip install scikit-learn

Example usage

We have provided example model implementations using the Bayesian layers.

We also provide example usages and scripts to train/evaluate the models. The instructions for CIFAR10 examples is provided below, similar scripts for ImageNet and MNIST are available.

cd bayesian_torch

Training

To train Bayesian ResNet on CIFAR10, run this command:

Mean-field variational inference (Reparameterized Monte Carlo estimator)

sh scripts/train_bayesian_cifar.sh

Mean-field variational inference (Flipout Monte Carlo estimator)

sh scripts/train_bayesian_flipout_cifar.sh

To train deterministic ResNet on CIFAR10, run this command:

Vanilla

sh scripts/train_deterministic_cifar.sh

Evaluation

To evaluate Bayesian ResNet on CIFAR10, run this command:

Mean-field variational inference (Reparameterized Monte Carlo estimator)

sh scripts/test_bayesian_cifar.sh

Mean-field variational inference (Flipout Monte Carlo estimator)

sh scripts/test_bayesian_flipout_cifar.sh

To evaluate deterministic ResNet on CIFAR10, run this command:

Vanilla

sh scripts/test_deterministic_cifar.sh

Citing

If you use this code, please cite as:

@misc{krishnan2020bayesiantorch,
    author = {Ranganath Krishnan and Piero Esposito},
    title = {Bayesian-Torch: Bayesian neural network layers for uncertainty estimation},
    year = {2020},
    publisher = {GitHub},
    howpublished = {\url{https://github.com/IntelLabs/bayesian-torch}}
}

Cite the weight sampling methods as well: Blundell et al. 2015; Wen et al. 2018

Contributors

  • Ranganath Krishnan
  • Piero Esposito

This code is intended for researchers and developers, enables to quantify principled uncertainty estimates from deep neural network predictions using stochastic variational inference in Bayesian neural networks. Feedbacks, issues and contributions are welcome. Email to [email protected] for any questions.

Comments
  • The average should be taken over log probability rather than logits

    The average should be taken over log probability rather than logits

    https://github.com/IntelLabs/bayesian-torch/blob/7abcfe7ff3811c6a5be6326ab91a8d5cb1e8619f/bayesian_torch/examples/main_bayesian_cifar.py#L363-L367 I think the average across the MC runs should be taken over the log probability. However, the output here is the logits before the softmax operation. I think we may first run output = F.log_softmax(output, dim=1) and then take the average.

    There are two equivalent ways to take the average, which I think is more reasonable. The first way is

    for mc_run in range(args.num_mc):
        output, kl = model(input_var)
        output = F.log_softmax(output, dim=1)
        output_.append(output)
        kl_.append(kl)
    output = torch.mean(torch.stack(output_), dim=0)
    loss= F.nll_loss(output, target_var) # this is to replace the original cross_entropy_loss
    

    Or equivalently, we can first take the cross-entropy loss for each MC run, and average the losses at the end:

    loss = 0
    for mc_run in range(args.num_mc):
        output, kl = model(input_var)
        loss = loss + F.cross_entropy(output, target_var, dim=1)
        kl_.append(kl)
    loss = loss / args.num_mc  # this is to replace the original cross_entropy_loss
    
    question 
    opened by Nebularaid2000 5
  • KL divergence not changing during training

    KL divergence not changing during training

    Hello,

    I am trying to make a single layer BNN using the LinearReparameterization layer. I am unable to get it to give reasonable uncertainty estimates, so I started monitoring the KL term from the layers and noticed that it is not changing at all for each epoch. Even when I scale up the KL term in the loss, it remains unchanged.

    I am not sure if this is a bug, or if I am not doing the training correctly.

    My model

    class BNN(nn.Module):
        def __init__(self, input_dim, hidden_dim, output_dim=1):
            super().__init__()
            self.layer1 = LinearReparameterization(input_dim, hidden_dim)
            self.layerf = nn.Linear(hidden_dim, output_dim)
            
        def forward(self, x):
            kl_sum = 0
            x, kl = self.layer1(x)
            kl_sum += kl
            x = F.relu(x)
            x = self.layerf(x)
            return x, kl_sum
    

    and my training loop

    model = BNN(X_train.shape[-1], 100).to(device)
    optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
    criterion = torch.nn.MSELoss()
    
    for epoch in pbar:
            running_kld_loss = 0
            running_mse_loss = 0
            running_loss = 0
            for datapoints, labels in dataloader_train:
                optimizer.zero_grad()
                
                output, kl = model(datapoints)
                kl = get_kl_loss(model)
                
                # calculate loss with kl term for Bayesian layers
                mse_loss = criterion(output, labels)
                loss = mse_loss + kl * kld_beta / batch_size
                
                loss.backward()
                optimizer.step()
                
                running_mse_loss += mse_loss.detach().numpy()
                running_kld_loss += kl.detach().numpy()
                running_loss += loss.detach().numpy()
                
            status.update({
                'Epoch': epoch, 
                'loss': running_loss/len(dataloader_train),
                'kl': running_kld_loss/len(dataloader_train),
                'mse': running_mse_loss/len(dataloader_train)
            })
    

    When I print the KL loss, it starts at ~5.0 and does not decrease at all.

    opened by gkwt 4
  • Pytorch version issue

    Pytorch version issue

    Hi, it seems that this software only supports Pytorch 1.8.1 LTS or newer version. When I try to use this code in my own project(which is mainly based on Pytorch 1.4.0), some conflicts occur and seemingly unavoidable. Could you please update this software to make it also support some older version Pytorch, for example, 1.4.0. Great Thanks!!!

    opened by hebowei2000 4
  • conv_transpose2d parameter order problem

    conv_transpose2d parameter order problem

    https://github.com/IntelLabs/bayesian-torch/blob/f6f516e9b3721466aa0036c735475a421cc3ce80/bayesian_torch/layers/variational_layers/conv_variational.py#L784-L786

    image

    When I try to convert a deterministic autoencoder into a bayesian one with a ConvTranspose2d layer I get the error "Exception has occurred: TypeError conv_transpose2d(): argument 'groups' (position 7) must be int, not tuple" which I suspect comes from self.dilation and self.group which are swaped.

    opened by pierreosselin 3
  • Let the models return prediction only, saving KL Divergence as an attribute

    Let the models return prediction only, saving KL Divergence as an attribute

    Closes #7 .

    Let the user, if they want, to return predictions only on forward method, while saving kl divergence as an attribute. This is important to make it easier to integrate into PyTorch models.

    Also, it does not break the lib as it is: we added a new parameter on forward method that defaults to True and, if manually set to false, returns predictions only.

    Performed the following changes, on all layers:

    • Added return_kl on all forward methods, defaulting to True. If set to false, won't return kl.
    • Added a new kl attribute to each layer, updating it at every feedforward step. Useful when integrating with already-built PyTorch models.

    That should help integrating with PyTorch experiments while keeping backward compatibility towards this lib.

    enhancement 
    opened by piEsposito 3
  • Fix KL computation

    Fix KL computation

    Hello,

    I think there might be a few problems in your model definitions.

    In particular:

    • in resnet_flipout.py and resnet_variational.py you only sum the kl of the last block inside self.layerN
    • in resnet_flipout_large.py and resnet_variational_large.py you check for is None while you probably want is not None or actually no check at all since it can't be None in any reasonable setting. Also the str(layer) check is odd since it contains a BasicBlock or BottleNeck object (you're looping over an nn.Sequential of blocks). In fact in this code that string check is very likely superfluous (didn't test this, but I did include it in this PR as example)

    I hope you can confirm and perhaps fix these issues, which will help me (and maybe others) in building on your nice codebase :)

    opened by y0ast 3
  • Enable the Bayesian layer to freeze the parameters to their mean values

    Enable the Bayesian layer to freeze the parameters to their mean values

    I think it would be good to provide an option to freeze the weights and biases to the mean value when inferencing. The forward function would somehow look like this:

    def forward(self, input, sample=True):
        if sample:
            # do sampling and forward as in the current code
        else:
            # set weight=self.mu_weight
            # set bias=self.mu_bias
            # (optional) set kl=0, since it is useless in this case
        return out, kl
    
    opened by Nebularaid2000 2
  • FR: Enable forward method of Bayesian Layers to return value only for smoother integration with PyTorch

    FR: Enable forward method of Bayesian Layers to return value only for smoother integration with PyTorch

    It would be nice if we could store the KL divergence value as an attribute of the Bayesian Layers and return them on the forward method only if needed.

    With that we can have less friction on integration with PyTorch. being able to "plug and play" with bayesian-torch layers on deterministic models.

    It would be something like that:

    def forward(self, x, return_kl=False):
        ...
        self.kl = kl
    
        if return_kl:
            return out, kl
        return out   
    

    We then can get it from the bayesian layers when calculating the loss with no harm or hard changes to the code, which might encourage users to try the lib.

    I can work on that also.

    enhancement 
    opened by piEsposito 2
  • modifing the Scripts for any arbitrary output size

    modifing the Scripts for any arbitrary output size

    I am trying to use these repo for classification with 3 output nodes. However, in the fully connected layer, I got the error of size mismatch. Can you help me with this issue?

    question 
    opened by narminGhaffari 2
  • add mu_kernel to of delta_kernel in flipout layers?

    add mu_kernel to of delta_kernel in flipout layers?

    Hi,

    Shouldn't mu_kernel be added to delta_kernel in the code below?

    Best, Lewis

    diff --git a/bayesian_torch/layers/flipout_layers/conv_flipout.py b/bayesian_torch/layers/flipout_layers/conv_flipout.py
    index 4b3e88d..719cfdc 100644
    --- a/bayesian_torch/layers/flipout_layers/conv_flipout.py
    +++ b/bayesian_torch/layers/flipout_layers/conv_flipout.py
    @@ -165,7 +165,7 @@ class Conv1dFlipout(BaseVariationalLayer_):
             sigma_weight = torch.log1p(torch.exp(self.rho_kernel))
             eps_kernel = self.eps_kernel.data.normal_()
     
    -        delta_kernel = (sigma_weight * eps_kernel)
    +        delta_kernel = (sigma_weight * eps_kernel) + self.mu_kernel 
     
             kl = self.kl_div(self.mu_kernel, sigma_weight, self.prior_weight_mu,
                              self.prior_weight_sigma)
    
    opened by burntcobalt 2
  • Kernel_size

    Kernel_size

    The Conv2dReparameterization only allows kernels with same dim (e.g., 2×2) However, some CNN model has different kernels (e.g., in inceptionresnetv2, the kernel size in block17 is 1×7)

    So, I modified the code in conv_variational.py from:

            self.mu_kernel = Parameter(
                torch.Tensor(out_channels, in_channels // groups, kernel_size,
                             kernel_size))
            self.rho_kernel = Parameter(
                torch.Tensor(out_channels, in_channels // groups, kernel_size,
                             kernel_size))
            self.register_buffer(
                'eps_kernel',
                torch.Tensor(out_channels, in_channels // groups, kernel_size,
                             kernel_size),
                persistent=False)
            self.register_buffer(
                'prior_weight_mu',
                torch.Tensor(out_channels, in_channels // groups, kernel_size,
                             kernel_size),
                persistent=False)
            self.register_buffer(
                'prior_weight_sigma',
                torch.Tensor(out_channels, in_channels // groups, kernel_size,
                             kernel_size),
                persistent=False)
    

    to

            self.mu_kernel = Parameter(
                torch.Tensor(out_channels, in_channels // groups, kernel_size[0],
                             kernel_size[1]))
            self.rho_kernel = Parameter(
                torch.Tensor(out_channels, in_channels // groups, kernel_size[0],
                             kernel_size[1]))
            self.register_buffer(
                'eps_kernel',
                torch.Tensor(out_channels, in_channels // groups, kernel_size[0],
                             kernel_size[1]),
                persistent=False)
            self.register_buffer(
                'prior_weight_mu',
                torch.Tensor(out_channels, in_channels // groups, kernel_size[0],
                             kernel_size[1]),
                persistent=False)
            self.register_buffer(
                'prior_weight_sigma',
                torch.Tensor(out_channels, in_channels // groups, kernel_size[0],
                             kernel_size[1]),
                persistent=False)
    

    also, the kernel_size=d.kernel_size[0] was changes to kernel_size=d.kernel_size in dnn_to_cnn.py

    enhancement 
    opened by flydephone 1
  • How bayesian network be used in quantization?

    How bayesian network be used in quantization?

    @peteriz @jpablomch @ranganathkrishnan we can get the $\mu$ and $\sigma$ of each weight ,so how it can be used to quantization which mapping the $w_{float32 }$ into $w_{int8}$

    enhancement 
    opened by LeopoldACC 1
Releases(v0.3.0)
  • v0.3.0(Dec 14, 2022)

  • v0.2.0-alpha(Jan 27, 2022)

    Includes dnn_to_bnn new feature: An API to convert deterministic deep neural network (dnn) model of any architecture to Bayesian deep neural network (bnn) model, simplifying the model definition i.e. drop-in replacements of Convolutional, Linear and LSTM layers to corresponding Bayesian layers. This will enable seamless conversion of existing topology of larger models to Bayesian deep neural network models for extending towards uncertainty-aware applications.

    Source code(tar.gz)
    Source code(zip)
  • v0.2.0(Jan 27, 2022)

    Includes dnn_to_bnn new feature: An API to convert deterministic deep neural network (dnn) model of any architecture to Bayesian deep neural network (bnn) model, simplifying the model definition i.e. drop-in replacements of Convolutional, Linear and LSTM layers to corresponding Bayesian layers. This will enable seamless conversion of existing topology of larger models to Bayesian deep neural network models for extending towards uncertainty-aware applications.

    Full Changelog: https://github.com/IntelLabs/bayesian-torch/compare/v0.1...v0.2.0

    Source code(tar.gz)
    Source code(zip)
Owner
Intel Labs
Intel Labs
An unofficial styleguide and best practices summary for PyTorch

A PyTorch Tools, best practices & Styleguide This is not an official style guide for PyTorch. This document summarizes best practices from more than a

IgorSusmelj 1.5k Jan 05, 2023
LQM - Improving Object Detection by Estimating Bounding Box Quality Accurately

Improving Object Detection by Estimating Bounding Box Quality Accurately Abstract Object detection aims to locate and classify object instances in ima

IM Lab., POSTECH 0 Sep 28, 2022
This repository is for our paper Exploiting Scene Graphs for Human-Object Interaction Detection accepted by ICCV 2021.

SG2HOI This repository is for our paper Exploiting Scene Graphs for Human-Object Interaction Detection accepted by ICCV 2021. Installation Pytorch 1.7

HT 10 Dec 20, 2022
This library is a location of the LegacyLogger for PyTorch Lightning.

neptune-contrib Documentation See neptune-contrib documentation site Installation Get prerequisites python versions 3.5.6/3.6 are supported Install li

neptune.ai 26 Oct 07, 2021
The final project for "Applying AI to Wearable Device Data" course from "AI for Healthcare" - Udacity.

Motion Compensated Pulse Rate Estimation Overview This project has 2 main parts. Develop a Pulse Rate Algorithm on the given training data. Then Test

Omar Laham 2 Oct 25, 2022
Numerical-computing-is-fun - Learning numerical computing with notebooks for all ages.

As much as this series is to educate aspiring computer programmers and data scientists of all ages and all backgrounds, it is also a reminder to mysel

EKA foundation 758 Dec 25, 2022
"Reinforcement Learning for Bandit Neural Machine Translation with Simulated Human Feedback"

This is code repo for our EMNLP 2017 paper "Reinforcement Learning for Bandit Neural Machine Translation with Simulated Human Feedback", which implements the A2C algorithm on top of a neural encoder-

Khanh Nguyen 131 Oct 21, 2022
Python Algorithm Interview Book Review

파이썬 알고리즘 인터뷰 책 리뷰 리뷰 IT 대기업에 들어가고 싶은 목표가 있다. 내가 꿈꿔온 회사에서 일하는 사람들의 모습을 보면 멋있다고 생각이 들고 나의 목표에 대한 열망이 강해지는 것 같다. 미래의 핵심 사업 중 하나인 SW 부분을 이끌고 발전시키는 우리나라의 I

SharkBSJ 1 Dec 14, 2021
Informal Persian Universal Dependency Treebank

Informal Persian Universal Dependency Treebank (iPerUDT) Informal Persian Universal Dependency Treebank, consisting of 3000 sentences and 54,904 token

Roya Kabiri 0 Jan 05, 2022
This repository is an implementation of paper : Improving the Training of Graph Neural Networks with Consistency Regularization

CRGNN Paper : Improving the Training of Graph Neural Networks with Consistency Regularization Environments Implementing environment: GeForce RTX™ 3090

THUDM 28 Dec 09, 2022
Machine Learning Model deployment for Container (TensorFlow Serving)

try_tf_serving ├───dataset │ ├───testing │ │ ├───paper │ │ ├───rock │ │ └───scissors │ └───training │ ├───paper │ ├───rock

Azhar Rizki Zulma 5 Jan 07, 2022
A collection of Reinforcement Learning algorithms from Sutton and Barto's book and other research papers implemented in Python.

Reinforcement-Learning-Notebooks A collection of Reinforcement Learning algorithms from Sutton and Barto's book and other research papers implemented

Pulkit Khandelwal 1k Dec 28, 2022
The code release of paper 'Domain Generalization for Medical Imaging Classification with Linear-Dependency Regularization' NIPS 2020.

Domain Generalization for Medical Imaging Classification with Linear Dependency Regularization The code release of paper 'Domain Generalization for Me

Yufei Wang 56 Dec 28, 2022
PyTorch Implementation of DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs

DiffGAN-TTS - PyTorch Implementation PyTorch implementation of DiffGAN-TTS: High

Keon Lee 157 Jan 01, 2023
Graph Analysis From Scratch

Graph Analysis From Scratch Goal In this notebook we wanted to implement some functionalities to analyze a weighted graph only by using algorithms imp

Arturo Ghinassi 0 Sep 17, 2022
This is an official implementation for "AS-MLP: An Axial Shifted MLP Architecture for Vision".

AS-MLP architecture for Image Classification Model Zoo Image Classification on ImageNet-1K Network Resolution Top-1 (%) Params FLOPs Throughput (image

SVIP Lab 106 Dec 12, 2022
Retinal vessel segmentation based on GT-UNet

Retinal vessel segmentation based on GT-UNet Introduction This project is a retinal blood vessel segmentation code based on UNet-like Group Transforme

Kent0n 27 Dec 18, 2022
Code for "Reconstructing 3D Human Pose by Watching Humans in the Mirror", CVPR 2021 oral

Reconstructing 3D Human Pose by Watching Humans in the Mirror Qi Fang*, Qing Shuai*, Junting Dong, Hujun Bao, Xiaowei Zhou CVPR 2021 Oral The videos a

ZJU3DV 178 Dec 13, 2022
Deep learning operations reinvented (for pytorch, tensorflow, jax and others)

This video in better quality. einops Flexible and powerful tensor operations for readable and reliable code. Supports numpy, pytorch, tensorflow, and

Alex Rogozhnikov 6.2k Jan 01, 2023
SMPL-X: A new joint 3D model of the human body, face and hands together

SMPL-X: A new joint 3D model of the human body, face and hands together [Paper Page] [Paper] [Supp. Mat.] Table of Contents License Description News I

Vassilis Choutas 1k Jan 09, 2023