Code for the Paper: Conditional Variational Capsule Network for Open Set Recognition

Overview

This repository hosts the official code for "Conditional Variational Capsule Network for Open Set Recognition", Y. Guo, G. Camporese, W. Yang, A. Sperduti, L. Ballan, arXiv:2104.09159, 2021.

If you use the code/models hosted in this repository, please cite the following paper and give a star to the repo:

@misc{guo2021conditional,
      title={Conditional Variational Capsule Network for Open Set Recognition}, 
      author={Yunrui Guo and Guglielmo Camporese and Wenjing Yang and Alessandro Sperduti and Lamberto Ballan},
      year={2021},
      eprint={2104.09159},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Updates

  • [2021/04/09] - The code is online.
  • [2021/07/22] - The paper has been accepted to ICCV-2021!

Install

Once you have cloned the repo, all the commands below should be run inside the main project folder cvaecaposr:

# Clone the repo
$ git clone https://github.com/guglielmocamporese/cvaecaposr.git

# Go to the project directory
$ cd cvaecaposr

To run the code you need to have conda installed (version >= 4.9.2).

Furthermore, all the requirements for running the code are specified in the environment.yaml file and can be installed with:

# Install the conda env
$ conda env create --file environment.yaml

# Activate the conda env
$ conda activate cvaecaposr

Dataset Splits

You can find the dataset splits for all the datasets we have used (i.e. for MNIST, SVHN, CIFAR10, CIFAR+10, CIFAR+50 and TinyImageNet) in the splits.py file.

When you run the code, the datasets are automatically downloaded into the ./data folder. The split to use is selected with the --split_num argument of main.py (see the Experiments section below for how to run the code).
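As an illustration of what selecting a split means in open set recognition, the sketch below picks the known classes for a given split number and treats the remaining classes as unknown. The `EXAMPLE_SPLITS` table, its class ids, and the `known_unknown_classes` helper are hypothetical; the actual layout and values in splits.py may differ.

```python
# Hypothetical split table: the class ids below are made up for illustration
# and are NOT the values stored in splits.py.
EXAMPLE_SPLITS = {
    "mnist": [
        [0, 1, 2, 4, 5, 9],  # split 0: known classes
        [0, 3, 5, 7, 8, 9],  # split 1: known classes
    ],
}


def known_unknown_classes(dataset, split_num, num_classes=10):
    """Return (known, unknown) class ids for the given split."""
    known = sorted(EXAMPLE_SPLITS[dataset][split_num])
    unknown = [c for c in range(num_classes) if c not in known]
    return known, unknown


known, unknown = known_unknown_classes("mnist", split_num=0)
print("known:", known)
print("unknown:", unknown)
```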

Model Checkpoints

You can download the model checkpoints using the download_checkpoints.sh script in the scripts folder by running:

# Extend script permissions
$ chmod +x ./scripts/download_checkpoints.sh

# Download model checkpoints
$ ./scripts/download_checkpoints.sh

After the download you will find the model checkpoints in the ./checkpoints folder:

  • ./checkpoints/mnist.ckpt
  • ./checkpoints/svhn.ckpt
  • ./checkpoints/cifar10.ckpt
  • ./checkpoints/cifar+10.ckpt
  • ./checkpoints/cifar+50.ckpt
  • ./checkpoints/tiny_imagenet.ckpt

The size of each checkpoint file is between ~370 MB and ~670 MB.
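To confirm that the download completed, a small check along the following lines can report which of the files listed above are present and how large they are. This is a sketch, not part of the repository; the `checkpoint_report` helper is a hypothetical name.

```python
from pathlib import Path

# Checkpoint names taken from the list above.
CHECKPOINT_NAMES = ["mnist", "svhn", "cifar10", "cifar+10", "cifar+50", "tiny_imagenet"]


def checkpoint_report(folder="./checkpoints"):
    """Map each expected checkpoint name to its size in MB, or None if missing."""
    report = {}
    for name in CHECKPOINT_NAMES:
        path = Path(folder) / f"{name}.ckpt"
        report[name] = round(path.stat().st_size / 1e6, 1) if path.is_file() else None
    return report


if __name__ == "__main__":
    for name, size_mb in checkpoint_report().items():
        status = f"{size_mb} MB" if size_mb is not None else "missing"
        print(f"{name}.ckpt: {status}")
```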

Experiments

For all the experiments we have used a GeForce RTX 2080 Ti (11GB of memory) GPU.

Training requires ~7300 MiB of GPU memory, and testing requires ~5000 MiB.

Train

The CVAECapOSR model can be trained using the main.py program. Below is an example training command for the mnist experiment:

# Train
$ python main.py \
      --data_base_path "./data" \
      --dataset "mnist" \
      --val_ratio 0.2 \
      --seed 1234 \
      --batch_size 32 \
      --split_num 0 \
      --z_dim 128 \
      --lr 5e-5 \
      --t_mu_shift 10.0 \
      --t_var_scale 0.1 \
      --alpha 1.0 \
      --beta 0.01 \
      --margin 10.0 \
      --in_dim_caps 16 \
      --out_dim_caps 32 \
      --checkpoint "" \
      --epochs 100 \
      --mode "train"
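When the same command has to be repeated with small changes (for example one run per split), it can be convenient to assemble it programmatically. The sketch below rebuilds the command above from a dictionary; the flag names and values are copied from the example, while the `build_command` helper and the range of split numbers in the loop are assumptions made for illustration.

```python
import shlex

# Flag values copied verbatim from the example training command.
TRAIN_ARGS = {
    "data_base_path": "./data",
    "dataset": "mnist",
    "val_ratio": "0.2",
    "seed": "1234",
    "batch_size": "32",
    "split_num": "0",
    "z_dim": "128",
    "lr": "5e-5",
    "t_mu_shift": "10.0",
    "t_var_scale": "0.1",
    "alpha": "1.0",
    "beta": "0.01",
    "margin": "10.0",
    "in_dim_caps": "16",
    "out_dim_caps": "32",
    "checkpoint": "",
    "epochs": "100",
    "mode": "train",
}


def build_command(overrides=None):
    """Return the argv list for main.py, with optional per-run overrides."""
    args = {**TRAIN_ARGS, **(overrides or {})}
    cmd = ["python", "main.py"]
    for flag, value in args.items():
        cmd += [f"--{flag}", str(value)]
    return cmd


# Print one command per split; the range of split numbers here is arbitrary.
for split_num in range(2):
    print(shlex.join(build_command({"split_num": split_num})))
```

To launch a run instead of printing it, the list returned by `build_command` can be passed to `subprocess.run(cmd, check=True)` from the project folder.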

For simplicity we provide all the training scripts for the different datasets in the scripts folder. Specifically, you will find:

  • train_mnist.sh
  • train_svhn.sh
  • train_cifar10.sh
  • train_cifar+10.sh
  • train_cifar+50.sh
  • train_tinyimagenet.sh

that you can run as follows:

# Extend script permissions
$ chmod +x ./scripts/train_{dataset}.sh # replace {dataset} with a dataset name

# Run training
$ ./scripts/train_{dataset}.sh # replace {dataset} with a dataset name

All temporary files produced during training (model checkpoints, tensorboard metrics, ...) are written to ./tmp/{dataset}/version_{version_number}/, where dataset is the one specified in the train_{dataset}.sh script and version_number is an integer assigned automatically so that each training run writes to its own folder and never overwrites earlier logs.
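The versioning behaviour can be pictured with the sketch below, which returns a folder numbered one past the highest existing version. This is one common scheme and an assumption about how the number is chosen; the `next_version_dir` helper is hypothetical and not taken from the repository.

```python
from pathlib import Path


def next_version_dir(dataset, root="./tmp"):
    """Return root/dataset/version_n, with n one past the highest existing version."""
    base = Path(root) / dataset
    used = []
    if base.is_dir():
        for child in base.iterdir():
            prefix, _, num = child.name.partition("_")
            # Only count folders named exactly version_<integer>.
            if child.is_dir() and prefix == "version" and num.isdigit():
                used.append(int(num))
    return base / f"version_{max(used) + 1 if used else 0}"


print(next_version_dir("mnist"))
```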

Test

The CVAECapOSR model can be tested using the main.py program. Below is an example test command for the mnist experiment:

# Test
$ python main.py \
      --data_base_path "./data" \
      --dataset "mnist" \
      --val_ratio 0.2 \
      --seed 1234 \
      --batch_size 32 \
      --split_num 0 \
      --z_dim 128 \
      --lr 5e-5 \
      --t_mu_shift 10.0 \
      --t_var_scale 0.1 \
      --alpha 1.0 \
      --beta 0.01 \
      --margin 10.0 \
      --in_dim_caps 16 \
      --out_dim_caps 32 \
      --checkpoint "checkpoints/mnist.ckpt" \
      --mode "test"

For simplicity we provide all the test scripts for the different datasets in the scripts folder. Specifically, you will find:

  • test_mnist.sh
  • test_svhn.sh
  • test_cifar10.sh
  • test_cifar+10.sh
  • test_cifar+50.sh
  • test_tinyimagenet.sh

that you can run as follows:

# Extend script permissions
$ chmod +x ./scripts/test_{dataset}.sh # replace {dataset} with a dataset name

# Run test
$ ./scripts/test_{dataset}.sh # replace {dataset} with a dataset name

Model Reconstruction

Below are reconstructions of some test samples produced by the model after training.

[Figures: reconstructed test samples for MNIST, SVHN, CIFAR10 and TinyImageNet]