《Truly shift-invariant convolutional neural networks》(2021)

Overview

Truly shift-invariant convolutional neural networks [Paper]

Authors: Anadi Chaman and Ivan Dokmanić

Convolutional neural networks were always assumed to be shift invariant, until recently when it was shown that the classification accuracy of a trained CNN can take a serious hit with merely a 1-pixel shift in input image. One of the primary reasons for this problem is the use of downsampling (popularly known as stride) layers in the networks.

In this work, we present Adaptive Polyphase Sampling (APS), an easy-to-implement non-linear downsampling scheme that completely gets rid of this problem. The resulting CNNs yield 100% consistency in classification performance under shifts without any loss in accuracy. In fact, unlike prior works, the networks exhibit perfect consistency even before training, making it the first approach that makes CNNs truly shift invariant.

This repository contains our code in PyTorch to implement APS.

ImageNet training

To train ResNet-18 model with APS on ImageNet use the following commands (training and evaluation with circular shifts).

cd imagenet_exps
python3 main.py --out-dir OUT_DIR --arch resnet18_aps1 --seed 0 --data PATH-TO-DATASET

For training on multiple GPUs:

cd imagenet_exps
python3 main.py --out-dir OUT_DIR --arch resnet18_aps1 --seed 0 --data PATH-TO-DATASET --workers NUM_WORKERS --dist-url tcp://127.0.0.1:FREE-PORT --dist-backend nccl --multiprocessing-distributed --world-size 1 --rank 0

--arch is used to specify the architecture. To use ResNet18 with APS layer and blur filter of size j, pass 'resnet18_apsj' as the argument to --arch. List of currently supported network architectures are here.

--circular_data_aug can be used to additionally train the networks with random circular shifts.

Results are saved in OUT_DIR.

CIFAR-10 training

The following commands run our implementation on CIFAR-10 dataset.

cd cifar10_exps
python3 main.py --arch 'resnet18_aps' --filter_size FILTER_SIZE --validate_consistency --seed_num 0 --device_id 0 --model_folder CURRENT_MODEL_DIRECTORY --results_root_path ROOT_DIRECTORY --dataset_path PATH-TO-DATASET

--data_augmentation_flag can be used to additionally train the networks with randomly shifted images. FILTER_SIZE can take the values between 1 to 7. The list of CNN architectures currently supported can be found here.

The results are saved in the path: ROOT_DIRECTORY/CURRENT_MODEL_DIRECTORY/

Owner
Anadi Chaman
Anadi Chaman
Bio-OFC gym implementation and Gym-Fly environment

Bio-OFC gym implementation and Gym-Fly environment This repository includes the gym compatible implementation of the Bio-OFC algorithm from the paper

Siavash Golkar 1 Nov 16, 2021
Rational Activation Functions - Replacing Padé Activation Units

Rational Activations - Learnable Rational Activation Functions First introduce as PAU in Padé Activation Units: End-to-end Learning of Activation Func

<a href=[email protected]"> 38 Nov 22, 2022
[CVPR 2021] MiVOS - Scribble to Mask module

MiVOS (CVPR 2021) - Scribble To Mask Ho Kei Cheng, Yu-Wing Tai, Chi-Keung Tang [arXiv] [Paper PDF] [Project Page] A simplistic network that turns scri

Rex Cheng 65 Dec 22, 2022
PyGRANSO: A PyTorch-enabled port of GRANSO with auto-differentiation

PyGRANSO PyGRANSO: A PyTorch-enabled port of GRANSO with auto-differentiation Please check https://ncvx.org/PyGRANSO for detailed instructions (introd

SUN Group @ UMN 26 Nov 16, 2022
A general-purpose encoder-decoder framework for Tensorflow

READ THE DOCUMENTATION CONTRIBUTING A general-purpose encoder-decoder framework for Tensorflow that can be used for Machine Translation, Text Summariz

Google 5.5k Jan 07, 2023
Unofficial implementation of Pix2SEQ

Unofficial-Pix2seq: A Language Modeling Framework for Object Detection Unofficial implementation of Pix2SEQ. Please use this code with causion. Many i

159 Dec 12, 2022
Automatic packaging of the open-composite libs for OvGME

OvGME Packager for OpenXR – OpenComposite for DCS Note This repository is currently unsupported and needs to be migrated to the upstream OpenComposite

12 Nov 03, 2022
FID calculation with proper image resizing and quantization steps

clean-fid: Fixing Inconsistencies in FID Project | Paper The FID calculation involves many steps that can produce inconsistencies in the final metric.

Gaurav Parmar 606 Jan 06, 2023
OpenLT: An open-source project for long-tail classification

OpenLT: An open-source project for long-tail classification Supported Methods for Long-tailed Recognition: Cross-Entropy Loss Focal Loss (ICCV'17) Cla

Ming Li 37 Sep 15, 2022
An educational resource to help anyone learn deep reinforcement learning.

Status: Maintenance (expect bug fixes and minor updates) Welcome to Spinning Up in Deep RL! This is an educational resource produced by OpenAI that ma

OpenAI 7.6k Jan 09, 2023
DyStyle: Dynamic Neural Network for Multi-Attribute-Conditioned Style Editing

DyStyle: Dynamic Neural Network for Multi-Attribute-Conditioned Style Editing Figure: Joint multi-attribute edits using DyStyle model. Great diversity

74 Dec 03, 2022
Python scripts performing class agnostic object localization using the Object Localization Network model in ONNX.

ONNX Object Localization Network Python scripts performing class agnostic object localization using the Object Localization Network model in ONNX. Ori

Ibai Gorordo 15 Oct 14, 2022
vit for few-shot classification

Few-Shot ViT Requirements PyTorch (= 1.9) TorchVision timm (latest) einops tqdm numpy scikit-learn scipy argparse tensorboardx Pretrained Checkpoints

Martin Dong 26 Nov 30, 2022
Pytorch implementation for reproducing StackGAN_v2 results in the paper StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks

StackGAN-v2 StackGAN-v1: Tensorflow implementation StackGAN-v1: Pytorch implementation Inception score evaluation Pytorch implementation for reproduci

Han Zhang 809 Dec 16, 2022
Optimal Camera Position for a Practical Application of Gaze Estimation on Edge Devices,

Optimal Camera Position for a Practical Application of Gaze Estimation on Edge Devices, Linh Van Ma, Tin Trung Tran, Moongu Jeon, ICAIIC 2022 (The 4th

Linh 11 Oct 10, 2022
IDA file loader for UF2, created for the DEFCON 29 hardware badge

UF2 Loader for IDA The DEFCON 29 badge uses the UF2 bootloader, which conveniently allows you to dump and flash the firmware over USB as a mass storag

Kevin Colley 6 Feb 08, 2022
WiFi-based Multi-task Sensing

WiFi-based Multi-task Sensing Introduction WiFi-based sensing has aroused immense attention as numerous studies have made significant advances over re

zhangx289 6 Nov 24, 2022
Riemannian Convex Potential Maps

Modeling distributions on Riemannian manifolds is a crucial component in understanding non-Euclidean data that arises, e.g., in physics and geology. The budding approaches in this space are limited b

Facebook Research 61 Nov 28, 2022
Codes for CVPR2021 paper "PWCLO-Net: Deep LiDAR Odometry in 3D Point Clouds Using Hierarchical Embedding Mask Optimization"

PWCLO-Net: Deep LiDAR Odometry in 3D Point Clouds Using Hierarchical Embedding Mask Optimization (CVPR 2021) This is the official implementation of PW

Intelligent Robotics and Machine Vision Lab 42 Dec 18, 2022
MassiveSumm: a very large-scale, very multilingual, news summarisation dataset

MassiveSumm: a very large-scale, very multilingual, news summarisation dataset This repository contains links to data and code to fetch and reproduce

Daniel Varab 19 Dec 16, 2022