Peek-a-Boo: What (More) is Disguised in a Randomly Weighted Neural Network, and How to Find It Efficiently

Overview

Peek-a-Boo: What (More) is Disguised in a Randomly Weighted Neural Network, and How to Find It Efficiently

This repository is the official implementation for the following paper Analytic-LISTA networks proposed in the following paper:

"Peek-a-Boo: What (More) is Disguised in a Randomly Weighted Neural Network, and How to Find It Efficiently" by Xiaohan Chen, Jason Zhang and Zhangyang Wang from the VITA Research Group.

The code implements the Peek-a-Boo (PaB) algorithm for various convolutional networks and is tested in Linux environment with Python: 3.7.2, PyTorch 1.7.0+.

Getting Started

Dependency

pip install tqdm

Prerequisites

  • Python 3.7+
  • PyTorch 1.7.0+
  • tqdm

Data Preparation

To run ImageNet experiments, download and extract ImageNet train and val images from http://image-net.org/. The directory structure is the standard layout for the torchvision datasets.ImageFolder, and the training and validation data is expected to be in the train/ folder and val/ folder respectively as shown below. A useful script for automatic extraction can be found here.

/path/to/imagenet/
  train/
    class1/
      img1.jpeg
    class2/
      img2.jpeg
  val/
    class1/
      img3.jpeg
    class/2
      img4.jpeg

How to Run Experiments

CIFAR-10/100 Experiments

To apply PaB w/ PSG to a ResNet-18 network on CIFAR-10/100 datasets, use the following command:

python main.py --use-cuda 0 \
    --arch PsgResNet18 --init-method kaiming_normal \
    --optimizer BOP --ar 1e-3 --tau 1e-6 \
    --ar-decay-freq 45 --ar-decay-ratio 0.15 --epochs 180 \
    --pruner SynFlow --prune-epoch 0 \
    --prune-ratio 3e-1 --prune-iters 100 \
    --msb-bits 8 --msb-bits-weight 8 --msb-bits-grad 16 \
    --psg-threshold 1e-7 --psg-no-take-sign --psg-sparsify \
    --exp-name cifar10_resnet18_pab-psg

To break down the above complex command, PaB includes two stages (pruning and Bop training) and consists of three components (a pruner, a Bop optimizer and a PSG module).

[Pruning module] The pruning module is controlled by the following arguments:

  • --pruner - A string that indicates which pruning method to be used. Valid choices are ['Mag', 'SNIP', 'GraSP', 'SynFlow'].
  • --prune-epoch - An integer, the epoch index of when (the last) pruning is performed.
  • --prune-ratio - A float, the ratio of non-zero parameters remained after (the last) pruning
  • --prune-iters - An integeer, the number of pruning iterations in one run of pruning. Check the SynFlow paper for what this means.

[Bop optimizer] Bop has several hyperparameters that are essential to its successful optimizaiton as shown below. More details can be found in the original Bop paper.

  • --optimizer - A string that specifies the Bop optimizer. You can pass 'SGD' to this argument for a standard training of SGD. Check here.
  • --ar - A float, corresponding to the adativity rate for the calculation of gradient moving average.
  • --tau - A float, corresponding to the threshold that decides if a binary weight needs to be flipped.
  • --ar-decay-freq - An integer, interval in epochs between decays of the adaptivity ratio.
  • --ar-decay-ratio - A float, the decay ratio of the adaptivity ratio decaying.

[PSG module] PSG stands for Predictive Sign Gradient, which was originally proposed in the E2-Train paper. PSG uses low-precision computation during backward passes to save computational cost. It is controlled by several arguments.

  • --msb-bits, --msb-bits-weight, --msb-bits-grad - Three floats, the bit-width for the inputs, weights and output errors during back-propagation.
  • --psg-threshold - A float, the threshold that filters out coarse gradients with small magnitudes to reduce gradient variance.
  • --psg-no-take-sign - A boolean that indicates to bypass the "taking-the-sign" step in the original PSG method.
  • --psg-sparsify - A boolean. The filtered small gradients are set to zero when it is true.

ImageNet Experiments

For PaB experiments on ImageNet, we run the pruning and Bop training in a two-stage manner, implemented in main_imagenet_prune.py and main_imagenet_train.py, respectively.

To prune a ResNet-50 network at its initialization, we first run the following command to perform SynFlow, which follows a similar manner for the arguments as in CIFAR experiments:

export prune_ratio=0.5  # 50% remaining parameters.

# Run SynFlow pruning
python main_imagenet_prune.py \
    --arch resnet50 --init-method kaiming_normal \
    --pruner SynFlow --prune-epoch 0 \
    --prune-ratio $prune_ratio --prune-iters 100 \
    --pruned-save-name /path/to/the/pruning/output/file \
    --seed 0 --workers 32 /path/to/the/imagenet/dataset

We then train the pruned model using Bop with PSG on one node with multi-GPUs.

# Bop hyperparameters
export bop_ar=1e-3
export bop_tau=1e-6
export psg_threshold="-5e-7"

python main_imagenet_train.py \
    --arch psg_resnet50 --init-method kaiming_normal \
    --optimizer BOP --ar $bop_ar --tau $bop_tau \
    --ar-decay-freq 30 --ar-decay-ratio 0.15 --epochs 100 \
    --msb-bits 8 --msb-bits-weight 8 --msb-bits-grad 16 \
    --psg-sparsify --psg-threshold " ${psg_threshold}" --psg-no-take-sign \
    --savedir /path/to/the/output/dir \
    --resume /path/to/the/pruning/output/file \
    --exp-name 'imagenet_resnet50_pab-psg' \
    --dist-url 'tcp://127.0.0.1:2333' \
    --dist-backend 'nccl' --multiprocessing-distributed \
    --world-size 1 --rank 0 \
    --seed 0 --workers 32 /path/to/the/imagenet/dataset 

Acknowledgement

Thank you to Jason Zhang for helping with the development of the code repo, the research that we conducted with it and the consistent report after his movement to CMU. Thank you to Prof. Zhangyang Wang for the guidance and unreserved help with this project.

Cite this work

If you find this work or our code implementation helpful for your own resarch or work, please cite our paper.

@inproceedings{
chen2022peek,
title={Peek-a-Boo: What (More) is Disguised in a Randomly Weighted Neural Network, and How to Find It Efficiently},
author={Xiaohan Chen and Jason Zhang and Zhangyang Wang},
booktitle={International Conference on Learning Representations},
year={2022},
url={https://openreview.net/forum?id=moHCzz6D5H3},
}
Owner
VITA
Visual Informatics Group @ University of Texas at Austin
VITA
Pytorch implementation of the popular Improv RNN model originally proposed by the Magenta team.

Pytorch Implementation of Improv RNN Overview This code is a pytorch implementation of the popular Improv RNN model originally implemented by the Mage

Sebastian Murgul 3 Nov 11, 2022
Official repository for Fourier model that can generate periodic signals

Conditional Generation of Periodic Signals with Fourier-Based Decoder Jiyoung Lee, Wonjae Kim, Daehoon Gwak, Edward Choi This repository provides offi

8 May 25, 2022
Language-Driven Semantic Segmentation

Language-driven Semantic Segmentation (LSeg) The repo contains official PyTorch Implementation of paper Language-driven Semantic Segmentation. Authors

Intelligent Systems Lab Org 416 Jan 03, 2023
The 7th edition of NTIRE: New Trends in Image Restoration and Enhancement workshop will be held on June 2022 in conjunction with CVPR 2022.

NTIRE 2022 - Image Inpainting Challenge Important dates 2022.02.01: Release of train data (input and output images) and validation data (only input) 2

Andrés Romero 37 Nov 27, 2022
How to Train a GAN? Tips and tricks to make GANs work

(this list is no longer maintained, and I am not sure how relevant it is in 2020) How to Train a GAN? Tips and tricks to make GANs work While research

Soumith Chintala 10.8k Dec 31, 2022
Semantically Contrastive Learning for Low-light Image Enhancement

Semantically Contrastive Learning for Low-light Image Enhancement Here, we propose an effective semantically contrastive learning paradigm for Low-lig

48 Dec 16, 2022
Keras Realtime Multi-Person Pose Estimation - Keras version of Realtime Multi-Person Pose Estimation project

This repository has become incompatible with the latest and recommended version of Tensorflow 2.0 Instead of refactoring this code painfully, I create

M Faber 769 Dec 08, 2022
A Domain-Agnostic Benchmark for Self-Supervised Learning

DABS: A Domain Agnostic Benchmark for Self-Supervised Learning This repository contains the code for DABS, a benchmark for domain-agnostic self-superv

Alex Tamkin 81 Dec 09, 2022
Our VMAgent is a platform for exploiting Reinforcement Learning (RL) on Virtual Machine (VM) scheduling tasks.

VMAgent is a platform for exploiting Reinforcement Learning (RL) on Virtual Machine (VM) scheduling tasks. VMAgent is constructed based on one month r

56 Dec 12, 2022
Top #1 Submission code for the first https://alphamev.ai MEV competition with best AUC (0.9893) and MSE (0.0982).

alphamev-winning-submission Top #1 Submission code for the first alphamev MEV competition with best AUC (0.9893) and MSE (0.0982). The code won't run

70 Oct 29, 2022
Ἀνατομή is a PyTorch library to analyze representation of neural networks

Ἀνατομή is a PyTorch library to analyze representation of neural networks

Ryuichiro Hataya 50 Dec 05, 2022
DeeBERT: Dynamic Early Exiting for Accelerating BERT Inference

DeeBERT This is the code base for the paper DeeBERT: Dynamic Early Exiting for Accelerating BERT Inference. Code in this repository is also available

Castorini 132 Nov 14, 2022
scikit-learn: machine learning in Python

scikit-learn is a Python module for machine learning built on top of SciPy and is distributed under the 3-Clause BSD license. The project was started

scikit-learn 52.5k Jan 08, 2023
NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions (CVPR2021)

NExT-QA We reproduce some SOTA VideoQA methods to provide benchmark results for our NExT-QA dataset accepted to CVPR2021 (with 1 'Strong Accept' and 2

Junbin Xiao 50 Nov 24, 2022
Fast Axiomatic Attribution for Neural Networks (NeurIPS*2021)

Fast Axiomatic Attribution for Neural Networks This is the official repository accompanying the NeurIPS 2021 paper: R. Hesse, S. Schaub-Meyer, and S.

Visual Inference Lab @TU Darmstadt 11 Nov 21, 2022
Information Gain Filtration (IGF) is a method for filtering domain-specific data during language model finetuning. IGF shows significant improvements over baseline fine-tuning without data filtration.

Information Gain Filtration Information Gain Filtration (IGF) is a method for filtering domain-specific data during language model finetuning. IGF sho

4 Jul 28, 2022
Using modified BiSeNet for face parsing in PyTorch

face-parsing.PyTorch Contents Training Demo References Training Prepare training data: -- download CelebAMask-HQ dataset -- change file path in the pr

zll 1.6k Jan 08, 2023
Reimplementation of Learning Mesh-based Simulation With Graph Networks

Pytorch Implementation of Learning Mesh-based Simulation With Graph Networks This is the unofficial implementation of the approach described in the pa

Jingwei Xu 33 Dec 14, 2022
This repository contains the database and code used in the paper Embedding Arithmetic for Text-driven Image Transformation

This repository contains the database and code used in the paper Embedding Arithmetic for Text-driven Image Transformation (Guillaume Couairon, Holger

Meta Research 31 Oct 17, 2022
TensorFlow, PyTorch and Numpy layers for generating Orthogonal Polynomials

OrthNet TensorFlow, PyTorch and Numpy layers for generating multi-dimensional Orthogonal Polynomials 1. Installation 2. Usage 3. Polynomials 4. Base C

Chuan 29 May 25, 2022