Peek-a-Boo: What (More) is Disguised in a Randomly Weighted Neural Network, and How to Find It Efficiently

Overview


This repository is the official implementation of the Analytic-LISTA networks proposed in the following paper:

"Peek-a-Boo: What (More) is Disguised in a Randomly Weighted Neural Network, and How to Find It Efficiently" by Xiaohan Chen, Jason Zhang and Zhangyang Wang from the VITA Research Group.

The code implements the Peek-a-Boo (PaB) algorithm for various convolutional networks and has been tested on Linux with Python 3.7.2 and PyTorch 1.7.0+.

Getting Started

Dependency

pip install tqdm

Prerequisites

  • Python 3.7+
  • PyTorch 1.7.0+
  • tqdm

Data Preparation

To run ImageNet experiments, download and extract the ImageNet train and val images from http://image-net.org/. The directory structure is the standard layout expected by torchvision's datasets.ImageFolder: the training and validation data are expected to be in the train/ and val/ folders respectively, as shown below. A useful script for automatic extraction can be found here.

/path/to/imagenet/
  train/
    class1/
      img1.jpeg
    class2/
      img2.jpeg
  val/
    class1/
      img3.jpeg
    class2/
      img4.jpeg
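
The layout above can be consumed directly by torchvision's datasets.ImageFolder. The snippet below is a minimal sketch for sanity-checking the directory before launching a full run; it uses the standard ImageNet normalization statistics and a batch size of 256, which are illustrative defaults rather than the exact settings of the training scripts in this repo.

import torch
from torchvision import datasets, transforms

# Standard ImageNet normalization statistics (illustrative, not repo-specific).
normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])

train_set = datasets.ImageFolder(
    "/path/to/imagenet/train",
    transforms.Compose([
        transforms.RandomResizedCrop(224),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        normalize,
    ]))

train_loader = torch.utils.data.DataLoader(
    train_set, batch_size=256, shuffle=True, num_workers=32, pin_memory=True)
print(f"Found {len(train_set)} training images in {len(train_set.classes)} classes.")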

How to Run Experiments

CIFAR-10/100 Experiments

To apply PaB w/ PSG to a ResNet-18 network on CIFAR-10/100 datasets, use the following command:

python main.py --use-cuda 0 \
    --arch PsgResNet18 --init-method kaiming_normal \
    --optimizer BOP --ar 1e-3 --tau 1e-6 \
    --ar-decay-freq 45 --ar-decay-ratio 0.15 --epochs 180 \
    --pruner SynFlow --prune-epoch 0 \
    --prune-ratio 3e-1 --prune-iters 100 \
    --msb-bits 8 --msb-bits-weight 8 --msb-bits-grad 16 \
    --psg-threshold 1e-7 --psg-no-take-sign --psg-sparsify \
    --exp-name cifar10_resnet18_pab-psg

Breaking down the command above: PaB consists of two stages (pruning and Bop training) and three components (a pruner, a Bop optimizer, and a PSG module).

[Pruning module] The pruning module is controlled by the following arguments:

  • --pruner - A string that indicates which pruning method to be used. Valid choices are ['Mag', 'SNIP', 'GraSP', 'SynFlow'].
  • --prune-epoch - An integer, the epoch index of when (the last) pruning is performed.
  • --prune-ratio - A float, the ratio of non-zero parameters remaining after (the last) pruning.
  • --prune-iters - An integer, the number of pruning iterations in one run of pruning. Check the SynFlow paper for what this means; a rough sketch of how these two arguments interact is given after this list.
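
For intuition, the sketch below illustrates one common way iterative pruning combines a target keep ratio with a number of iterations: the keep ratio is tightened exponentially over the iterations, as in the SynFlow paper. It is an illustration only, assuming a single score tensor; the pruners in this repo operate on the whole model and recompute scores between rounds.

import torch

def iterative_prune(scores, prune_ratio=0.3, prune_iters=100):
    """Return a binary mask that keeps roughly `prune_ratio` of the entries
    with the highest scores, reached via an exponential schedule."""
    mask = torch.ones_like(scores)
    for k in range(1, prune_iters + 1):
        keep = prune_ratio ** (k / prune_iters)        # exponential schedule
        n_keep = max(1, int(keep * scores.numel()))
        threshold = torch.topk(scores.flatten(), n_keep).values.min()
        mask = (scores >= threshold).float()
        # A real SynFlow run would recompute the saliency scores here with the
        # new mask applied; this sketch keeps them fixed for brevity.
    return mask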

[Bop optimizer] Bop has several hyperparameters that are essential to successful optimization, as shown below. More details can be found in the original Bop paper, and a minimal sketch of the update rule follows this list.

  • --optimizer - A string that specifies the Bop optimizer. You can pass 'SGD' to this argument for standard SGD training. Check here.
  • --ar - A float, the adaptivity rate used for the moving average of the gradients.
  • --tau - A float, the threshold that decides whether a binary weight needs to be flipped.
  • --ar-decay-freq - An integer, the interval in epochs between decays of the adaptivity rate.
  • --ar-decay-ratio - A float, the multiplicative factor applied to the adaptivity rate at each decay.
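
The core Bop update can be sketched as follows. This is an illustration of the flipping rule from the Bop paper (Helwegen et al., 2019), showing where --ar and --tau enter; it is not the exact BOP optimizer code in this repo.

import torch

def bop_step(weights, grads, m, ar=1e-3, tau=1e-6):
    """weights: binary tensor with values in {-1, +1};
    m: per-weight exponential moving average of the gradients."""
    # Update the moving average, controlled by the adaptivity rate `ar`.
    m.mul_(1 - ar).add_(grads, alpha=ar)
    # Flip a weight when the accumulated gradient signal is strong enough
    # (|m| > tau) and points against keeping the current sign.
    flip = (m.abs() > tau) & (torch.sign(m) == torch.sign(weights))
    weights[flip] *= -1
    return weights, m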

[PSG module] PSG stands for Predictive Sign Gradient, which was originally proposed in the E2-Train paper. PSG uses low-precision computation during backward passes to save computational cost. It is controlled by the following arguments; an illustrative sketch follows the list.

  • --msb-bits, --msb-bits-weight, --msb-bits-grad - Three integers, the bit-widths for the inputs, weights and output errors during back-propagation.
  • --psg-threshold - A float, the threshold that filters out coarse gradients with small magnitudes to reduce gradient variance.
  • --psg-no-take-sign - A boolean flag that bypasses the "taking-the-sign" step in the original PSG method.
  • --psg-sparsify - A boolean flag. When it is set, the filtered small gradients are set to zero.
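
As a rough illustration of what the last three flags do to a low-precision ("coarse") gradient, consider the sketch below. It is a simplification under assumed semantics, not the repo's actual PSG kernels.

import torch

def psg_filter(coarse_grad, threshold=1e-7, sparsify=True, take_sign=False):
    """coarse_grad: gradient estimated from the low-precision (MSB) backward pass."""
    out = coarse_grad.clone()
    small = out.abs() < threshold                        # --psg-threshold
    if sparsify:
        # --psg-sparsify: zero out the small, noisy entries.
        out = torch.where(small, torch.zeros_like(out), out)
    if take_sign:
        # The original PSG keeps only the sign of the surviving gradients;
        # --psg-no-take-sign skips this step.
        out = torch.sign(out)
    return out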

ImageNet Experiments

For PaB experiments on ImageNet, we run the pruning and Bop training in a two-stage manner, implemented in main_imagenet_prune.py and main_imagenet_train.py, respectively.

To prune a ResNet-50 network at initialization, we first run the following command to perform SynFlow pruning; the arguments follow the same conventions as in the CIFAR experiments:

export prune_ratio=0.5  # 50% remaining parameters.

# Run SynFlow pruning
python main_imagenet_prune.py \
    --arch resnet50 --init-method kaiming_normal \
    --pruner SynFlow --prune-epoch 0 \
    --prune-ratio $prune_ratio --prune-iters 100 \
    --pruned-save-name /path/to/the/pruning/output/file \
    --seed 0 --workers 32 /path/to/the/imagenet/dataset

We then train the pruned model using Bop with PSG on a single node with multiple GPUs.

# Bop hyperparameters
export bop_ar=1e-3
export bop_tau=1e-6
export psg_threshold="-5e-7"

python main_imagenet_train.py \
    --arch psg_resnet50 --init-method kaiming_normal \
    --optimizer BOP --ar $bop_ar --tau $bop_tau \
    --ar-decay-freq 30 --ar-decay-ratio 0.15 --epochs 100 \
    --msb-bits 8 --msb-bits-weight 8 --msb-bits-grad 16 \
    --psg-sparsify --psg-threshold " ${psg_threshold}" --psg-no-take-sign \
    --savedir /path/to/the/output/dir \
    --resume /path/to/the/pruning/output/file \
    --exp-name 'imagenet_resnet50_pab-psg' \
    --dist-url 'tcp://127.0.0.1:2333' \
    --dist-backend 'nccl' --multiprocessing-distributed \
    --world-size 1 --rank 0 \
    --seed 0 --workers 32 /path/to/the/imagenet/dataset 
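
Note the leading space inside the quotes in --psg-threshold " ${psg_threshold}": the ImageNet threshold is negative, and the space is presumably there to keep the argument parser from treating -5e-7 as a command-line option.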

Acknowledgement

Thanks to Jason Zhang for helping develop the code repo, for the research we conducted with it, and for his continued contributions after moving to CMU. Thanks to Prof. Zhangyang Wang for his guidance and unreserved help with this project.

Cite this work

If you find this work or our code implementation helpful for your own research, please cite our paper.

@inproceedings{
chen2022peek,
title={Peek-a-Boo: What (More) is Disguised in a Randomly Weighted Neural Network, and How to Find It Efficiently},
author={Xiaohan Chen and Jason Zhang and Zhangyang Wang},
booktitle={International Conference on Learning Representations},
year={2022},
url={https://openreview.net/forum?id=moHCzz6D5H3},
}