Pytorch implementation of our paper under review -- 1xN Pattern for Pruning Convolutional Neural Networks

Related tags

Deep Learning1xN
Overview

1xN Pattern for Pruning Convolutional Neural Networks (paper) .

This is Pytorch re-implementation of "1xN Pattern for Pruning Convolutional Neural Networks". A more formal project will be released as soon as we are given the authority from Alibaba Group.

1) 1×N Block Pruning

Requirements

  • Python 3.7
  • Pytorch >= 1.0.1
  • CUDA = 10.0.0

Code Running

To reproduce our experiments, please use the following command:

python imagenet.py \
--gpus 0 \
--arch mobilenet_v1 (or mobilenet_v2 or mobilenet_v3_large or mobilenet_v3_small) \
--job_dir ./experiment/ \
--data_path [DATA_PATH] \
--pretrained_model [PRETRAIN_MODEL_PATH] \
--pr_target 0.5 \
--N 4 (or 2, 8, 16, 32) \
--conv_type BlockL1Conv \
--train_batch_size 256 \
--eval_batch_size 256 \
--rearrange \

Accuracy Performance

Table 1: Performance comparison of our 1×N block sparsity against weight pruning and filter pruning (p = 50%).

MobileNet-V1 Top-1 Acc. Top-5 Acc. Model Link
Weight Pruning 70.764 89.592 Pruned Model
Filter Pruning 65.348 86.264 Pruned Model
1 x 2 Block 70.281 89.370 Pruned Model
1 x 4 Block 70.052 89.056 Pruned Model
1 x 8 Block 69.908 89.027 Pruned Model
1 x 16 Block 69.559 88.933 Pruned Model
1 x 32 Block 69.541 88.801 Pruned Model
MobileNet-V2 Top-1 Acc. Top-5 Acc. Model Link
Weight Pruning 71.146 89.872 Pruned Model
Filter Pruning 66.730 87.190 Pruned Model
1 x 2 Block 70.233 89.417 Pruned Model
1 x 4 Block 60.706 89.165 Pruned Model
1 x 8 Block 69.372 88.862 Pruned Model
1 x 16 Block 69.352 88.708 Pruned Model
1 x 32 Block 68.762 88.425 Pruned Model
MobileNet-V3-small Top-1 Acc. Top-5 Acc. Model Link
Weight Pruning 66.376 86.868 Pruned Model
Filter Pruning 59.054 81.713 Pruned Model
1 x 2 Block 65.380 86.060 Pruned Model
1 x 4 Block 64.465 85.495 Pruned Model
1 x 8 Block 64.101 85.274 Pruned Model
1 x 16 Block 63.126 84.203 Pruned Model
1 x 32 Block 62.881 83.982 Pruned Model
MobileNet-V3-large Top-1 Acc. Top-5 Acc. Model Link
Weight Pruning 72.897 91.093 Pruned Model
Filter Pruning 69.137 89.097 Pruned Model
1 x 2 Block 72.120 90.677 Pruned Model
1 x 4 Block 71.935 90.458 Pruned Model
1 x 8 Block 71.478 90.163 Pruned Model
1 x 16 Block 71.112 90.129 Pruned Model
1 x 32 Block 70.769 89.696 Pruned Model

More links for pruned models under different pruning rates and their training logs can be found in MobileNet-V2 and ResNet-50.

Evaluate our models

To verify the performance of our pruned models, download our pruned models from the links provided above and run the following command:

python imagenet.py \
--gpus 0 \
--arch mobilenet_v1 (or mobilenet_v2 or mobilenet_v3_large or mobilenet_v3_small) \
--data_path [DATA_PATH] \
--conv_type DenseConv \
--evaluate [PRUNED_MODEL_PATH] \
--eval_batch_size 256 \

Arguments

optional arguments:
  -h, --help            show this help message and exit
  --gpus                Select gpu_id to use. default:[0]
  --data_path           The dictionary where the data is stored.
  --job_dir             The directory where the summaries will be stored.
  --resume              Load the model from the specified checkpoint.
  --pretrain_model      Path of the pre-trained model.
  --pruned_model        Path of the pruned model to evaluate.
  --arch                Architecture of model. For ImageNet :mobilenet_v1, mobilenet_v2, mobilenet_v3_small, mobilenet_v3_large
  --num_epochs          The num of epochs to train. default:180
  --train_batch_size    Batch size for training. default:256
  --eval_batch_size     Batch size for validation. default:100
  --momentum            Momentum for Momentum Optimizer. default:0.9
  --lr LR               Learning rate. default:1e-2
  --lr_decay_step       The iterval of learn rate decay for cifar. default:100 150
  --lr_decay_freq       The frequecy of learn rate decay for Imagenet. default:30
  --weight_decay        The weight decay of loss. default:4e-5
  --lr_type             lr scheduler. default: cos. optional:exp/cos/step/fixed
  --use_dali            If this parameter exists, use dali module to load ImageNet data (benefit in training acceleration).
  --conv_type           Importance criterion of filters. Default: BlockL1Conv. optional: BlockRandomConv, DenseConv
  --pr_target           Pruning rate. default:0.5
  --full                If this parameter exists, prune fully-connected layer.
  --N                   Consecutive N kernels for removal (see paper for details).
  --rearrange           If this parameter exists, filters will be rearranged (see paper for details).
  --export_onnx         If this parameter exists, export onnx model.

2)Filter Rearrangement

Table 2: Performance studies of our 1×N block sparsity with and without filter rearrangement (p=50%).

N = 2 Top-1 Acc. Top-5 Acc. Model Link
w/o Rearange 69.900 89.296 Pruned Model
Rearrange 70.233 89.417 Pruned Model
N = 4 Top-1 Acc. Top-5 Acc. Model Link
w/o Rearange 69.521 88.920 Pruned Model
Rearrange 69.579 88.944 Pruned Model
N = 8 Top-1 Acc. Top-5 Acc. Model Link
w/o Rearange 69.206 88.608 Pruned Model
Rearrange 69.372 88.862 Pruned Model
N = 16 Top-1 Acc. Top-5 Acc. Model Link
w/o Rearange 68.971 88.399 Pruned Model
Rearrange 69.352 88.708 Pruned Model
N = 32 Top-1 Acc. Top-5 Acc. Model Link
w/o Rearange 68.431 88.315 Pruned Model
Rearrange 68.762 88.425 Pruned Model

3)Encoding and Decoding Efficiency

Performance and latency comparison

Our sparse convolution implementation has been released to TVM community.

To verify the performance of our pruned models, convert onnx model and run the following command:

python model_tune.py \
--onnx_path [ONNX_MODEL_PATH] \
--bsr 4 \
--bsc 1 \
--sparsity 0.5

The detail tuning setting is referred to TVM.

4)Contact

Any problem regarding this code re-implementation, please contact the first author: [email protected] or the third author: [email protected].

Any problem regarding the sparse convolution implementation, please contact the second author: [email protected].

Owner
Mingbao Lin (林明宝)
I am currently a final-year Ph.D student.
Mingbao Lin (林明宝)
🥇Samsung AI Challenge 2021 1등 솔루션입니다🥇

MoT - Molecular Transformer Large-scale Pretraining for Molecular Property Prediction Samsung AI Challenge for Scientific Discovery This repository is

Jungwoo Park 44 Dec 03, 2022
JugLab 33 Dec 30, 2022
Make a surveillance camera from your raspberry pi!

rpi-surveillance Make a surveillance camera from your Raspberry Pi 4! The surveillance is built as following: the camera records 10 seconds video and

Vladyslav 62 Feb 03, 2022
A python script to convert images to animated sus among us crewmate twerk jifs as seen on r/196

img_sussifier A python script to convert images to animated sus among us crewmate twerk jifs as seen on r/196 Examples How to use install python pip i

41 Sep 30, 2022
[CVPR 2021] Region-aware Adaptive Instance Normalization for Image Harmonization

RainNet — Official Pytorch Implementation Region-aware Adaptive Instance Normalization for Image Harmonization Jun Ling, Han Xue, Li Song*, Rong Xie,

130 Dec 11, 2022
MapReader: A computer vision pipeline for the semantic exploration of maps at scale

MapReader A computer vision pipeline for the semantic exploration of maps at scale MapReader is an end-to-end computer vision (CV) pipeline designed b

Living with Machines 25 Dec 26, 2022
Retinal vessel segmentation based on GT-UNet

Retinal vessel segmentation based on GT-UNet Introduction This project is a retinal blood vessel segmentation code based on UNet-like Group Transforme

Kent0n 27 Dec 18, 2022
A ssl analyzer which could analyzer target domain's certificate.

ssl_analyzer A ssl analyzer which could analyzer target domain's certificate. Analyze the domain name ssl certificate information according to the inp

vincent 17 Dec 12, 2022
Time-stretch audio clips quickly with PyTorch (CUDA supported)! Additional utilities for searching efficient transformations are included.

Time-stretch audio clips quickly with PyTorch (CUDA supported)! Additional utilities for searching efficient transformations are included.

Kento Nishi 22 Jul 07, 2022
Gems & Holiday Package Prediction

Predictive_Modelling Gems & Holiday Package Prediction This project is based on 2 cases studies : Gems Price Prediction and Holiday Package prediction

Avnika Mehta 1 Jan 27, 2022
Much faster than SORT(Simple Online and Realtime Tracking), a little worse than SORT

QSORT QSORT(Quick + Simple Online and Realtime Tracking) is a simple online and realtime tracking algorithm for 2D multiple object tracking in video s

Yonghye Kwon 8 Jul 27, 2022
Malware Analysis Neural Network project.

MalanaNeuralNetwork Description Malware Analysis Neural Network project. Table of Contents Getting Started Requirements Installation Clone Set-Up VENV

2 Nov 13, 2021
Everything you need to know about NumPy( Creating Arrays, Indexing, Math,Statistics,Reshaping).

Everything you need to know about NumPy( Creating Arrays, Indexing, Math,Statistics,Reshaping).

1 Feb 14, 2022
SAAVN - Sound Adversarial Audio-Visual Navigation,ICLR2022 (In PyTorch)

SAAVN SAAVN Code release for paper "Sound Adversarial Audio-Visual Navigation,IC

YinfengYu 10 Aug 30, 2022
Code accompanying our paper Feature Learning in Infinite-Width Neural Networks

Empirical Experiments in "Feature Learning in Infinite-width Neural Networks" This repo contains code to replicate our experiments (Word2Vec, MAML) in

Edward Hu 37 Dec 14, 2022
Training neural models with structured signals.

Neural Structured Learning in TensorFlow Neural Structured Learning (NSL) is a new learning paradigm to train neural networks by leveraging structured

955 Jan 02, 2023
Automatically align face images 🙃→🙂. Can also do windowing and warping.

Automatic Face Alignment (AFA) Carl M. Gaspar & Oliver G.B. Garrod You have lots of photos of faces like this: But you want to line up all of the face

Carl Michael Gaspar 15 Dec 12, 2022
Raindrop strategy for Irregular time series

Graph-Guided Network For Irregularly Sampled Multivariate Time Series Overview This repository contains processed datasets and implementation code for

Zitnik Lab @ Harvard 74 Jan 03, 2023
Official code for "Distributed Deep Learning in Open Collaborations" (NeurIPS 2021)

Distributed Deep Learning in Open Collaborations This repository contains the code for the NeurIPS 2021 paper "Distributed Deep Learning in Open Colla

Yandex Research 96 Sep 15, 2022
Bare bones use-case for deploying a containerized web app (built in streamlit) on AWS.

Containerized Streamlit web app This repository is featured in a 3-part series on Deploying web apps with Streamlit, Docker, and AWS. Checkout the blo

Collin Prather 62 Jan 02, 2023