Joint Channel and Weight Pruning for Model Acceleration on Mobile Devices

Related tags

Deep LearningJCW
Overview

Joint Channel and Weight Pruning for Model Acceleration on Mobile Devices

motivation

Abstract

For practical deep neural network design on mobile devices, it is essential to consider the constraints incurred by the computational resources and the inference latency in various applications. Among deep network acceleration related approaches, pruning is a widely adopted practice to balance the computational resource consumption and the accuracy, where unimportant connections can be removed either channel-wisely or randomly with a minimal impact on model accuracy. The channel pruning instantly results in a significant latency reduction, while the random weight pruning is more flexible to balance the latency and accuracy. In this paper, we present a unified framework with Joint Channel pruning and Weight pruning (JCW), and achieves a better Pareto-frontier between the latency and accuracy than previous model compression approaches. To fully optimize the trade-off between the latency and accuracy, we develop a tailored multi-objective evolutionary algorithm in the JCW framework, which enables one single search to obtain the optimal candidate architectures for various deployment requirements. Extensive experiments demonstrate that the JCW achieves a better trade-off between the latency and accuracy against various state-of-the-art pruning methods on the ImageNet classification dataset.

Framework

framework

Evaluation

Resnet18

Method Latency/ms Accuracy
Uniform 1x 537 69.8
DMCP 341 69.7
APS 363 70.3
JCW 160 69.2
194 69.7
196 69.9
224 70.2

MobileNetV1

Method Latency/ms Accuracy
Uniform 1x 167 70.9
Uniform 0.75x 102 68.4
Uniform 0.5x 53 64.4
AMC 94 70.7
Fast 61 68.4
AutoSlim 99 71.5
AutoSlim 55 67.9
USNet 102 69.5
USNet 53 64.2
JCW 31 69.1
39 69.9
43 69.8
54 70.3
69 71.4

MobileNetV2

Method Latency/ms Accuracy
Uniform 1x 114 71.8
Uniform 0.75x 71 69.8
Uniform 0.5x 41 65.4
APS 110 72.8
APS 64 69.0
DMCP 83 72.4
DMCP 45 67.0
DMCP 43 66.1
Fast 89 72.0
Fast 62 70.2
JCW 30 69.1
40 69.9
44 70.8
59 72.2

Requirements

  • torch
  • torchvision
  • numpy
  • scipy

Usage

The JCW works in a two-step fashion. i.e. the search step and the training step. The search step seaches for the layer-wise channel numbers and weight sparsity for Pareto-optimal models. The training steps trains the searched models with ADMM. We give a simple example for resnet18.

The search step

  1. Modify the configuration file

    First, open the file experiments/res18-search.yaml:

    vim experiments/res18-search.yaml

    Go to the 44th line and find the following codes:

    DATASET:
      data: ImageNet
      root: /path/to/imagenet
      ...
    

    and modify the root property of DATASET to the path of ImageNet dataset on your machine.

  2. Apply the search

    After modifying the configuration file, you can simply start the search by:

    python emo_search.py --config experiments/res18-search.yaml | tee experiments/res18-search.log

    After searching, the search results will be saved in experiments/search.pth

The training step

After searching, we can train the searched models by:

  1. Modify the base configuration file

    Open the file experiments/res18-train.yaml:

    vim experiments/res18-train.yaml

    Go to the 5th line, find the following codes:

    root: &root /path/to/imagenet
    

    and modify the root property to the path of ImageNet dataset on your machine.

  2. Generate configuration files for training

    After modifying the base configuration file, we are ready to generate the configuration files for training. To do that, simply run the following command:

    python scripts/generate_training_configs.py --base-config experiments/res18-train.yaml --search-result experiments/search.pth --output ./train-configs 

    After running the above command, the training configuration files will be written into ./train-configs/model-{id}/train.yaml.

  3. Apply the training

    After generating the configuration files, simply run the following command to train one certain model:

    python train.py --config xxxx/xxx/train.yaml | tee xxx/xxx/train.log
SMD-Nets: Stereo Mixture Density Networks

SMD-Nets: Stereo Mixture Density Networks This repository contains a Pytorch implementation of "SMD-Nets: Stereo Mixture Density Networks" (CVPR 2021)

Fabio Tosi 115 Dec 26, 2022
Authors implementation of LieTransformer: Equivariant Self-Attention for Lie Groups

LieTransformer This repository contains the implementation of the LieTransformer used for experiments in the paper LieTransformer: Equivariant self-at

35 Oct 18, 2022
An end-to-end machine learning web app to predict rugby scores (Pandas, SQLite, Keras, Flask, Docker)

Rugby score prediction An end-to-end machine learning web app to predict rugby scores Overview An demo project to provide a high-level overview of the

34 May 24, 2022
Official respository for "Modeling Defocus-Disparity in Dual-Pixel Sensors", ICCP 2020

Official respository for "Modeling Defocus-Disparity in Dual-Pixel Sensors", ICCP 2020 BibTeX @INPROCEEDINGS{punnappurath2020modeling, author={Abhi

Abhijith Punnappurath 22 Oct 01, 2022
Weakly Supervised Segmentation by Tensorflow.

Weakly Supervised Segmentation by Tensorflow. Implements semantic segmentation in Simple Does It: Weakly Supervised Instance and Semantic Segmentation, by Khoreva et al. (CVPR 2017).

CHENG-YOU LU 52 Dec 27, 2022
Fuzzification helps developers protect the released, binary-only software from attackers who are capable of applying state-of-the-art fuzzing techniques

About Fuzzification Fuzzification helps developers protect the released, binary-only software from attackers who are capable of applying state-of-the-

gts3.org (<a href=[email protected])"> 55 Oct 25, 2022
A curated list of awesome papers for Semantic Retrieval (TOIS Accepted: Semantic Models for the First-stage Retrieval: A Comprehensive Review).

A curated list of awesome papers for Semantic Retrieval (TOIS Accepted: Semantic Models for the First-stage Retrieval: A Comprehensive Review).

Yinqiong Cai 189 Dec 28, 2022
🍅🍅🍅YOLOv5-Lite: lighter, faster and easier to deploy. Evolved from yolov5 and the size of model is only 1.7M (int8) and 3.3M (fp16). It can reach 10+ FPS on the Raspberry Pi 4B when the input size is 320×320~

YOLOv5-Lite:lighter, faster and easier to deploy Perform a series of ablation experiments on yolov5 to make it lighter (smaller Flops, lower memory, a

pogg 1.5k Jan 05, 2023
learned_optimization: Training and evaluating learned optimizers in JAX

learned_optimization: Training and evaluating learned optimizers in JAX learned_optimization is a research codebase for training learned optimizers. I

Google 533 Dec 30, 2022
FTIR-Deep Learning - FTIR Deep Learning With Python

CANDIY-spectrum Human analyis of chemical spectra such as Mass Spectra (MS), Inf

Wei Mei 1 Jan 03, 2022
Facebook AI Image Similarity Challenge: Descriptor Track

Facebook AI Image Similarity Challenge: Descriptor Track This repository contains the code for our solution to the Facebook AI Image Similarity Challe

Sergio MP 17 Dec 14, 2022
Code for CoMatch: Semi-supervised Learning with Contrastive Graph Regularization

CoMatch: Semi-supervised Learning with Contrastive Graph Regularization (Salesforce Research) This is a PyTorch implementation of the CoMatch paper [B

Salesforce 107 Dec 14, 2022
Object tracking and object detection is applied to track golf puts in real time and display stats/games.

Putting_Game Object tracking and object detection is applied to track golf puts in real time and display stats/games. Works best with the Perfect Prac

Max 1 Dec 29, 2021
Blind visual quality assessment on 360° Video based on progressive learning

Blind visual quality assessment on omnidirectional or 360 video (ProVQA) Blind VQA for 360° Video via Progressively Learning from Pixels, Frames and V

5 Jan 06, 2023
IDA file loader for UF2, created for the DEFCON 29 hardware badge

UF2 Loader for IDA The DEFCON 29 badge uses the UF2 bootloader, which conveniently allows you to dump and flash the firmware over USB as a mass storag

Kevin Colley 6 Feb 08, 2022
Python library for analysis of time series data including dimensionality reduction, clustering, and Markov model estimation

deeptime Releases: Installation via conda recommended. conda install -c conda-forge deeptime pip install deeptime Documentation: deeptime-ml.github.io

495 Dec 28, 2022
π-GAN: Periodic Implicit Generative Adversarial Networks for 3D-Aware Image Synthesis

π-GAN: Periodic Implicit Generative Adversarial Networks for 3D-Aware Image Synthesis Project Page | Paper | Data Eric Ryan Chan*, Marco Monteiro*, Pe

375 Dec 31, 2022
Low-dose Digital Mammography with Deep Learning

Impact of loss functions on the performance of a deep neural network designed to restore low-dose digital mammography ====== This repository contains

WANG-AXIS 6 Dec 13, 2022
COVINS -- A Framework for Collaborative Visual-Inertial SLAM and Multi-Agent 3D Mapping

COVINS -- A Framework for Collaborative Visual-Inertial SLAM and Multi-Agent 3D Mapping Version 1.0 COVINS is an accurate, scalable, and versatile vis

ETHZ V4RL 183 Dec 27, 2022
AAAI 2022: Stationary diffusion state neural estimation

Stationary Diffusion State Neural Estimation Although many graph-based clustering methods attempt to model the stationary diffusion state in their obj

绽琨 33 Nov 24, 2022