Dynamic Slimmable Network (CVPR 2021, Oral)

Overview

Dynamic Slimmable Network (DS-Net)

This repository contains the PyTorch code for our paper, Dynamic Slimmable Network (CVPR 2021 Oral).

Architecture of DS-Net. The width of each supernet stage is adjusted adaptively by the slimming ratio ρ predicted by the gate.
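As a rough illustration of what "adjusting the width by ρ" can mean, the sketch below keeps only the first block of output channels of a layer. It is a toy under stated assumptions, not the repository's implementation: `slice_width` is a hypothetical helper, the weight is a plain list of lists rather than a tensor, ρ is taken relative to the widest stored width, and the rounding rule is a guess.

```python
def slice_width(weight, rho):
    """Keep the first round(rho * C_out) output channels of a [C_out][C_in] weight.

    Hypothetical helper: rho is relative to the full stored width, and the
    result is clamped to the range [1, C_out].
    """
    full = len(weight)
    keep = min(full, max(1, round(rho * full)))
    return weight[:keep]


# Toy layer with 8 output channels and 3 input channels.
full_weight = [[float(o * 3 + i) for i in range(3)] for o in range(8)]
half = slice_width(full_weight, 0.5)
print(len(half), "of", len(full_weight), "output channels active")
```

Because the kept channels form one contiguous prefix, a narrower sub-network can reuse the same stored weights without copying or re-indexing them.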

Accuracy vs. complexity on ImageNet.

Usage

1. Requirements

2. Stage I: Supernet Training

For example, to train the dynamic slimmable MobileNet supernet with 8 GPUs (takes about 2 days):

python -m torch.distributed.launch --nproc_per_node=8 train.py /PATH/TO/ImageNet -c ./configs/mobilenetv1_bn_uniform.yml

3. Stage II: Gate Training

  • Will be available soon

Citation

If you use our code for your paper, please cite:

@inproceedings{li2021dynamic,
  author = {Changlin Li and
            Guangrun Wang and
            Bing Wang and
            Xiaodan Liang and
            Zhihui Li and
            Xiaojun Chang},
  title = {Dynamic Slimmable Network},
  booktitle = {CVPR},
  year = {2021}
}
Comments
  • The usage of Gumbel softmax in DS-Net

    Thank you for your very nice work. I would like to understand the effect of Gumbel softmax, because I think the network can be trained without it. Is Gumbel softmax only meant to increase the randomness of the channel choice?

    discussion 
    opened by LinyeLi60 7
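For readers unfamiliar with the trick this question refers to, here is a minimal, generic sketch of Gumbel-softmax sampling in plain Python. It does not reproduce how DS-Net uses it; `gumbel_softmax` and the four gate scores are illustrative assumptions.

```python
import math
import random


def gumbel_softmax(logits, tau=1.0, rng=random):
    """Sample a relaxed one-hot vector from `logits` with the Gumbel-softmax trick."""
    noisy = []
    for logit in logits:
        # Gumbel(0, 1) noise: g = -log(-log(u)) with u ~ Uniform(0, 1).
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)  # keep both logs finite
        noisy.append((logit - math.log(-math.log(u))) / tau)
    # Numerically stable softmax over the perturbed, temperature-scaled logits.
    peak = max(noisy)
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


rng = random.Random(0)
gate_logits = [2.0, 1.0, 0.5, 0.1]  # hypothetical scores for 4 width choices
soft = gumbel_softmax(gate_logits, tau=1.0, rng=rng)

# Count which choice wins over many samples: the noise makes the selection
# stochastic, but higher-scoring choices still win more often.
counts = [0] * len(gate_logits)
for _ in range(2000):
    sample = gumbel_softmax(gate_logits, tau=1.0, rng=rng)
    counts[sample.index(max(sample))] += 1
print(counts)
```

The noise does add randomness to the choice, and the softmax relaxation keeps the sample differentiable with respect to the logits, which a hard argmax is not.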
  • UserWarning: Argument interpolation should be of type InterpolationMode instead of int. Please, use InterpolationMode enum.

    Why do I get this warning:

    /home/chauncey/.local/lib/python3.8/site-packages/torchvision/transforms/functional.py:364: UserWarning: Argument interpolation should be of type InterpolationMode instead of int. Please, use InterpolationMode enum.
      warnings.warn(

    when I run python3 -m torch.distributed.launch --nproc_per_node=1 train.py ./imagenet -c ./configs/mobilenetv1_bn_uniform.yml?

    opened by Chauncey-Wang 3
  • Question about calculating MAdds of dynamic network in the paper

    Thank you for your great work. I have a question about how MAdds are calculated in your paper. The dynamic network has a different width, and therefore different MAdds, for each instance, but you report a single MAdds value for each network. Is it the average MAdds over the whole dataset?

    discussion 
    opened by sseung0703 3
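If the reported figure is a dataset average, as the question suggests, it could be computed along the following lines. This is a sketch under assumptions: the per-subnetwork MAdds are copied from the v0.0.1 release table on this page, `average_madds` is a hypothetical helper, the routing decisions are made up for illustration, and any cost of the gate itself is ignored.

```python
# MAdds (in millions) of subnetworks 0..13, copied from the v0.0.1 release table.
SUBNET_MADDS_M = [133, 153, 175, 200, 226, 255, 286, 319, 355, 393, 433, 475, 519, 565]


def average_madds(choices):
    """Mean MAdds (in millions) over per-image subnetwork indices."""
    if not choices:
        raise ValueError("need at least one routed image")
    return sum(SUBNET_MADDS_M[c] for c in choices) / len(choices)


# Hypothetical routing decisions for six images (made up, not measured).
routed = [0, 0, 3, 7, 13, 13]
print(f"{average_madds(routed):.1f}M MAdds on average")
```

With these made-up decisions the average comes to about 319.2M, even though no single image was processed at exactly that cost.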
  • why not set ensemble_ib to True?

    Hi,

    I found in the configs that ensemble_ib is set to False for both slim training and gate training, but according to the paper, setting it to True would boost performance.

    Any idea?

    opened by twmht 2
  • MAdds of Pretrained Supernet

    Hi Changlin, your work is excellent. I have a question about the calculation of MAdds. In README.md the MAdds of Subnetwork 13 is 565M, but I think it should be 821M, as observed in my experiments, because the channel number of Subnetwork 13 is larger than in the original MobileNetV1, and the MAdds of the original MobileNetV1 1.0 should be 565M. Looking forward to your reply.

    opened by LinyeLi60 2
  • Error after changing num_choice in mobilenetv1_bn_uniform_reset_bn.yml

    I followed your suggestion to set num_choice in mobilenetv1_bn_uniform_reset_bn.yml to 14, but I get an error when I run python -m torch.distributed.launch --nproc_per_node=8 train.py /PATH/TO/ImageNet -c ./configs/mobilenetv1_bn_uniform_reset_bn.yml.

    08/25 10:15:57 AM Recalibrating BatchNorm statistics...
    08/25 10:16:10 AM Finish recalibrating BatchNorm statistics.
    08/25 10:16:19 AM Finish recalibrating BatchNorm statistics.
    08/25 10:16:21 AM Test: [ 0/0] Mode: 0 Time: 0.344 (0.344) Loss: 6.9204 (6.9204) [email protected]: 0.0000 ( 0.0000) [email protected]: 0.0000 ( 0.0000) Flops: 132890408 (132890408)
    08/25 10:16:22 AM Test: [ 0/0] Mode: 1 Time: 0.406 (0.406) Loss: 6.9189 (6.9189) [email protected]: 0.0000 ( 0.0000) [email protected]: 0.0000 ( 0.0000) Flops: 152917440 (152917440)
    08/25 10:16:22 AM Test: [ 0/0] Mode: 2 Time: 0.381 (0.381) Loss: 6.9187 (6.9187) [email protected]: 0.0000 ( 0.0000) [email protected]: 0.0000 ( 0.0000) Flops: 175152224 (175152224)
    08/25 10:16:23 AM Test: [ 0/0] Mode: 3 Time: 0.389 (0.389) Loss: 6.9134 (6.9134) [email protected]: 0.0000 ( 0.0000) [email protected]: 0.0000 ( 0.0000) Flops: 199594752 (199594752)
    Traceback (most recent call last):
      File "train.py", line 658, in <module>
        main()
      File "train.py", line 635, in main
        eval_metrics.append(validate_slim(model,
      File "/home/chauncey/PycharmProjects/DS-Net-main/dyn_slim/apis/train_slim.py", line 215, in validate_slim
        output = model(input)
      File "/home/chauncey/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/home/chauncey/PycharmProjects/DS-Net-main/dyn_slim/models/dyn_slim_net.py", line 191, in forward
        x = self.forward_features(x)
      File "/home/chauncey/PycharmProjects/DS-Net-main/dyn_slim/models/dyn_slim_net.py", line 178, in forward_features
        x = stage(x)
      File "/home/chauncey/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/home/chauncey/PycharmProjects/DS-Net-main/dyn_slim/models/dyn_slim_stages.py", line 48, in forward
        x = self.first_block(x)
      File "/home/chauncey/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/home/chauncey/PycharmProjects/DS-Net-main/dyn_slim/models/dyn_slim_blocks.py", line 240, in forward
        x = self.conv_pw(x)
      File "/home/chauncey/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/home/chauncey/PycharmProjects/DS-Net-main/dyn_slim/models/dyn_slim_ops.py", line 94, in forward
        self.running_outc = self.out_channels_list[self.channel_choice]
    IndexError: list index out of range

    It looks like some adjustments are needed in other .py files.

    opened by chaunceywx 2
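The traceback ends in `self.out_channels_list[self.channel_choice]`, and the log shows modes 0 to 3 succeeding before the crash. That is consistent with a layer holding fewer width entries than the configured num_choice, although the source does not confirm the cause. The toy class below (`SlimmableWidth` is a hypothetical stand-in, not the repository's class) reproduces that kind of mismatch and shows a guard that fails earlier with a clearer message.

```python
class SlimmableWidth:
    """Toy stand-in for a layer that picks its width from a fixed list."""

    def __init__(self, out_channels_list):
        self.out_channels_list = list(out_channels_list)
        self.channel_choice = 0

    def set_choice(self, choice):
        # Fail early with a readable message instead of a bare IndexError later.
        if not 0 <= choice < len(self.out_channels_list):
            raise ValueError(
                f"channel_choice {choice} is out of range for "
                f"{len(self.out_channels_list)} width options"
            )
        self.channel_choice = choice

    def running_out_channels(self):
        return self.out_channels_list[self.channel_choice]


layer = SlimmableWidth([32, 48, 64, 80])  # hypothetical 4-entry width list
layer.set_choice(3)
print(layer.running_out_channels())

try:
    layer.set_choice(4)  # a fifth mode does not exist for this layer
except ValueError as err:
    print("rejected:", err)
```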
  • Why is num_choice different across the yml files?

    Why do you set num_choice to 4 in mobilenetv1_bn_uniform_reset_bn.yml, but to 14 in the other two yml files?

    If you are Chinese too, let's communicate in Chinese; my English is rather poor...

    opened by chaunceywx 2
  • Runtime problem

    Could you tell me what causes the problem below?

    Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.

    /root/anaconda3/envs/0108/lib/python3.6/site-packages/torchvision/io/image.py:11: UserWarning: Failed to load image Python extension: /root/anaconda3/envs/0108/lib/python3.6/site-packages/torchvision/image.so: undefined symbol: _ZNK3c106IValue23reportToTensorTypeErrorEv
      warn(f"Failed to load image Python extension: {e}")
    /root/anaconda3/envs/0108/lib/python3.6/site-packages/torchvision/io/image.py:11: UserWarning: Failed to load image Python extension: /root/anaconda3/envs/0108/lib/python3.6/site-packages/torchvision/image.so: undefined symbol: _ZNK3c106IValue23reportToTensorTypeErrorEv
      warn(f"Failed to load image Python extension: {e}")
    01/21 05:42:18 AM Added key: store_based_barrier_key:1 to store for rank: 1
    01/21 05:42:18 AM Added key: store_based_barrier_key:1 to store for rank: 0
    01/21 05:42:18 AM Training in distributed mode with multiple processes, 1 GPU per process. Process 0, total 2.
    01/21 05:42:18 AM Training in distributed mode with multiple processes, 1 GPU per process. Process 1, total 2.
    01/21 05:42:20 AM Model slimmable_mbnet_v1_bn_uniform created, param count: 7676204
    01/21 05:42:20 AM Data processing configuration for current model + dataset:
    01/21 05:42:20 AM input_size: (3, 224, 224)
    01/21 05:42:20 AM interpolation: bicubic
    01/21 05:42:20 AM mean: (0.485, 0.456, 0.406)
    01/21 05:42:20 AM std: (0.229, 0.224, 0.225)
    01/21 05:42:20 AM crop_pct: 0.875
    01/21 05:42:20 AM NVIDIA APEX not installed. AMP off.
    01/21 05:42:21 AM Using torch DistributedDataParallel. Install NVIDIA Apex for Apex DDP.
    01/21 05:42:21 AM Scheduled epochs: 40
    01/21 05:42:21 AM Training folder does not exist at: images/train
    01/21 05:42:21 AM Training folder does not exist at: images/train
    Killing subprocess 239
    Killing subprocess 240
    Traceback (most recent call last):
      File "/root/anaconda3/envs/0108/lib/python3.6/runpy.py", line 193, in _run_module_as_main
        "__main__", mod_spec)
      File "/root/anaconda3/envs/0108/lib/python3.6/runpy.py", line 85, in _run_code
        exec(code, run_globals)
      File "/root/anaconda3/envs/0108/lib/python3.6/site-packages/torch/distributed/launch.py", line 340, in <module>
        main()
      File "/root/anaconda3/envs/0108/lib/python3.6/site-packages/torch/distributed/launch.py", line 326, in main
        sigkill_handler(signal.SIGTERM, None)  # not coming back
      File "/root/anaconda3/envs/0108/lib/python3.6/site-packages/torch/distributed/launch.py", line 301, in sigkill_handler
        raise subprocess.CalledProcessError(returncode=last_return_code, cmd=cmd)
    subprocess.CalledProcessError: Command '['/root/anaconda3/envs/0108/bin/python', '-u', 'train.py', '--local_rank=1', 'images', '-c', './configs/mobilenetv1_bn_uniform_reset_bn.yml']' returned non-zero exit status 1.

    opened by 6imust 1
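The line "Training folder does not exist at: images/train" in the log above suggests the script expects a `train` subfolder under the dataset path given on the command line. A small pre-flight check along these lines can catch that before launching the distributed job. `missing_split_dirs` is a hypothetical helper, and only the `train` folder name is taken from the log; any other split name is an assumption.

```python
from pathlib import Path


def missing_split_dirs(data_root, splits=("train",)):
    """Return the expected split folders that are absent under `data_root`."""
    root = Path(data_root)
    return [str(root / name) for name in splits if not (root / name).is_dir()]


# "images" is the dataset path used in the failing command above.
missing = missing_split_dirs("images")
if missing:
    print("Missing dataset folders:", ", ".join(missing))
else:
    print("Dataset layout looks fine.")
```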
  • project environment

    Hi, could you provide the environment for the project? I tried to train the network with python=3.8, pytorch=1.7.1, cuda=10.2. Shortly after training started, a RuntimeError: CUDA error: device-side assert triggered occurred, and some other environments also lead to this error. I'm not sure whether the problem is caused by a difference in environment.

    opened by singularity97 1
  • Softmax twice for SGS loss?

    Dear authors, thanks for this nice work.

    I wonder why the calculation of the SGS loss is using the softmaxed data rather than the logits, considering the PyTorch CrossEntropyLoss already contains a softmax inside.

    https://github.com/changlin31/DS-Net/blob/15cd3036970ec27d2c306014344fd50d9e9b888b/dyn_slim/apis/train_slim_gate.py#L98
    https://github.com/changlin31/DS-Net/blob/15cd3036970ec27d2c306014344fd50d9e9b888b/dyn_slim/models/dyn_slim_blocks.py#L324-L355

    opened by Yu-Zhewen 0
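The concern in this question can be checked numerically. The sketch below is plain Python rather than the repository's code: `cross_entropy` mimics, for a single sample, the log-softmax plus negative log-likelihood that PyTorch's CrossEntropyLoss applies, and the logits are made up. Feeding it probabilities that have already been softmaxed applies softmax a second time.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    peak = max(xs)
    exps = [math.exp(x - peak) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Single-sample cross entropy: -log(softmax(logits)[target])."""
    return -math.log(softmax(logits)[target])


logits = [4.0, 1.0, 0.0, -1.0]  # made-up, confident prediction for class 0
target = 0

loss_from_logits = cross_entropy(logits, target)
loss_from_probs = cross_entropy(softmax(logits), target)  # softmax applied twice

print(f"logits in:        {loss_from_logits:.4f}")
print(f"probabilities in: {loss_from_probs:.4f}")
```

Since probabilities lie in [0, 1], the second softmax flattens them, so with K classes the loss cannot drop below log((e + K - 1) / e) however confident the prediction is. Whether that is intended in the SGS loss is a question for the authors.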
  • Can we further improve AutoSlim without gate?

    It is not easy to deploy the gate operator with some other backends, like TensorRT.

    So my question is: can we further improve AutoSlim without the dynamic gate at inference? Is there any ongoing work on this?

    opened by twmht 3
  • DS-Net for object detection

    Hello. Thanks for your work. I noticed that you also conducted some experiments in object detection. I wonder whether or when you will release the code.

    opened by NoLookDefense 8
  • Dynamic path for DS-mobilenet

    Hi. Thanks for your work. I am reading your paper and trying to reimplement it, and I am confused about some details. You mentioned in your paper that the slimming ratio ρ∈[0.35 : 0.05 : 1.25], which has 18 paths. However, in your code there are only 14 paths, ρ∈[0.35 : 0.05 : 1], as mentioned in https://github.com/changlin31/DS-Net/blob/15cd3036970ec27d2c306014344fd50d9e9b888b/dyn_slim/models/dyn_slim_net.py#L36 . Also, when conducting gate training, the gate function only has a 4-dimensional output, meaning that there are only 4 paths and the slimming ratio is restricted to ρ∈[0.35 : 0.05 : 0.5]. https://github.com/changlin31/DS-Net/blob/15cd3036970ec27d2c306014344fd50d9e9b888b/dyn_slim/models/dyn_slim_blocks.py#L204 Why are the dynamic paths for larger networks not used?

    opened by NoLookDefense 1
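The path counts in this question can be checked by enumerating each range. The sketch assumes the notation [start : step : stop] includes both endpoints; `slimming_ratios` is a hypothetical helper that works in whole percent to avoid floating-point drift.

```python
def slimming_ratios(start_pct, stop_pct, step_pct=5):
    """Ratios from start to stop (both inclusive), given in percent."""
    return [p / 100 for p in range(start_pct, stop_pct + 1, step_pct)]


paper_range = slimming_ratios(35, 125)  # [0.35 : 0.05 : 1.25]
code_range = slimming_ratios(35, 100)   # [0.35 : 0.05 : 1]
gate_range = slimming_ratios(35, 50)    # [0.35 : 0.05 : 0.5]

print(len(paper_range), len(code_range), len(gate_range))
```

Under the inclusive-endpoint assumption the second and third ranges give the 14 and 4 paths quoted above, while the first gives 19 rather than 18, so the paper may count its paths differently.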
Releases(v0.0.1)
  • v0.0.1(Nov 30, 2021)

    Pretrained weights of the DS-MBNet supernet. Detailed accuracy of each sub-network:

    | Subnetwork | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 |
    | ----------------- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
    | MAdds | 133M | 153M | 175M | 200M | 226M | 255M | 286M | 319M | 355M | 393M | 433M | 475M | 519M | 565M |
    | Top-1 (%) | 70.1 | 70.4 | 70.8 | 71.2 | 71.6 | 72.0 | 72.4 | 72.7 | 73.0 | 73.3 | 73.6 | 73.9 | 74.1 | 74.6 |
    | Top-5 (%) | 89.4 | 89.6 | 89.9 | 90.2 | 90.3 | 90.6 | 90.9 | 91.0 | 91.2 | 91.4 | 91.5 | 91.7 | 91.8 | 92.0 |

    Source code(tar.gz)
    Source code(zip)
    DS_MBNet-70_1.pth.tar(60.93 MB)
    log-DS_MBNet-70_1.txt(6.12 KB)
Owner
Changlin Li