Dynamic Slimmable Network (CVPR 2021, Oral)

Overview

Dynamic Slimmable Network (DS-Net)

This repository contains PyTorch code of our paper: Dynamic Slimmable Network (CVPR 2021 Oral).

image

Architecture of DS-Net. The width of each supernet stage is adjusted adaptively by the slimming ratio ρ predicted by the gate.

image

Accuracy vs. complexity on ImageNet.

Usage

1. Requirements

2. Stage I: Supernet Training

For example, train dynamic slimmable MobileNet supernet with 8 GPUs (takes about 2 days):

python -m torch.distributed.launch --nproc_per_node=8 train.py /PATH/TO/ImageNet -c ./configs/mobilenetv1_bn_uniform.yml

3. Stage II: Gate Training

  • Will be available soon

Citation

If you use our code for your paper, please cite:

@inproceedings{li2021dynamic,
  author = {Changlin Li and
            Guangrun Wang and
            Bing Wang and
            Xiaodan Liang and
            Zhihui Li and
            Xiaojun Chang},
  title = {Dynamic Slimmable Network},
  booktitle = {CVPR},
  year = {2021}
}
Comments
  • The usage of gumbel softmax in DS-Net

    The usage of gumbel softmax in DS-Net

    Thank you for your very nice work,I want to know that the effect of gumble softmax,because I think the network can be trained without gumble softmax. Is the gumbel softmax just aimed to increase the randomness of channel choice?

    discussion 
    opened by LinyeLi60 7
  • UserWarning: Argument interpolation should be of type InterpolationMode instead of int. Please, use InterpolationMode enum.

    UserWarning: Argument interpolation should be of type InterpolationMode instead of int. Please, use InterpolationMode enum.

    Why I get an warning: /home/chauncey/.local/lib/python3.8/site-packages/torchvision/transforms/functional.py:364: UserWarning: Argument interpolation should be of type InterpolationMode instead of int. Please, use InterpolationMode enum. warnings.warn( when I use python3 -m torch.distributed.launch --nproc_per_node=1 train.py ./imagenet -c ./configs/mobilenetv1_bn_uniform.yml

    opened by Chauncey-Wang 3
  • Question about calculating MAdds of dynamic network in the paper

    Question about calculating MAdds of dynamic network in the paper

    Thank you for your great work, and I have a question about how to calculate MAdds in your paper. The dynamic network has different widths and MAdds for each instance, but you denoted MAdds for your networks. Are they the average MAdds for the whole dataset?

    discussion 
    opened by sseung0703 3
  • why not set ensemble_ib to True?

    why not set ensemble_ib to True?

    Hi,

    I found that ensemble_ib is set to False for both slim training and gate training from the configs, but from paper it would boost the performance when set toTrue.

    Any idea?

    opened by twmht 2
  • MAdds of Pretrained Supernet

    MAdds of Pretrained Supernet

    Hi Changlin, your work is excellent. I have a question about the calculation of MAdds, in README.md the MAdds of Subnetwork 13 is 565M, but I think the MAdds of Subnetwork 13 should be 821M observed in my experiments, because the channel number of Subnetwork 13 is larger than the original MobileNetV1, and the original MobileNetV1 1.0's MAdds should be 565M. Looking forward to your reply.

    opened by LinyeLi60 2
  • Error of change the num_choice in mobilenetv1_bn_uniform_reset_bn.yml

    Error of change the num_choice in mobilenetv1_bn_uniform_reset_bn.yml

    I follow your suggestion to set the num_choice in mobilenetv1_bn_uniform_reset_bn.yml to 14, but get an expected error when I use python -m torch.distributed.launch --nproc_per_node=8 train.py /PATH/TO/ImageNet -c ./configs/mobilenetv1_bn_uniform_reset_bn.yml.

    08/25 10:15:57 AM Recalibrating BatchNorm statistics... 08/25 10:16:10 AM Finish recalibrating BatchNorm statistics. 08/25 10:16:19 AM Finish recalibrating BatchNorm statistics. 08/25 10:16:21 AM Test: [ 0/0] Mode: 0 Time: 0.344 (0.344) Loss: 6.9204 (6.9204) [email protected]: 0.0000 ( 0.0000) [email protected]: 0.0000 ( 0.0000) Flops: 132890408 (132890408) 08/25 10:16:22 AM Test: [ 0/0] Mode: 1 Time: 0.406 (0.406) Loss: 6.9189 (6.9189) [email protected]: 0.0000 ( 0.0000) [email protected]: 0.0000 ( 0.0000) Flops: 152917440 (152917440) 08/25 10:16:22 AM Test: [ 0/0] Mode: 2 Time: 0.381 (0.381) Loss: 6.9187 (6.9187) [email protected]: 0.0000 ( 0.0000) [email protected]: 0.0000 ( 0.0000) Flops: 175152224 (175152224) 08/25 10:16:23 AM Test: [ 0/0] Mode: 3 Time: 0.389 (0.389) Loss: 6.9134 (6.9134) [email protected]: 0.0000 ( 0.0000) [email protected]: 0.0000 ( 0.0000) Flops: 199594752 (199594752) Traceback (most recent call last): File "train.py", line 658, in main() File "train.py", line 635, in main eval_metrics.append(validate_slim(model, File "/home/chauncey/PycharmProjects/DS-Net-main/dyn_slim/apis/train_slim.py", line 215, in validate_slim output = model(input) File "/home/chauncey/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl result = self.forward(*input, **kwargs) File "/home/chauncey/PycharmProjects/DS-Net-main/dyn_slim/models/dyn_slim_net.py", line 191, in forward x = self.forward_features(x) File "/home/chauncey/PycharmProjects/DS-Net-main/dyn_slim/models/dyn_slim_net.py", line 178, in forward_features x = stage(x) File "/home/chauncey/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl result = self.forward(*input, **kwargs) File "/home/chauncey/PycharmProjects/DS-Net-main/dyn_slim/models/dyn_slim_stages.py", line 48, in forward x = self.first_block(x) File "/home/chauncey/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl result = self.forward(*input, **kwargs) File "/home/chauncey/PycharmProjects/DS-Net-main/dyn_slim/models/dyn_slim_blocks.py", line 240, in forward x = self.conv_pw(x) File "/home/chauncey/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl result = self.forward(*input, **kwargs) File "/home/chauncey/PycharmProjects/DS-Net-main/dyn_slim/models/dyn_slim_ops.py", line 94, in forward self.running_outc = self.out_channels_list[self.channel_choice] IndexError: list index out of range

    It looks like we should make some adjustment in other py files.

    opened by chaunceywx 2
  • Why the num_choice in different yml is different?

    Why the num_choice in different yml is different?

    Why you set num_choice in mobilenetv1_bn_uniform_reset_bn.yml as 4, but set this parameter as 14 in the other two yml file?

    老哥,如果你也是中国人,咱们还是用中文交流吧,我英语水平比较感人。。。

    opened by chaunceywx 2
  • 运行问题

    运行问题

    请问大佬下面这个问题是为什么 Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.


    /root/anaconda3/envs/0108/lib/python3.6/site-packages/torchvision/io/image.py:11: UserWarning: Failed to load image Python extension: /root/anaconda3/envs/0108/lib/python3.6/site-packages/torchvision/image.so: undefined symbol: _ZNK3c106IValue23reportToTensorTypeErrorEv warn(f"Failed to load image Python extension: {e}") /root/anaconda3/envs/0108/lib/python3.6/site-packages/torchvision/io/image.py:11: UserWarning: Failed to load image Python extension: /root/anaconda3/envs/0108/lib/python3.6/site-packages/torchvision/image.so: undefined symbol: _ZNK3c106IValue23reportToTensorTypeErrorEv warn(f"Failed to load image Python extension: {e}") 01/21 05:42:18 AM Added key: store_based_barrier_key:1 to store for rank: 1 01/21 05:42:18 AM Added key: store_based_barrier_key:1 to store for rank: 0 01/21 05:42:18 AM Training in distributed mode with multiple processes, 1 GPU per process. Process 0, total 2. 01/21 05:42:18 AM Training in distributed mode with multiple processes, 1 GPU per process. Process 1, total 2. 01/21 05:42:20 AM Model slimmable_mbnet_v1_bn_uniform created, param count: 7676204 01/21 05:42:20 AM Data processing configuration for current model + dataset: 01/21 05:42:20 AM input_size: (3, 224, 224) 01/21 05:42:20 AM interpolation: bicubic 01/21 05:42:20 AM mean: (0.485, 0.456, 0.406) 01/21 05:42:20 AM std: (0.229, 0.224, 0.225) 01/21 05:42:20 AM crop_pct: 0.875 01/21 05:42:20 AM NVIDIA APEX not installed. AMP off. 01/21 05:42:21 AM Using torch DistributedDataParallel. Install NVIDIA Apex for Apex DDP. 01/21 05:42:21 AM Scheduled epochs: 40 01/21 05:42:21 AM Training folder does not exist at: images/train 01/21 05:42:21 AM Training folder does not exist at: images/train Killing subprocess 239 Killing subprocess 240 Traceback (most recent call last): File "/root/anaconda3/envs/0108/lib/python3.6/runpy.py", line 193, in _run_module_as_main "main", mod_spec) File "/root/anaconda3/envs/0108/lib/python3.6/runpy.py", line 85, in _run_code exec(code, run_globals) File "/root/anaconda3/envs/0108/lib/python3.6/site-packages/torch/distributed/launch.py", line 340, in main() File "/root/anaconda3/envs/0108/lib/python3.6/site-packages/torch/distributed/launch.py", line 326, in main sigkill_handler(signal.SIGTERM, None) # not coming back File "/root/anaconda3/envs/0108/lib/python3.6/site-packages/torch/distributed/launch.py", line 301, in sigkill_handler raise subprocess.CalledProcessError(returncode=last_return_code, cmd=cmd) subprocess.CalledProcessError: Command '['/root/anaconda3/envs/0108/bin/python', '-u', 'train.py', '--local_rank=1', 'images', '-c', './configs/mobilenetv1_bn_uniform_reset_bn.yml']' returned non-zero exit status 1.

    opened by 6imust 1
  • project environment

    project environment

    Hi,could you provide the environment for the project?I try to train the network with python=3.8 pytorch=1.7.1,cuda=10.2.Shortly after starting training,there's a RuntimeError: CUDA error: device-side assert triggered happened,and some other environment also lead to this error.I'm not sure whether the problem is caused by the difference of environment.

    opened by singularity97 1
  • Softmax twice for SGS loss?

    Softmax twice for SGS loss?

    Dear authors, thanks for this nice work.

    I wonder why the calculation of the SGS loss is using the softmaxed data rather than the logits, considering the PyTorch CrossEntropyLoss already contains a softmax inside.

    https://github.com/changlin31/DS-Net/blob/15cd3036970ec27d2c306014344fd50d9e9b888b/dyn_slim/apis/train_slim_gate.py#L98 https://github.com/changlin31/DS-Net/blob/15cd3036970ec27d2c306014344fd50d9e9b888b/dyn_slim/models/dyn_slim_blocks.py#L324-L355

    opened by Yu-Zhewen 0
  • Can we futher improve autoalim without gate?

    Can we futher improve autoalim without gate?

    It is not easy to deploy gate operator with some other backends, like TensorRT.

    So my question is can we futher improve autoalim without the dynamic gate when inference?Any ongoing work are doing this?

    opened by twmht 3
  • DS-Net for object detection

    DS-Net for object detection

    Hello. Thanks for your work. I noticed that you also conducted some experiments in object detection. I wonder whether or when you will release the code

    opened by NoLookDefense 8
  • Dynamic path for DS-mobilenet

    Dynamic path for DS-mobilenet

    Hi. Thanks for your work. I am reading your paper and trying to reimplement, and I feel confused about some details. You mentioned in your paper that the slimming ratio ρ∈[0.35 : 0.05 : 1.25], which have 18 paths. However, in your code, there are only 14 paths ρ∈[0.35 : 0.05 : 1] as mentioned in https://github.com/changlin31/DS-Net/blob/15cd3036970ec27d2c306014344fd50d9e9b888b/dyn_slim/models/dyn_slim_net.py#L36 . And also, when conducting gate training, the gate function only has a 4-dimension output, meaning that there is only 4 paths and the slimming ratio is restricted to ρ∈[0.35 : 0.05 : 0.5]. https://github.com/changlin31/DS-Net/blob/15cd3036970ec27d2c306014344fd50d9e9b888b/dyn_slim/models/dyn_slim_blocks.py#L204 Why the dynamic path for larger network is not used?

    opened by NoLookDefense 1
Releases(v0.0.1)
  • v0.0.1(Nov 30, 2021)

    Pretrained weights of DS-MBNet supernet. Detailed accuracy of each sub-networks:

    | Subnetwork | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | | ----------------- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | | MAdds | 133M | 153M | 175M | 200M | 226M | 255M | 286M | 319M | 355M | 393M | 433M | 475M | 519M | 565M | | Top-1 (%) | 70.1 | 70.4 | 70.8 | 71.2 | 71.6 | 72.0 | 72.4 | 72.7 | 73.0 | 73.3 | 73.6 | 73.9 | 74.1 | 74.6 | | Top-5 (%) | 89.4 | 89.6 | 89.9 | 90.2 | 90.3 | 90.6 | 90.9 | 91.0 | 91.2 | 91.4 | 91.5 | 91.7 | 91.8 | 92.0 |

    Source code(tar.gz)
    Source code(zip)
    DS_MBNet-70_1.pth.tar(60.93 MB)
    log-DS_MBNet-70_1.txt(6.12 KB)
Owner
Changlin Li
Changlin Li
DeepFaceEditing: Deep Face Generation and Editing with Disentangled Geometry and Appearance Control

DeepFaceEditing: Deep Face Generation and Editing with Disentangled Geometry and Appearance Control One version of our system is implemented using the

260 Nov 28, 2022
Unofficial PyTorch implementation of SimCLR by Google Brain

Unofficial PyTorch implementation of SimCLR by Google Brain

Rishabh Anand 2 Oct 13, 2021
Deep ViT Features as Dense Visual Descriptors

dino-vit-features [paper] [project page] Official implementation of the paper "Deep ViT Features as Dense Visual Descriptors". We demonstrate the effe

Shir Amir 113 Dec 24, 2022
A python script to convert images to animated sus among us crewmate twerk jifs as seen on r/196

img_sussifier A python script to convert images to animated sus among us crewmate twerk jifs as seen on r/196 Examples How to use install python pip i

41 Sep 30, 2022
StackNet is a computational, scalable and analytical Meta modelling framework

StackNet This repository contains StackNet Meta modelling methodology (and software) which is part of my work as a PhD Student in the computer science

Marios Michailidis 1.3k Dec 15, 2022
Fast SHAP value computation for interpreting tree-based models

FastTreeSHAP FastTreeSHAP package is built based on the paper Fast TreeSHAP: Accelerating SHAP Value Computation for Trees published in NeurIPS 2021 X

LinkedIn 369 Jan 04, 2023
Foreground-Action Consistency Network for Weakly Supervised Temporal Action Localization

FAC-Net Foreground-Action Consistency Network for Weakly Supervised Temporal Action Localization Linjiang Huang (CUHK), Liang Wang (CASIA), Hongsheng

21 Nov 22, 2022
Segmentation and Identification of Vertebrae in CT Scans using CNN, k-means Clustering and k-NN

Segmentation and Identification of Vertebrae in CT Scans using CNN, k-means Clustering and k-NN If you use this code for your research, please cite ou

41 Dec 08, 2022
A pytorch &keras implementation and demo of Fastformer.

Fastformer Notes from the authors Pytorch/Keras implementation of Fastformer. The keras version only includes the core fastformer attention part. The

153 Dec 28, 2022
PyJokes - Joking around with Python library pyjokes

Hi, it's Muhaimin again 👋 This is something unorthodox but cool. Don't forget t

Muhaimin A. Salay Kanton 1 Feb 02, 2022
PyTorch implementation of the WarpedGANSpace: Finding non-linear RBF paths in GAN latent space (ICCV 2021)

Authors official PyTorch implementation of the "WarpedGANSpace: Finding non-linear RBF paths in GAN latent space" [ICCV 2021].

Christos Tzelepis 100 Dec 06, 2022
Code for Active Learning at The ImageNet Scale.

Code for Active Learning at The ImageNet Scale. This repository implements many popular active learning algorithms and allows training with torch's DDP.

Zeyad Emam 47 Dec 12, 2022
Physics-informed convolutional-recurrent neural networks for solving spatiotemporal PDEs

PhyCRNet Physics-informed convolutional-recurrent neural networks for solving spatiotemporal PDEs Paper link: [ArXiv] By: Pu Ren, Chengping Rao, Yang

Pu Ren 11 Aug 23, 2022
Diverse Object-Scene Compositions For Zero-Shot Action Recognition

Diverse Object-Scene Compositions For Zero-Shot Action Recognition This repository contains the source code for the use of object-scene compositions f

7 Sep 21, 2022
GitHub repository for "Improving Video Generation for Multi-functional Applications"

Improving Video Generation for Multi-functional Applications GitHub repository for "Improving Video Generation for Multi-functional Applications" Pape

Bernhard Kratzwald 328 Dec 07, 2022
Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learning.

This is the Vowpal Wabbit fast online learning code. Why Vowpal Wabbit? Vowpal Wabbit is a machine learning system which pushes the frontier of machin

Vowpal Wabbit 8.1k Jan 06, 2023
source code of Adversarial Feedback Loop Paper

Adversarial Feedback Loop [ArXiv] [project page] Official repository of Adversarial Feedback Loop paper Firas Shama, Roey Mechrez, Alon Shoshan, Lihi

17 Jul 20, 2022
Pytorch implementation for ACMMM2021 paper "I2V-GAN: Unpaired Infrared-to-Visible Video Translation".

I2V-GAN This repository is the official Pytorch implementation for ACMMM2021 paper "I2V-GAN: Unpaired Infrared-to-Visible Video Translation". Traffic

69 Dec 31, 2022
A minimal implementation of Gaussian process regression in PyTorch

pytorch-minimal-gaussian-process In search of truth, simplicity is needed. There exist heavy-weighted libraries, but as you know, we need to go bare b

Sangwoong Yoon 38 Nov 25, 2022