In-Place Activated BatchNorm for Memory-Optimized Training of DNNs

In-Place Activated BatchNorm (InPlace-ABN) is a novel approach to reduce the memory required for training deep networks. It allows for up to 50% memory savings in modern architectures such as ResNet, ResNeXt and Wider ResNet by redefining BN followed by a non-linear activation as a single in-place operation, while dropping or recomputing intermediate buffers as needed.

This repository contains a PyTorch implementation of the InPlace-ABN layer, as well as some training scripts to reproduce the ImageNet classification results reported in our paper.

We have now also released the inference code for semantic segmentation, together with the Mapillary Vistas trained model that reached the #1 position on the Mapillary Vistas Semantic Segmentation leaderboard. More information can be found at the bottom of this page.

Citation

If you use In-Place Activated BatchNorm in your research, please cite:

@inproceedings{rotabulo2017place,
  title={In-Place Activated BatchNorm for Memory-Optimized Training of DNNs},
  author={Rota Bul\`o, Samuel and Porzi, Lorenzo and Kontschieder, Peter},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  year={2018}
}

Overview

When processing a BN-Activation-Convolution sequence in the forward pass, most deep learning frameworks need to store two big buffers, i.e. the input x of BN and the input z of Conv. This is necessary because the standard implementations of the backward passes of BN and Conv depend on their inputs to calculate the gradients. By using InPlace-ABN to replace the BN-Activation sequence, we can safely discard x, thus saving up to 50% GPU memory at training time. To achieve this, we rewrite the backward pass of BN in terms of its output y, which is in turn reconstructed from z by inverting the activation function.

The parametrization of the BN scaling factor differs from standard BN, in order to ensure an invertible transformation. Specifically, the scaling factor becomes |γ| + ε.
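The inversion described above can be illustrated on scalars. The following is only a minimal pure-Python sketch, not the library's C++/CUDA implementation: it assumes a leaky ReLU with slope 0.01 and a strictly positive scaling factor computed as abs(gamma) + eps, and all function names are made up for illustration.

```python
def leaky_relu(y, slope=0.01):
    # Strictly monotonic for slope > 0, hence invertible.
    return y if y >= 0 else slope * y


def leaky_relu_inverse(z, slope=0.01):
    # Recover the BN output y from the activation output z.
    return z if z >= 0 else z / slope


def abn_forward(x, mean, var, gamma, beta, eps=1e-5, slope=0.01):
    # Normalize, scale by a strictly positive factor, shift, activate.
    x_hat = (x - mean) / (var + eps) ** 0.5
    scale = abs(gamma) + eps  # assumed parametrization, never zero
    return leaky_relu(scale * x_hat + beta, slope)


def recover_from_output(z, gamma, beta, eps=1e-5, slope=0.01):
    # Only z is stored; y and the normalized input are reconstructed from it.
    y = leaky_relu_inverse(z, slope)
    x_hat = (y - beta) / (abs(gamma) + eps)
    return y, x_hat
```

Because the scale can never be zero, the normalized input needed by the BN backward pass can always be recovered from z alone, which is what makes it safe to drop x.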

Requirements

To install PyTorch, please refer to https://github.com/pytorch/pytorch#installation.

NOTE 1: our code requires PyTorch v1.1 or later

NOTE 2: we are only able to provide support for Linux platforms and CUDA versions >= 10.0

NOTE 3: in general, it is not possible to load weights from a network trained with standard BN into an InPlace-ABN network without severe performance degradation, due to the different handling of BN scaling parameters

To install the package containing the iABN layers:

pip install inplace-abn

Note that some parts of InPlace-ABN have native C++/CUDA implementations, meaning that the command above will need to compile them.

Alternatively, to download and install the latest version of our library, and also obtain a copy of the ImageNet / Vistas scripts:

git clone https://github.com/mapillary/inplace_abn.git
cd inplace_abn
python setup.py install
cd scripts
pip install -r requirements.txt

The last command above installs some additional libraries required by the ImageNet / Vistas scripts.

Force compiling with CUDA

In order to force the compilation of the native CUDA functions on systems that do not have access to a GPU (e.g. Docker containers), two environment variables have to be set:

export TORCH_CUDA_ARCH_LIST="{archs}"
export IABN_FORCE_CUDA=1

where {archs} is a semicolon-separated list of target CUDA architectures, given either by name or by compute capability, e.g. Pascal;Volta or 6.0;7.0.

Training on ImageNet-1k

Here you can find the results from our arXiv paper (top-1 / top-5 scores), together with the corresponding trained models and their md5 checksums. The model files provided below are made available under the license attached to ImageNet.

Network                      | Batch | 224           | 224, 10-crops | 320           | Trained models (+md5)
ResNeXt101, Std-BN           | 256   | 77.04 / 93.50 | 78.72 / 94.47 | 77.92 / 94.28 | 448438885986d14db5e870b95f814f91
ResNeXt101, InPlace-ABN      | 512   | 78.08 / 93.79 | 79.52 / 94.66 | 79.38 / 94.67 | 3b7a221cbc076410eb12c8dd361b7e4e
ResNeXt152, InPlace-ABN      | 256   | 78.28 / 94.04 | 79.73 / 94.82 | 79.56 / 94.67 | 2c8d572587961ed74611d534c5b2e9ce
WideResNet38, InPlace-ABN    | 256   | 79.72 / 94.78 | 81.03 / 95.43 | 80.69 / 95.27 | 1c085ab70b789cc1d6c1594f7a761007
ResNeXt101, InPlace-ABN sync | 256   | 77.70 / 93.78 | 79.18 / 94.60 | 78.98 / 94.56 | 0a85a21847b15e5a242e17bf3b753849
DenseNet264, InPlace-ABN     | 256   | 78.57 / 94.17 | 79.72 / 94.93 | 79.49 / 94.89 | 0b413d67b725619441d0646d663865bf
ResNet50v1, InPlace-ABN sync | 512   | 75.53 / 92.59 | 77.04 / 93.57 | 76.60 / 93.49 | 2522ca639f7fdfd7c0089ba1f5f6c2e8
ResNet34v1, InPlace-ABN sync | 512   | 73.27 / 91.34 | 75.19 / 92.66 | 74.87 / 92.42 | 61515c1484911c3cc753d405131e1dda
ResNet101v1, InPlace-ABN sync | 512  | 77.07 / 93.45 | 78.58 / 94.40 | 78.25 / 94.19 | 1552ae0f3d610108df702135f56bd27b

Data preparation

Our script uses torchvision.datasets.ImageFolder for loading ImageNet data, which expects folders organized as follows:

root/train/[class_id1]/xxx.{jpg,png,jpeg}
root/train/[class_id1]/xxy.{jpg,png,jpeg}
root/train/[class_id2]/xxz.{jpg,png,jpeg}
...

root/val/[class_id1]/asdas.{jpg,png,jpeg}
root/val/[class_id1]/123456.{jpg,png,jpeg}
root/val/[class_id2]/__32_.{jpg,png,jpeg}
...

Images can have any name, as long as the extension is that of a recognized image format. Class ids are also free-form, but they are expected to match between train and validation data. Note that the training data in the standard ImageNet distribution is already given in the required format, while validation images need to be split into class sub-folders as described above.
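Splitting the validation images into class sub-folders can be scripted. The sketch below is one possible way to do it and is not part of this repository: it assumes a hypothetical text file with one "<file name> <class id>" pair per line, which you would have to build from the ImageNet validation ground truth yourself.

```python
import shutil
from pathlib import Path


def split_val_into_class_folders(val_dir, labels_file):
    """Move every listed image in val_dir into val_dir/<class_id>/.

    labels_file is an assumed format: one "<file name> <class id>" per line.
    Returns the number of files that were moved.
    """
    val_dir = Path(val_dir)
    moved = 0
    for line in Path(labels_file).read_text().splitlines():
        if not line.strip():
            continue  # skip blank lines
        file_name, class_id = line.split()
        src = val_dir / file_name
        if not src.is_file():
            continue  # already moved, or not present
        dst_dir = val_dir / class_id
        dst_dir.mkdir(exist_ok=True)
        shutil.move(str(src), str(dst_dir / file_name))
        moved += 1
    return moved
```

Running it a second time is harmless, since files that have already been moved are skipped.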

Training

The main training script is scripts/train_imagenet.py. It supports training on ImageNet, or any other dataset formatted as described above, while keeping a log of relevant metrics in Tensorboard format and periodically saving snapshots. Most training parameters can be specified in a json-formatted configuration file. Any parameter not explicitly specified in the configuration file is set to its default; the complete list of configurable parameters and their defaults is available in scripts/imagenet/config.py.
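The "explicit values override defaults" behaviour can be pictured as a recursive dictionary merge. The snippet below is only an illustrative sketch: the keys and default values are hypothetical and do not reproduce scripts/imagenet/config.py, and the actual loading logic of the training script may differ.

```python
import json

# Hypothetical defaults, for illustration only.
DEFAULTS = {
    "optimizer": {"learning_rate": 0.1, "weight_decay": 1e-4},
    "data": {"batch_size": 256},
}


def merge_config(defaults, overrides):
    """Return a new dict where values from overrides replace the defaults."""
    merged = dict(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Merge nested sections key by key instead of replacing them whole.
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = value
    return merged


# A user file that only overrides the batch size keeps every other default.
user_config = json.loads('{"data": {"batch_size": 512}}')
config = merge_config(DEFAULTS, user_config)
```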

Our arXiv results can be reproduced by running scripts/train_imagenet.py with the configuration files in scripts/experiments. As an example, the command to train ResNeXt101 with InPlace-ABN, Leaky ReLU and batch_size = 512 is:

cd scripts
python -m torch.distributed.launch --nproc_per_node <n. GPUs per node> train_imagenet.py --log-dir /path/to/tensorboard/logs experiments/resnext101_ipabn_lr_512.json /path/to/imagenet/root

Validation

Validation is run by scripts/train_imagenet.py at the end of every training epoch. To validate a trained model, you can use the scripts/test_imagenet.py script, which allows for 10-crops validation and transferring weights across compatible networks (e.g. from ResNeXt101 with ReLU to ResNeXt101 with Leaky ReLU). This script accepts the same configuration files as scripts/train_imagenet.py, but note that the scale_val and crop_val parameters are ignored in favour of the --scale and --crop command-line arguments.

As an example, to validate the ResNeXt101 trained above using 10-crops of size 224 from images scaled to 256 pixels, you can run:

cd scripts
python -m torch.distributed.launch --nproc_per_node <n. GPUs per node> test_imagenet.py --crop 224 --scale 256 --ten_crops experiments/resnext101_ipabn_lr_512.json /path/to/checkpoint /path/to/imagenet/root
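For readers unfamiliar with 10-crops validation: the usual scheme takes the four corner crops and the center crop of the scaled image, plus a horizontally flipped copy of each, and averages the predictions over all ten. The sketch below illustrates only the cropping step on a plain nested list. It is an assumption about the general technique, not a copy of what test_imagenet.py does internally, and the function names are illustrative.

```python
def five_crop_boxes(height, width, crop):
    """Top-left (row, col) of the four corner crops and the center crop."""
    if crop > height or crop > width:
        raise ValueError("crop size must not exceed the image size")
    return [
        (0, 0),                                     # top-left
        (0, width - crop),                          # top-right
        (height - crop, 0),                         # bottom-left
        (height - crop, width - crop),              # bottom-right
        ((height - crop) // 2, (width - crop) // 2),  # center
    ]


def ten_crop(image, crop):
    """image is a list of rows; returns each crop followed by its horizontal flip."""
    height, width = len(image), len(image[0])
    crops = []
    for top, left in five_crop_boxes(height, width, crop):
        patch = [row[left:left + crop] for row in image[top:top + crop]]
        crops.append(patch)
        crops.append([row[::-1] for row in patch])  # horizontal flip
    return crops
```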

Usage for Semantic Segmentation on Cityscapes and Mapillary Vistas

We have successfully used InPlace-ABN with a DeepLab3 segmentation head that was trained on top of the WideResNet38 model above. Thanks to the memory saved by InPlace-ABN, we can significantly increase the amount of input data to this model, which eventually allowed us to obtain #1 positions on the Cityscapes, Mapillary Vistas, AutoNUE, Kitti and ScanNet segmentation leaderboards. The training settings mostly follow the description in our paper.

Mapillary Vistas pre-trained model

We release our WideResNet38 + DeepLab3 segmentation model trained on the Mapillary Vistas research set. This is the model used to reach #1 position on the MVD semantic segmentation leaderboard. The segmentation model file provided below is made available under a CC BY-NC-SA 4.0 license.

Network                 | mIOU  | Trained model (+md5)
WideResNet38 + DeepLab3 | 53.42 | 913f78486a34aa1577a7cd295e8a33bb

To use this, please download the .pth.tar model file linked above and run the test_vistas.py script as follows:

cd scripts
python test_vistas.py /path/to/model.pth.tar /path/to/input/folder /path/to/output/folder

The script will process all .png, .jpg and .jpeg images from the input folder and write the predictions in the output folder as .png images. For additional options, e.g. test time augmentation, please consult the script's help message.

The results on the test data written above were obtained by employing only scale 1.0 + flipping.

Changelog

Update 04 Jul. 2019: version 1.0.0

  • Complete rewrite of the CUDA code following the most recent native BN implementation from PyTorch
  • Improved synchronized BN implementation, correctly handling different per-GPU batch sizes and PyTorch distributed groups
  • The iABN layers are now packaged in an installable python library to simplify use in other projects
  • The ImageNet / Vistas scripts are still available in the scripts folder
  • Now requires PyTorch 1.1
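To see why different per-GPU batch sizes matter for synchronized BN, note that per-GPU means and variances cannot simply be averaged; they have to be weighted by the number of elements each GPU contributed. The following pure-Python sketch shows the standard way to combine such statistics. It illustrates the general idea only and is not taken from the library's CUDA code; the function names are made up.

```python
def batch_stats(values):
    """Mean, biased variance and element count of one worker's values."""
    count = len(values)
    mean = sum(values) / count
    var = sum((v - mean) ** 2 for v in values) / count
    return mean, var, count


def combine_batch_stats(per_gpu_stats):
    """Combine (mean, biased_var, count) triples into global statistics."""
    total = sum(count for _, _, count in per_gpu_stats)
    mean = sum(m * n for m, _, n in per_gpu_stats) / total
    # On each worker E[x^2] = var + mean^2; weight it by the worker's count.
    second_moment = sum((v + m * m) * n for m, v, n in per_gpu_stats) / total
    var = second_moment - mean * mean
    return mean, var, total
```

For example, combining the statistics of [1, 2, 3] and [10, 20] gives the same mean and variance as computing them over all five values at once, whereas an unweighted average of the two per-worker means would not.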

Update 08 Jan. 2019:

  • Enabled multiprocessing and InPlace-ABN synchronization over multiple processes (previously using threads). This now requires using DistributedDataParallel instead of DataParallel
  • Added compatibility with fp16 (currently allows fp16 input but requires the module to stay in fp32 mode)
  • Now requires PyTorch 1.0

Update Feb. 2019:

  • Added ResNet34v1, ResNet50v1 and ResNet101v1 ImageNet-1k pre-trained models

We have modified the ImageNet training code and BN synchronization in order to work with multiple processes. We have also added compatibility of our InPlace-ABN module with fp16.

Comments
  • RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation

    I try to use the ABN, InPlaceABN, InPlaceABNSync. But some errors occur.

    RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation

    I test it on Pytorch-0.2, cudnnv7, cuda8.

    opened by mingminzhen 37
  • error for test_vistas.py

    I try the test_vistas.py, something is wrong.

    Traceback (most recent call last):
      File "test_vistas_single_gpu.py", line 311, in <module>
        main()
      File "test_vistas_single_gpu.py", line 188, in main
        probs, preds = model(img, scales, args.flip)
      File "/home/mingmin/anaconda2/envs/py36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
        result = self.forward(*input, **kwargs)
      File "test_vistas_single_gpu.py", line 135, in forward
        sem_logits = self._network(x, scale)
      File "test_vistas_single_gpu.py", line 117, in _network
        x_up = functional.upsample(x, size=scaled_size, mode="bilinear")
      File "/home/mingmin/anaconda2/envs/py36/lib/python3.6/site-packages/torch/nn/functional.py", line 1891, in upsample
        return interpolate(input, size, scale_factor, mode, align_corners)
      File "/home/mingmin/anaconda2/envs/py36/lib/python3.6/site-packages/torch/nn/functional.py", line 1985, in interpolate
        return torch._C._nn.upsample_bilinear2d(input, _output_size(2), align_corners)
    TypeError: 'float' object cannot be interpreted as an integer
    

    Then i modify the code in the function _network(self, x, scale): scaled_size = [s * scale for s in x.shape[-2:]] to scaled_size = [int(s * scale) for s in x.shape[-2:]] No error occurs. But the output image is wrong. No right segmentation result. Can you help check the segmentation model and code ?

    opened by mingminzhen 27
  • Performance drop when replacing the old version of inplace with pytorch 1.0 version

    The performance of the above experiments (deeplab v3) dropped to 74% by using the new pytorch1.0 version of inplace. Is there anything need to be careful when replacing the old version of inplace with pytorch 1.0 version? I only copied the files in ./modules to ./libs in https://github.com/speedinghzl/pytorch-segmentation-toolbox

    Originally posted by @lzrobots in https://github.com/mapillary/inplace_abn/issues/15#issuecomment-458220361

    opened by rotabulo 21
  • Loading the state dict of a model without InPlaceABN

    Hi.

    First of all, this package is amazing, and provided an incredible speed boost to training all of my models, so thanks.

    Now, I am trying to convert a standard pretrained model (like resnet50) to InPlaceABN model. Basically, I would like to be able to take any model, and apply on it a function that would convert all the BatchNorm2d with a InPlaceABN and copy all of the parameters.

    I have written the following script. It works if I use ABN, but it does not work when I use InPlaceABN.

    Basically, I load a pretrain ResNet50 model, and apply the function to_abn on it. If I replace every nn.BatchNorm2d with ABN, I can infer on my test set and get the test accuracy I am supposed to. If I use InPlaceABN in the below code, I get 0% accuracy, as if InPlaceABN was not able to load a state_dict from a regular nn.BatchNorm2d. Any idea where it comes from?

    from inplace_abn import InPlaceABN, ABN
    import torch.nn as nn
    from utils import set_layer
    
    
    
    def to_abn(module):
        if hasattr(module, 'module'):
            module = module.module
        for n, m in module.named_modules():
            if isinstance(m, nn.BatchNorm2d):
                num_features = m.num_features
                momentum = m.momentum
                eps = m.eps
                # The below line does not seems to work when I try to load the state_dict
                new_bn = InPlaceABN(num_features=num_features, eps=eps, momentum=momentum, activation='identity')
                # But the below line would work and provide same results as original network. 
                # new_bn = ABN(num_features=num_features, eps=eps, momentum=momentum, activation='identity')
                new_bn.load_state_dict(m.state_dict())
                set_layer(module, n, new_bn)
        return module
    
    

    the function set_layer simply replaces a module of the model with another module.

    Thanks for your help

    opened by yoniaflalo 15
  • Command '['ninja', '-v']' returned non-zero exit status 1

    Traceback (most recent call last):
      File "/home/peng/anaconda2/envs/python36/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 576, in _build_extension_module
        ['ninja', '-v'], stderr=subprocess.STDOUT, cwd=build_directory)
      File "/home/peng/anaconda2/envs/python36/lib/python3.6/subprocess.py", line 356, in check_output
        **kwargs).stdout
      File "/home/peng/anaconda2/envs/python36/lib/python3.6/subprocess.py", line 438, in run
        output=stdout, stderr=stderr)
    subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last): File "train.py", line 35, in from network import Network File "/home/peng/pytorch-seg-new/experiments/SimpleMerge_NYU/network.py", line 20, in import resnet101_dilation File "../../basemodel/resnet101_dilation.py", line 18, in from LibInplaceABN.modules import InPlaceABNSync, ABN, GlobalAvgPool2d File "../../lib/LibInplaceABN/modules/init.py", line 1, in from .bn import ABN, InPlaceABN, InPlaceABNSync File "../../lib/LibInplaceABN/modules/bn.py", line 10, in from .functions import * File "../../lib/LibInplaceABN/modules/functions.py", line 18, in extra_cuda_cflags=["--expt-extended-lambda"]) File "/home/peng/anaconda2/envs/python36/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 501, in load _build_extension_module(name, build_directory) File "/home/peng/anaconda2/envs/python36/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 582, in _build_extension_module name, error.output.decode())) RuntimeError: Error building extension 'inplace_abn': [1/5] /usr/local/cuda-9.1/bin/bin/nvcc -DTORCH_EXTENSION_NAME=inplace_abn -I/home/peng/anaconda2/envs/python36/lib/python3.6/site-packages/torch/lib/include -I/home/peng/anaconda2/envs/python36/lib/python3.6/site-packages/torch/lib/include/TH -I/home/peng/anaconda2/envs/python36/lib/python3.6/site-packages/torch/lib/include/THC -I/usr/local/cuda-9.1/bin/include -I/home/peng/anaconda2/envs/python36/include/python3.6m --compiler-options '-fPIC' --expt-extended-lambda -std=c++11 -c /home/peng/pytorch-seg-new/lib/LibInplaceABN/modules/src/inplace_abn_cuda.cu -o inplace_abn_cuda.cuda.o FAILED: inplace_abn_cuda.cuda.o /usr/local/cuda-9.1/bin/bin/nvcc -DTORCH_EXTENSION_NAME=inplace_abn -I/home/peng/anaconda2/envs/python36/lib/python3.6/site-packages/torch/lib/include -I/home/peng/anaconda2/envs/python36/lib/python3.6/site-packages/torch/lib/include/TH -I/home/peng/anaconda2/envs/python36/lib/python3.6/site-packages/torch/lib/include/THC 
-I/usr/local/cuda-9.1/bin/include -I/home/peng/anaconda2/envs/python36/include/python3.6m --compiler-options '-fPIC' --expt-extended-lambda -std=c++11 -c /home/peng/pytorch-seg-new/lib/LibInplaceABN/modules/src/inplace_abn_cuda.cu -o inplace_abn_cuda.cuda.o /bin/sh: 1: /usr/local/cuda-9.1/bin/bin/nvcc: not found [2/5] /usr/local/cuda-9.1/bin/bin/nvcc -DTORCH_EXTENSION_NAME=inplace_abn -I/home/peng/anaconda2/envs/python36/lib/python3.6/site-packages/torch/lib/include -I/home/peng/anaconda2/envs/python36/lib/python3.6/site-packages/torch/lib/include/TH -I/home/peng/anaconda2/envs/python36/lib/python3.6/site-packages/torch/lib/include/THC -I/usr/local/cuda-9.1/bin/include -I/home/peng/anaconda2/envs/python36/include/python3.6m --compiler-options '-fPIC' --expt-extended-lambda -std=c++11 -c /home/peng/pytorch-seg-new/lib/LibInplaceABN/modules/src/inplace_abn_cuda_half.cu -o inplace_abn_cuda_half.cuda.o FAILED: inplace_abn_cuda_half.cuda.o /usr/local/cuda-9.1/bin/bin/nvcc -DTORCH_EXTENSION_NAME=inplace_abn -I/home/peng/anaconda2/envs/python36/lib/python3.6/site-packages/torch/lib/include -I/home/peng/anaconda2/envs/python36/lib/python3.6/site-packages/torch/lib/include/TH -I/home/peng/anaconda2/envs/python36/lib/python3.6/site-packages/torch/lib/include/THC -I/usr/local/cuda-9.1/bin/include -I/home/peng/anaconda2/envs/python36/include/python3.6m --compiler-options '-fPIC' --expt-extended-lambda -std=c++11 -c /home/peng/pytorch-seg-new/lib/LibInplaceABN/modules/src/inplace_abn_cuda_half.cu -o inplace_abn_cuda_half.cuda.o /bin/sh: 1: /usr/local/cuda-9.1/bin/bin/nvcc: not found [3/5] c++ -MMD -MF inplace_abn.o.d -DTORCH_EXTENSION_NAME=inplace_abn -I/home/peng/anaconda2/envs/python36/lib/python3.6/site-packages/torch/lib/include -I/home/peng/anaconda2/envs/python36/lib/python3.6/site-packages/torch/lib/include/TH -I/home/peng/anaconda2/envs/python36/lib/python3.6/site-packages/torch/lib/include/THC -I/usr/local/cuda-9.1/bin/include 
-I/home/peng/anaconda2/envs/python36/include/python3.6m -fPIC -std=c++11 -O3 -c /home/peng/pytorch-seg-new/lib/LibInplaceABN/modules/src/inplace_abn.cpp -o inplace_abn.o FAILED: inplace_abn.o c++ -MMD -MF inplace_abn.o.d -DTORCH_EXTENSION_NAME=inplace_abn -I/home/peng/anaconda2/envs/python36/lib/python3.6/site-packages/torch/lib/include -I/home/peng/anaconda2/envs/python36/lib/python3.6/site-packages/torch/lib/include/TH -I/home/peng/anaconda2/envs/python36/lib/python3.6/site-packages/torch/lib/include/THC -I/usr/local/cuda-9.1/bin/include -I/home/peng/anaconda2/envs/python36/include/python3.6m -fPIC -std=c++11 -O3 -c /home/peng/pytorch-seg-new/lib/LibInplaceABN/modules/src/inplace_abn.cpp -o inplace_abn.o /home/peng/pytorch-seg-new/lib/LibInplaceABN/modules/src/inplace_abn.cpp:1:29: fatal error: torch/extension.h: No such file or directory compilation terminated. [4/5] c++ -MMD -MF inplace_abn_cpu.o.d -DTORCH_EXTENSION_NAME=inplace_abn -I/home/peng/anaconda2/envs/python36/lib/python3.6/site-packages/torch/lib/include -I/home/peng/anaconda2/envs/python36/lib/python3.6/site-packages/torch/lib/include/TH -I/home/peng/anaconda2/envs/python36/lib/python3.6/site-packages/torch/lib/include/THC -I/usr/local/cuda-9.1/bin/include -I/home/peng/anaconda2/envs/python36/include/python3.6m -fPIC -std=c++11 -O3 -c /home/peng/pytorch-seg-new/lib/LibInplaceABN/modules/src/inplace_abn_cpu.cpp -o inplace_abn_cpu.o ninja: build stopped: subcommand failed.

    opened by zsp1993 14
  • gloo error

    Hi, I used distributed train with 'gloo' method but got error:

    File "/home/yuxb/anaconda3/lib/python3.6/site-packages/torch/tensor.py", line 102, in backward
        torch.autograd.backward(self, gradient, retain_graph, create_graph)
      File "/home/yuxb/anaconda3/lib/python3.6/site-packages/torch/autograd/__init__.py", line 90, in backward
        allow_unreachable=True)  # allow_unreachable flag
      File "/home/yuxb/anaconda3/lib/python3.6/site-packages/torch/nn/parallel/distributed.py", line 445, in distributed_data_parallel_hook
        self._queue_reduction(bucket_idx)
      File "/home/yuxb/anaconda3/lib/python3.6/site-packages/torch/nn/parallel/distributed.py", line 475, in _queue_reduction
        self.device_ids)
    TypeError: _queue_reduction(): incompatible function arguments. The following argument types are supported:
        1. (process_group: torch.distributed.ProcessGroup, grads_batch: List[List[at::Tensor]], devices: List[int]) -> Tuple[torch.distributed.Work, at::Tensor]
    
    Invoked with: <torch.distributed.ProcessGroupGloo object at 0x7fdf2daaadc0>, [[tensor([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0..
    

    Thanks for your help.

    opened by JamesKasperYu 13
  • How to train the ResNext model on the Cityscapes?

    I used to train the PSP with the original ResNet-101(7x7 convolution) and achieve 76.8 mIou on the val set of Cityscapes. Thus I want to replace the ResNet-101 with the ResNext-101(the imagenet pretrained model you provide) and train it with the same parameters, but the performance seems very poor(66.4). Do you use the COCO to finetune the ResNext model?

    Could you give me some tips to tune the hyperparameters for the ResNext-101 compared with ResNet-101?

    Besides, I am wondering whether have you trained the modified ResNet-101(replace the 7X7 conv with three 3x3 conv) model on the ImageNet? If so, it would be great if you could share me this model.

    invalid question 
    opened by PkuRainBow 13
  • About general activation function

    Many thanks for inplace_abn, it increased the batch size by 1.5x in my application!

    I notice that InPlaceABN currently supports the following activations: none, relu, elu and leaky relu. May I ask how to implement other activation functions like prelu?

    Can we just use [InPlaceABN(depth, activation='none'), torch.nn.PReLU()] , since PReLU seems not inplace operation?

    opened by luzai 12
  •  error: An extended __device__ lambda must not be defined in a function that is defined within another function

    Hi, I find that you have updated your repo and I download the new repo.

    I try to build the repo under the following environment:

    Ubuntu 16.04(docker)
    Miniconda env: pytorch04
    python: 3.6.5
    cuda: 8.0
    cudnn: 7.0
    

    I meet such errors: image

    enhancement 
    opened by PkuRainBow 12
  • Can't install by download file

    root:/data/research/seamseg-master# pip install inplace_abn-master.zip Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple Processing ./inplace_abn-master.zip ERROR: Command errored out with exit status 1: command: /data/config/anaconda3/bin/python -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-req-build-1w23bkoz/setup.py'"'"'; file='"'"'/tmp/pip-req-build-1w23bkoz/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' egg_info --egg-base pip-egg-info cwd: /tmp/pip-req-build-1w23bkoz/ Complete output (20 lines): /data/config/anaconda3/lib/python3.6/distutils/dist.py:261: UserWarning: Unknown distribution option: 'long_description_content_type' warnings.warn(msg) Traceback (most recent call last): File "", line 1, in File "/tmp/pip-req-build-1w23bkoz/setup.py", line 59, in cmdclass={"build_ext": BuildExtension} File "/data/config/anaconda3/lib/python3.6/distutils/core.py", line 108, in setup _setup_distribution = dist = klass(attrs) File "/data/config/anaconda3/lib/python3.6/site-packages/setuptools-27.2.0-py3.6.egg/setuptools/dist.py", line 318, in init File "/data/config/anaconda3/lib/python3.6/distutils/dist.py", line 281, in init self.finalize_options() File "/data/config/anaconda3/lib/python3.6/site-packages/setuptools-27.2.0-py3.6.egg/setuptools/dist.py", line 375, in finalize_options File "/tmp/pip-req-build-1w23bkoz/.eggs/setuptools_scm-3.3.3-py3.6.egg/setuptools_scm/integration.py", line 17, in version_keyword File "/tmp/pip-req-build-1w23bkoz/.eggs/setuptools_scm-3.3.3-py3.6.egg/setuptools_scm/init.py", line 150, in get_version File "/tmp/pip-req-build-1w23bkoz/.eggs/setuptools_scm-3.3.3-py3.6.egg/setuptools_scm/init.py", line 113, in _do_parse LookupError: setuptools-scm was unable to detect version for '/tmp/pip-req-build-1w23bkoz'.

    Make sure you're either building from a fully intact git repository or PyPI tarballs. Most other sources (such as GitHub's tarballs, a git checkout without the .git folder) don't contain the necessary metadata and will not work.
    
    For example, if you're using pip, instead of https://github.com/user/proj/archive/master.zip use git+https://github.com/user/proj.git#egg=proj
    ----------------------------------------
    

    ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.

    opened by jianlong-yuan 11
  • Have you noticed a big unbalance between gpus with InplaceSyncBN?

    With pytorch nn.BatchNorm2d and nn.DataParallel, the memory difference between two gpus is less than 1G. However, with InplaceSyncBN and nn.DataParallel, the memory gap is enlarged to almost 2G. What is the cause of this and how could I avoid it?

    opened by CoinCheung 11
  • Support for Pillow >= 8.3.0

    As stated in issue 229, I contribute the implementation to check Pillow version. If the version of Pillow >= 8.3.0, the _PALETTE list must consist of all channels for one color followed by the next color (e.g. RGBRGBRGB). Otherwise, all R values must be contiguous in the list before G and B values.

    opened by gyes00205 0
  • Support for Pillow >= 8.3.0

    I try to test picture as below shown but get a strange result not like the result of issue 49.

    Therefore, I try to find the bug of test_vistas_single_gpu.py. In line 320 of test_vistas_single_gpu.py, all R values is contiguous in the list before G and B values.

    _PALETTE = ImagePalette.ImagePalette(
        palette=list(_PALETTE[:, 0]) + list(_PALETTE[:, 1]) + list(_PALETTE[:, 2]),
        mode="RGB",
    )
    

    But in Pillow >= 8.3.0, the list must consist of all channels for one color followed by the next color (e.g. RGBRGBRGB). We can see the description in Pillow 8.3.x version Maybe I can contribute the code to check the Pillow version in test_vistas_single_gpu.py.

    opened by gyes00205 0
  • inplace_abn does not seem to support Pytorch 1.12.1

    I'm getting the following error when running a training job that was working fine in Pytorch 1.11 with 1.12:

      File "/pip-dl_inplace_abn/inplace_abn/functions.py", line 227, in inplace_abn
        return InPlaceABN.apply(
      File "/pip-dl_inplace_abn/inplace_abn/functions.py", line 86, in forward
        mean, var, count = _backend.statistics(x)
    RuntimeError: Tensors of type TensorImpl do not have sizes
    

    Is inplace_abn not compatible with Pytorch 1.12 or is the problem somewhere else?

    opened by gdippolito 0
  • inplace_abn for transformer

    dear author, the transformer use the layer norm in stead of batch norm, is it possible to apply inplace abn to transformer-based models? or is there any way to lower those models' gpu memory? thanks.

    opened by 17dacheng 0
  • setup.py can not work

    LookupError: setuptools-scm was unable to detect version for /home/pyl/pythonproject_pyl/inplace_abn_main.

    Make sure you're either building from a fully intact git repository or PyPI tarballs. Most other sources (such as GitHub's tarballs, a git checkout without the .git folder) don't contain the necessary metadata and will not work.

    opened by pyl3000 1
  • Import issue: Undefined symbol

    When attempting to import Inplace-ABN version 1.1.0, I get the following error:

    >>> from inplace_abn import InPlaceABN
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/curttigges/miniconda3/envs/pytorch-dl/lib/python3.7/site-packages/inplace_abn/__init__.py", line 1, in <module>
        from .abn import ABN, InPlaceABN, InPlaceABNSync
      File "/home/curttigges/miniconda3/envs/pytorch-dl/lib/python3.7/site-packages/inplace_abn/abn.py", line 8, in <module>
        from .functions import inplace_abn, inplace_abn_sync
      File "/home/curttigges/miniconda3/envs/pytorch-dl/lib/python3.7/site-packages/inplace_abn/functions.py", line 8, in <module>
        from . import _backend
    ImportError: /home/curttigges/miniconda3/envs/pytorch-dl/lib/python3.7/site-packages/inplace_abn/_backend.cpython-37m-x86_64-linux-gnu.so: undefined symbol: _ZNSt15__exception_ptr13exception_ptr9_M_addrefEv
    

    I am using CUDA 11.2 and PyTorch 1.8.2. The Inplace-ABN files were compiled with GCC 10.

    opened by curt-tigges 1
Releases(v1.1.0)
  • v1.1.0(Sep 3, 2020)

    This release updates ABN, InPlaceABN and InPlaceABNSync to feature parity with recent versions of Pytorch's BatchNormNd layers:

    • Add a track_running_stats parameter to enable / disable computation of running statistics independently from the layer's training state
    • Add a num_batches_tracked buffer, and allow passing momentum=None to support cumulative moving average for tracking running stats instead of exponential moving average
    • As a side-effect, now support loading parameters from standard BatchNorm without work-arounds. Still, if the loaded parameters contain negative weight elements the output will differ compared to standard BatchNorm
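The difference between the two running-statistics modes mentioned above can be sketched in a few lines. This is a simplified scalar illustration of the update rule used by PyTorch-style batch norm layers (with momentum=None, the averaging factor becomes 1 / num_batches_tracked); the class name is hypothetical and it is not the library's implementation.

```python
class RunningMean:
    """Scalar running statistic with EMA or cumulative averaging."""

    def __init__(self, momentum=0.1):
        self.momentum = momentum  # None selects the cumulative moving average
        self.num_batches_tracked = 0
        self.value = 0.0

    def update(self, batch_mean):
        self.num_batches_tracked += 1
        if self.momentum is None:
            # Cumulative average: every batch seen so far gets equal weight.
            factor = 1.0 / self.num_batches_tracked
        else:
            # Exponential average: recent batches dominate.
            factor = self.momentum
        self.value = (1.0 - factor) * self.value + factor * batch_mean
        return self.value
```

With momentum=None, feeding batch means 2, 4 and 6 yields running values 2, 3 and 4, i.e. the plain average of everything seen so far.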

    Additional changes:

    • Fix backward pass in eval mode: it was not properly accounting for the activation function
    • Refactor code to follow more sensible formatting standards
    • Add type annotations
    • Improve docstrings
    • Update installation instructions, pointing to the PyPI package
  • v1.0.12(Apr 22, 2020)

  • v1.0.11(Jan 27, 2020)

  • v1.0.10(Jan 8, 2020)

    This release contains an improved implementation of the fix for the backward pass in v1.0.9 which uses less temporary memory at no additional computational cost.

  • v1.0.9(Jan 7, 2020)

    In previous versions, both the input/output tensor y and the gradient tensor dy were overwritten during the backward pass. This was causing issues with some network topologies, producing wrong gradients.

    To fix this issue, a pair of temporary tensors is now created during the backward pass to hold the results of intermediate computations. This change increases the amount of temporary memory required, meaning that, in cases where GPU memory utilization was already very close to the limit, OOM errors might now occur. An alternative, more complex fix is also possible at the expense of additional computational costs. We are evaluating the impact of these changes and will provide updates in a future release.
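
For context on why a single tensor acts as both input and output here: as described in the overview, the layer keeps only the activation output and recovers the BN output from it by inverting the activation function. Below is a minimal sketch of that inversion for a leaky ReLU, in plain Python on scalars. The function names and the slope of 0.01 are illustrative assumptions, and this is not the library's actual code.

```python
def leaky_relu(y: float, slope: float = 0.01) -> float:
    """Forward activation: z = y if y >= 0, else slope * y."""
    return y if y >= 0 else slope * y


def leaky_relu_inverse(z: float, slope: float = 0.01) -> float:
    """Recover y from z. Only valid for slope != 0, which is why an
    invertible activation is needed for this reconstruction."""
    return z if z >= 0 else z / slope


values = [-2.0, -0.5, 0.0, 1.5]  # stand-ins for BN outputs
recovered = [leaky_relu_inverse(leaky_relu(v)) for v in values]
print(recovered)
```

Because the buffer is reconstructed rather than stored, overwriting it during the backward pass can affect any other consumer of the same tensor, which is consistent with the wrong gradients reported for some network topologies.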

  • v1.0.7(Sep 4, 2019)

  • v1.0.6(Aug 23, 2019)

    At compile time, when determining whether to enable CUDA support, we now base the decision on the Pytorch version installed:

    • If a CUDA-enabled Pytorch is detected, we attempt to compile CUDA support
    • If a CPU-only Pytorch is detected, we disable CUDA support
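
A minimal sketch of this kind of check, assuming it is enough to inspect torch.version.cuda, which holds a version string on CUDA-enabled builds and None on CPU-only ones. The helper name is hypothetical, and this is not necessarily how the package's own build script makes the decision.

```python
import importlib


def pytorch_has_cuda_build() -> bool:
    """Return True if the installed PyTorch was built with CUDA support."""
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        # No PyTorch at all: nothing to compile CUDA support against.
        return False
    # torch.version.cuda is None for CPU-only builds of PyTorch.
    return getattr(torch.version, "cuda", None) is not None


print("compile CUDA support:", pytorch_has_cuda_build())
```

Note that this looks at how PyTorch was built, not at whether a GPU is present on the machine doing the compilation.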
  • v1.0.5(Aug 20, 2019)

    InPlace-ABN can now be compiled and used without CUDA. Note that Synchronized InPlace-ABN is still only supported in conjunction with CUDA-enabled Pytorch.

  • v1.0.4(Aug 14, 2019)

    State dicts from standard BatchNorm layers trained with Pytorch v1.0.0 or newer can now be properly loaded by ABN, InPlaceABN and InPlaceABNSync.

  • v1.0.3(Jul 16, 2019)

    Added a couple of functions to manage distributed groups with InPlaceABNSync:

    • active_group: create a distributed group where each worker can decide whether to participate or not.
    • set_active_group: scan a model, passing a distributed group to all layers that implement a set_group() method.

    These are intended to simplify handling of asymmetric computational graphs in DistributedDataParallel when using InPlaceABNSync. A typical usage is as follows:

    import torch.nn as nn
    from inplace_abn import InPlaceABNSync, active_group, set_active_group

    class DynamicModel(nn.Module):
        def __init__(self):
            super(DynamicModel, self).__init__()
            self.conv1 = nn.Conv2d(4, 4, 1)
            self.bn1 = InPlaceABNSync(4)
            self.conv2 = nn.Conv2d(4, 4, 1)
            self.bn2 = InPlaceABNSync(4)

        def forward(self, x):
            x = self.conv1(x)
            x = self.bn1(x)

            # Call some data-dependent function telling us whether the second part of the network
            # should be traversed or not (get_active is user-defined and not shown here)
            active = self.get_active(x)

            # Create process group containing only the active workers, pass it to bn2
            set_active_group(self.bn2, active_group(active))

            # Run the second part of the network only if active is True
            if active:
                x = self.conv2(x)
                x = self.bn2(x)

            return x
    
  • v1.0.1(Jul 5, 2019)

    This update adds back support for mixed precision training. These combinations of inputs / parameters are now supported:

    • float32 input, float32 weight and bias
    • float64 input, float64 weight and bias
    • float16 input, float16 weight and bias
    • float16 input, float32 weight and bias

    Note: in the float16 cases all internal operations are still performed with float32 math, and float16 is not supported when operating in CPU mode.
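
The supported combinations can be written down as a small lookup, shown here as a sketch in plain Python with dtype names as strings. The names SUPPORTED_DTYPE_PAIRS and is_supported are hypothetical and not part of inplace_abn; the table only restates the list and note above.

```python
# (input dtype, weight/bias dtype) pairs listed in the release notes
SUPPORTED_DTYPE_PAIRS = {
    ("float32", "float32"),
    ("float64", "float64"),
    ("float16", "float16"),
    ("float16", "float32"),
}


def is_supported(input_dtype: str, param_dtype: str, on_cpu: bool = False) -> bool:
    """Check an input/parameter dtype combination against the list above."""
    # float16 inputs are not supported when operating in CPU mode.
    if on_cpu and input_dtype == "float16":
        return False
    return (input_dtype, param_dtype) in SUPPORTED_DTYPE_PAIRS


print(is_supported("float16", "float32"))  # half-precision input, float32 parameters
print(is_supported("float32", "float16"))  # not in the list
```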

  • v1.0.0(Jul 4, 2019)

    This release marks some major changes in inplace_abn:

    • Complete rewrite of the CUDA code following the most recent native BN implementation from Pytorch
    • Improved synchronized BN implementation, correctly handling different per-GPU batch sizes and Pytorch distributed groups
    • The iABN layers are now packaged in an installable python library to simplify use in other projects
    • The Imagenet / Vistas scripts are still available in the scripts folder
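
To see why different per-GPU batch sizes matter for synchronized BN: averaging per-GPU means and variances directly is only correct when every GPU holds the same number of samples. A minimal sketch of a count-weighted combination follows, in plain Python on lists of scalars. The function names are hypothetical, and this illustrates the arithmetic only, not the library's CUDA implementation.

```python
from typing import List, Sequence, Tuple

Stats = Tuple[int, float, float]  # (count, mean, biased variance)


def local_stats(values: Sequence[float]) -> Stats:
    """Statistics computed independently on one worker."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return n, mean, var


def combine_stats(stats: List[Stats]) -> Stats:
    """Merge per-worker statistics, weighting each worker by its sample count."""
    total = sum(n for n, _, _ in stats)
    mean = sum(n * m for n, m, _ in stats) / total
    # Within-worker variance plus the spread of worker means around the global mean
    var = sum(n * (v + (m - mean) ** 2) for n, m, v in stats) / total
    return total, mean, var


gpu0 = [1.0, 2.0, 3.0]            # 3 samples on one worker
gpu1 = [4.0, 5.0, 6.0, 7.0, 8.0]  # 5 samples on another
combined = combine_stats([local_stats(gpu0), local_stats(gpu1)])
reference = local_stats(gpu0 + gpu1)
print(combined, reference)
```

The combined result matches the statistics computed over all samples at once, which an unweighted average of the two workers would not.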
  • v0.1.1(Feb 14, 2019)

    We added the possibility of training ResNet with inplace ABN layers.

    In addition we released ResNet34 and ResNet50 pre-trained on ImageNet.

  • v0.1(Jan 8, 2019)

    This is a code refactoring to enable compatibility with Pytorch v1.0.

    Additional changes:

    • Moved from multi-threading training to distributed training using multiple processes
    • We provide an adapted implementation of synchronized inplace ABN
    • Our inplace ABN layer is compatible with fp16 tensors.
  • v0.0.3(Jan 7, 2019)

    This is a partial code refactoring to enable compatibility with Pytorch v0.4.1. In particular:

    • Fixed compatibility with pytorch>=0.4.1 due to change of AT_ASSERT
    • Fixed GPU allocation of tensors created in CUDA code

    Additional changes:

    • Added segmentation models and scripts to run inference on Vistas
    • Updated license
  • v0.0.2(Jul 18, 2018)

    This is a partial code refactoring to enable compatibility with Pytorch v0.4. In particular:

    • Native functions have been rewritten to use the new ATen-based extension interface introduced in v0.4. As a side effect, the native code doesn't need to be pre-compiled anymore. Instead, we are now using Pytorch's newly introduced run-time library loading mechanism.
    • The python code has been modified to account for the fact that autograd.Variable does not exist anymore.

    Additional changes:

    • ABN modules have been slightly refactored, leading to a slight change in the structure of the overall models' state_dicts. As a consequence, pre-trained models need to be re-downloaded (updated links in README.md).
  • v0.0.1(Jul 17, 2018)

    NOTE: this is the last release that is compatible with Pytorch v0.3

    After this release, the code will undergo partial rewrite to adapt to the changes introduced in Pytorch v0.4 regarding Tensors / Variables and native functions. As a consequence, we are completely dropping support for versions of Pytorch before v0.3.
