PyTorch code for the "Deep Neural Networks with Box Convolutions" paper

Overview

Box Convolution Layer for ConvNets


Single-box-conv network (from `examples/mnist.py`) learns patterns on MNIST

What This Is

This is a PyTorch implementation of the box convolution layer as introduced in the 2018 NeurIPS paper:

Burkov, E., & Lempitsky, V. (2018) Deep Neural Networks with Box Convolutions. Advances in Neural Information Processing Systems 31, 6214-6224.

How to Use

Installing

python3 -m pip install git+https://github.com/shrubb/box-convolutions.git
python3 -m box_convolution.test # if throws errors, please open a GitHub issue

To uninstall:

python3 -m pip uninstall box_convolution

Tested on Ubuntu 18.04.2, Python 3.6, PyTorch 1.0.0, GCC {4.9, 5.5, 6.5, 7.3}, CUDA 9.2. Other versions (e.g. macOS or Python 2.7 or CUDA 8 or CUDA 10) should work too, but I haven't checked. If something doesn't build, please open a Github issue.

Known issues (see this chat):

  • CUDA 9/9.1 + GCC 6 isn't supported due to a bug in NVCC.

You can specify a different compiler with CC environment variable:

CC=g++-7 python3 -m pip install git+https://github.com/shrubb/box-convolutions.git

Using

import torch
from box_convolution import BoxConv2d

box_conv = BoxConv2d(16, 8, 240, 320)
help(BoxConv2d)

Also, there are usage examples in examples/.

Quick Tour of Box convolutions

You may want to see our poster.

Why reinvent the old convolution?

3×3 convolutions are too small ⮕ receptive field grows too slow ⮕ ConvNets have to be very deep.

This is especially undesirable in dense prediction tasks (segmentation, depth estimation, object detection, ...).

Today people solve this by

  • dilated/deformable convolutions (can bring artifacts or degrade to 1×1 conv; almost always filter high-frequency);
  • "global" spatial pooling layers (usually too constrained, fixed size, not "fully convolutional").

How does it work?

Box convolution layer is a basic depthwise convolution (i.e. Conv2d with groups=in_channels) but with special kernels called box kernels.

A box kernel is a rectangular averaging filter. That is, filter values are fixed and unit! Instead, we learn four parameters per rectangle − its size and offset:

image

image

Any success stories?

One example: there is an efficient semantic segmentation model ENet. It's a classical hourglass architecture stacked of dozens ResNet-like blocks (left image).

Let's replace some of these blocks by our "box convolution block" (right image).

First we replaced every second block with a box convolution block (BoxENet in the paper). The model became

  • more accurate,
  • faster,
  • lighter
  • without dilated convolutions.

Then, we replaced every residual block (except the down- and up-sampling ones)! The result, BoxOnlyENet, is

  • a ConvNet almost without (traditional learnable weight) convolutions,
  • 2 times less operations,
  • 3 times less parameters,
  • still more accurate than ENet!
Comments
  • Build problem!

    Build problem!

    Hi! Can't compile pls see log https://drive.google.com/open?id=1U_0axWSgQGsvvdMWv5FclS1hHHihqx9M

    Command "/home/alex/anaconda3/bin/python -u -c "import setuptools, tokenize;file='/tmp/pip-req-build-n1eyvbz3/setup.py';f=getattr(tokenize, 'open', open)(file);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, file, 'exec'))" install --record /tmp/pip-record-p0dv1roq/install-record.txt --single-version-externally-managed --compile --user --prefix=" failed with error code 1 in /tmp/pip-req-build-n1eyvbz3/

    opened by aidonchuk 63
  • Implementation in VGG

    Implementation in VGG

    Hey,

    I am trying to implement box convolution for HED (Holistically-Nested Edge Detection) which uses VGG architecture. Here's the architecture with box convolution layer:

    class HED(nn.Module):
        def __init__(self):
            super(HED, self).__init__()
    
            # conv1
            self.conv1 = nn.Sequential(
                nn.Conv2d(3, 64, 3, padding=1),
                BoxConv2d(1, 64, 5, 5),
                nn.ReLU(inplace=True),
                nn.Conv2d(64, 64, 3, padding=1),
                #BoxConv2d(1, 64, 28, 28),
                nn.ReLU(inplace=True),
            )
    
            # conv2
            self.conv2 = nn.Sequential(
                nn.MaxPool2d(2, stride=2, ceil_mode=True),  # 1/2
                nn.Conv2d(64, 128, 3, padding=1),
                nn.ReLU(inplace=True),
                nn.Conv2d(128, 128, 3, padding=1),
                nn.ReLU(inplace=True),
            )
    
            # conv3
            self.conv3 = nn.Sequential(
                nn.MaxPool2d(2, stride=2, ceil_mode=True),  # 1/4
                nn.Conv2d(128, 256, 3, padding=1),
                nn.ReLU(inplace=True),
                nn.Conv2d(256, 256, 3, padding=1),
                nn.ReLU(inplace=True),
                nn.Conv2d(256, 256, 3, padding=1),
                nn.ReLU(inplace=True),
            )
    
            # conv4
            self.conv4 = nn.Sequential(
                nn.MaxPool2d(2, stride=2, ceil_mode=True),  # 1/8
                nn.Conv2d(256, 512, 3, padding=1),
                nn.ReLU(inplace=True),
                nn.Conv2d(512, 512, 3, padding=1),
                nn.ReLU(inplace=True),
                nn.Conv2d(512, 512, 3, padding=1),
                nn.ReLU(inplace=True),
            )
    
            # conv5
            self.conv5 = nn.Sequential(
                nn.MaxPool2d(2, stride=2, ceil_mode=True),  # 1/16
                nn.Conv2d(512, 512, 3, padding=1),
                nn.ReLU(inplace=True),
                nn.Conv2d(512, 512, 3, padding=1),
                nn.ReLU(inplace=True),
                nn.Conv2d(512, 512, 3, padding=1),
                nn.ReLU(inplace=True),
            )
    
            self.dsn1 = nn.Conv2d(64, 1, 1)
            self.dsn2 = nn.Conv2d(128, 1, 1)
            self.dsn3 = nn.Conv2d(256, 1, 1)
            self.dsn4 = nn.Conv2d(512, 1, 1)
            self.dsn5 = nn.Conv2d(512, 1, 1)
            self.fuse = nn.Conv2d(5, 1, 1)
    
        def forward(self, x):
            h = x.size(2)
            w = x.size(3)
    
            conv1 = self.conv1(x)
            conv2 = self.conv2(conv1)
            conv3 = self.conv3(conv2)
            conv4 = self.conv4(conv3)
            conv5 = self.conv5(conv4)
    
            ## side output
            d1 = self.dsn1(conv1)
            d2 = F.upsample_bilinear(self.dsn2(conv2), size=(h,w))
            d3 = F.upsample_bilinear(self.dsn3(conv3), size=(h,w))
            d4 = F.upsample_bilinear(self.dsn4(conv4), size=(h,w))
            d5 = F.upsample_bilinear(self.dsn5(conv5), size=(h,w))
    
            # dsn fusion output
            fuse = self.fuse(torch.cat((d1, d2, d3, d4, d5), 1))
    
            d1 = F.sigmoid(d1)
            d2 = F.sigmoid(d2)
            d3 = F.sigmoid(d3)
            d4 = F.sigmoid(d4)
            d5 = F.sigmoid(d5)
            fuse = F.sigmoid(fuse)
    
            return d1, d2, d3, d4, d5, fuse
    

    I get the following error: RuntimeError: BoxConv2d: all parameters must have as many rows as there are input channels (box_convolution_forward at src/box_convolution_interface.cpp:30)

    Can you help me with this?

    opened by Flock1 10
  • YOLO architecture

    YOLO architecture

    Hi,

    I want to know if there's some way I can create an architecture that'll work with YOLO. I've read a lot of implementations with pytorch but I don't know how should I modify the cfg file so that I can add box convolution layer.

    Let me know.

    opened by Flock1 9
  • Build Problem Windows 10 CUDA10.1 Python Bindings?

    Build Problem Windows 10 CUDA10.1 Python Bindings?

    Hi, I'm trying to compile the box-convolutions using Windows 10 with CUDA 10.1. This results in the following error:

    \python\python36\lib\site-packages\torch\lib\include\pybind11\cast.h(1439): error: expression must be a pointer to a complete object type
    
      1 error detected in the compilation of "C:/Users/CHRIST~1/AppData/Local/Temp/tmpxft_000010ec_00000000-8_integral_image_cuda.cpp4.ii".
      integral_image_cuda.cu
      error: command 'C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v10.1\\bin\\nvcc.exe' failed with exit status 2
    
      ----------------------------------------
    Failed building wheel for box-convolution
    Running setup.py clean for box-convolution
    Failed to build box-convolution
    

    Any ideas? Thanks in advance

    opened by tom23141 6
  • Getting a cuda runtime error (9) : invalid configuration argument at src/box_convolution_cuda_forward.ci:250

    Getting a cuda runtime error (9) : invalid configuration argument at src/box_convolution_cuda_forward.ci:250

    Hello,

    I've been trying to implement your box convolution layer on a ResNet model by just substituting your BottleneckBoxConv layers for a typical ResNet Bottleneck layer.

    I was getting this error

    THCudaCheck FAIL file=src/box_convolution_cuda_forward.cu line=250 error=9 : invalid configuration argument
    Traceback (most recent call last):
      File "half_box_train.py", line 178, in <module>
        main()
      File "half_box_train.py", line 107, in main
        scores = res_net(x)
      File "/opt/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/dkang/Project/cs231n_project_box_convolution/models/HalfBoxResNet.py", line 331, in forward
         x = self.layer3(x)
      File "/opt/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in __call__
        result = self.forward(*input, **kwargs)
      File "/opt/anaconda3/lib/python3.7/site-packages/torch/nn/modules/container.py", line 92, in forward
           input = module(input)
      File "/opt/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/dkang/Project/cs231n_project_box_convolution/models/HalfBoxResNet.py", line 66, in forward
        return F.relu(x + self.main_branch(x), inplace=True)
      File "/opt/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in __call__
        result = self.forward(*input, **kwargs)
      File "/opt/anaconda3/lib/python3.7/site-packages/torch/nn/modules/container.py", line 92, in forward
    input = module(input)
      File "/opt/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in __call__
        result = self.forward(*input, **kwargs)
      File "/opt/anaconda3/lib/python3.7/site-packages/box_convolution/box_convolution_module.py", line 222, in forward
        self.reparametrization_h, self.reparametrization_w, self.normalize, self.exact)
      File "/opt/anaconda3/lib/python3.7/site-packages/box_convolution/box_convolution_function.py", line 46, in forward
        input_integrated, x_min, x_max, y_min, y_max, normalize, exact)
    RuntimeError: cuda runtime error (9) : invalid configuration argument at src/box_convolution_cuda_forward.cu:250
    

    Thanks so much!

    opened by dkang9503 5
  • Speed and Efficiency of Depthwise separable operation?

    Speed and Efficiency of Depthwise separable operation?

    As far as modern libraries are concerned, there is not much support for depth-wise separable operations, i.e. we cannot write custom operations that can be done depthwise. Only convolutions are supported.

    How did you apply M box convolutions to each of the N input filters, to generate NM output filters? How is the different than using a for loop over the N input filters, applying M box convs on each one, and concatenating all the results?

    opened by kennyseb 5
  • Import Error

    Import Error

    Success build with ubuntu 16.04, cuda 10 and gcc 7.4. But import error encountered:

    In [1]: import torch
    
    In [2]: from box_convolution import BoxConv2d
    
    

    ImportError                               Traceback (most recent call last)
    <ipython-input-2-2424917dbf01> in <module>()
    ----> 1 from box_convolution import BoxConv2d
    
    ~/Software/pkgs/box-convolutions/box_convolution/__init__.py in <module>()
    ----> 1 from .box_convolution_module import BoxConv2d
    
    ~/Software/pkgs/box-convolutions/box_convolution/box_convolution_module.py in <module>()
          2 import random
          3 
    ----> 4 from .box_convolution_function import BoxConvolutionFunction, reparametrize
          5 import box_convolution_cpp_cuda as cpp_cuda
          6 
    
    ~/Software/pkgs/box-convolutions/box_convolution/box_convolution_function.py in <module>()
          1 import torch
          2 
    ----> 3 import box_convolution_cpp_cuda as cpp_cuda
          4 
          5 def reparametrize(
    
    ImportError: /usr/Software/anaconda3/lib/python3.6/site-packages/box_convolution_cpp_cuda.cpython-36m-x86_64-linux-gnu.so: undefined symbol: __cudaPopCallConfiguration
    

    @shrubb

    opened by the-butterfly 5
  • Error during forward pass

    Error during forward pass

         44         input_integrated = cpp_cuda.integral_image(input)
         45         output = cpp_cuda.box_convolution_forward(
    ---> 46             input_integrated, x_min, x_max, y_min, y_max, normalize, exact)
         47 
         48         ctx.save_for_backward(
    
    RuntimeError: cuda runtime error (9) : invalid configuration argument at src/box_convolution_cuda_forward.cu:249```
    opened by belskikh 5
  • Test script failed

    Test script failed

    Test script assertion failed:

    Random seed is 1546545757 Testing for device 'cpu' Running test_integral_image()... 100%|| 50/50 [00:00<00:00, 1491.15it/s] OK Running test_box_convolution_module()... 0%| python3: /pytorch/third_party/ideep/mkl-dnn/src/cpu/jit_avx2_conv_kernel_f32.cpp:567: static mkldnn::impl::status_t mkldnn::impl::cpu::jit_avx2_conv_fwd_kernel_f32::init_conf(mkldnn::impl::cpu::jit_conv_conf_t&, const convolution_desc_t&, const mkldnn::impl::memory_desc_wrapper&, const mkldnn::impl::memory_desc_wrapper&, const mkldnn::impl::memory_desc_wrapper&, const primitive_attr_t&): Assertion `jcp.ur_w * (jcp.nb_oc_blocking + 1) <= num_avail_regs' failed. Aborted (core dumped)

    Configuration: Ubuntu 16.04 LTS, CUDA 9.2, PyTorch 1.1.0, GCC 5.4.0.

    opened by vtereshkov 4
  • how box convolution works

    how box convolution works

    Hi,

    It is a nice work. In the first figure on your poster, you compared the 3x3 convolution layer and your box convolution layer. I am not clear how the box convolution works. Is it right that for each position (p,q) on the image, you use a box filter which has a relative position x, y to (p,q) and size w,h to calculate the value for (p,q) on the output? You learn the 4 parameters x, y, w, h for each box filter. For example, in the figure, the value for the red anchor pixel position on the output should be the sum of the values in the box. Is it correct? Thanks.

    opened by jiaozizhao 4
  • Multi-GPU Training: distributed error encountered

    Multi-GPU Training: distributed error encountered

    I am using https://github.com/facebookresearch/maskrcnn-benchmark for object detection, I want to use box convolutions, when I add a box convolution after some layer, training with 1 GPU is OK, while training with multiple GPUs in distributed mode failed, the error is very similar to this issue, I do not know how to fix, have some ideas? @shrubb

    2019-02-18 16:09:15,187 maskrcnn_benchmark.trainer INFO: Start training
    Traceback (most recent call last):
      File "tools/train_net.py", line 172, in <module>
        main()
      File "tools/train_net.py", line 165, in main
        model = train(cfg, args.local_rank, args.distributed)
      File "tools/train_net.py", line 74, in train
        arguments,
      File "/srv/data0/hzxubinbin/projects/maskrcnn_benchmark/maskrcnn-benchmark/maskrcnn_benchmark/engine/trainer.py", line 79, in do_train
        losses.backward()
      File "/home/hzxubinbin/anaconda3.1812/lib/python3.7/site-packages/torch/tensor.py", line 102, in backward
        torch.autograd.backward(self, gradient, retain_graph, create_graph)
      File "/home/hzxubinbin/anaconda3.1812/lib/python3.7/site-packages/torch/autograd/__init__.py", line 90, in backward
        allow_unreachable=True)  # allow_unreachable flag
      File "/home/hzxubinbin/anaconda3.1812/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 445, in distributed_data_parallel_hook
        self._queue_reduction(bucket_idx)
      File "/home/hzxubinbin/anaconda3.1812/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 475, in _queue_reduction
        self.device_ids)
    TypeError: _queue_reduction(): incompatible function arguments. The following argument types are supported:
        1. (process_group: torch.distributed.ProcessGroup, grads_batch: List[List[at::Tensor]], devices: List[int]) -> Tuple[torch.distributed.Work, at::Tensor]
    
    Invoked with: <torch.distributed.ProcessGroupNCCL object at 0x7f0d95248148>, [[tensor([[[[0.]],
    

    1 GPU is too slow, I want to use multiple GPUs

    opened by freesouls 4
  • How can I run Cityscapes example on a test set?

    How can I run Cityscapes example on a test set?

    Hello, collegues! I've trained BoxERFNet, and now I wanna run this model on a test set to evaluate it. I checked the source code(train.py) and established 'test' in place of 'test' in 80th string. But there was falure, the evaluated metrics were incorrect(e.g. 0.0 and 0.0). Can you explain me, what I need to do to evaluate model on a test set? I guess that problem is on 'validate' function(241th string), because confusion_matrix_update(268th string) tensors are really different in test and val sets.

    opened by mikhailkonyk 3
Releases(v1.0.0)
Owner
Egor Burkov
Egor Burkov
Code for the AAAI-2022 paper: Imagine by Reasoning: A Reasoning-Based Implicit Semantic Data Augmentation for Long-Tailed Classification

Imagine by Reasoning: A Reasoning-Based Implicit Semantic Data Augmentation for Long-Tailed Classification (AAAI 2022) Prerequisite PyTorch = 1.2.0 P

16 Dec 14, 2022
A Real-ESRGAN equipped Colab notebook for CLIP Guided Diffusion

#360Diffusion automatically upscales your CLIP Guided Diffusion outputs using Real-ESRGAN. Latest Update: Alpha 1.61 [Main Branch] - 01/11/22 Layout a

78 Nov 02, 2022
Volumetric parameterization of the placenta to a flattened template

placenta-flattening A MATLAB algorithm for volumetric mesh parameterization. Developed for mapping a placenta segmentation derived from an MRI image t

Mazdak Abulnaga 12 Mar 14, 2022
Kaggle G2Net Gravitational Wave Detection : 2nd place solution

Kaggle G2Net Gravitational Wave Detection : 2nd place solution

Hiroshechka Y 33 Dec 26, 2022
Unofficial implementation of the paper: PonderNet: Learning to Ponder in TensorFlow

PonderNet-TensorFlow This is an Unofficial Implementation of the paper: PonderNet: Learning to Ponder in TensorFlow. Official PyTorch Implementation:

1 Oct 23, 2022
Code for: https://berkeleyautomation.github.io/bags/

DeformableRavens Code for the paper Learning to Rearrange Deformable Cables, Fabrics, and Bags with Goal-Conditioned Transporter Networks. Here is the

Daniel Seita 121 Dec 30, 2022
Voice assistant - Voice assistant with python

🌐 Python Voice Assistant 🌵 - User's greeting 🌵 - Writing tasks to todo-list ?

PythonToday 10 Dec 26, 2022
Source codes for Improved Few-Shot Visual Classification (CVPR 2020), Enhancing Few-Shot Image Classification with Unlabelled Examples

Source codes for Improved Few-Shot Visual Classification (CVPR 2020), Enhancing Few-Shot Image Classification with Unlabelled Examples (WACV 2022) and Beyond Simple Meta-Learning: Multi-Purpose Model

PLAI Group at UBC 42 Dec 06, 2022
VoxHRNet - Whole Brain Segmentation with Full Volume Neural Network

VoxHRNet This is the official implementation of the following paper: Whole Brain Segmentation with Full Volume Neural Network Yeshu Li, Jonathan Cui,

Microsoft 12 Nov 24, 2022
RoBERTa Marathi Language model trained from scratch during huggingface 🤗 x flax community week

RoBERTa base model for Marathi Language (मराठी भाषा) Pretrained model on Marathi language using a masked language modeling (MLM) objective. RoBERTa wa

Nipun Sadvilkar 23 Oct 19, 2022
[CVPR 2021] Involution: Inverting the Inherence of Convolution for Visual Recognition, a brand new neural operator

involution Official implementation of a neural operator as described in Involution: Inverting the Inherence of Convolution for Visual Recognition (CVP

Duo Li 1.3k Dec 28, 2022
Drslmarkov - Distributionally Robust Structure Learning for Discrete Pairwise Markov Networks

Distributionally Robust Structure Learning for Discrete Pairwise Markov Networks

1 Nov 24, 2022
🎓Automatically Update CV Papers Daily using Github Actions (Update at 12:00 UTC Every Day)

🎓Automatically Update CV Papers Daily using Github Actions (Update at 12:00 UTC Every Day)

Realcat 270 Jan 07, 2023
This repository stores the code to reproduce the results published in "TiWS-iForest: Isolation Forest in Weakly Supervised and Tiny ML scenarios"

TinyWeaklyIsolationForest This repository stores the code to reproduce the results published in "TiWS-iForest: Isolation Forest in Weakly Supervised a

2 Mar 21, 2022
Repository for GNSS-based position estimation using a Deep Neural Network

Code repository accompanying our work on 'Improving GNSS Positioning using Neural Network-based Corrections'. In this paper, we present a Deep Neural

32 Dec 13, 2022
PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (『飞桨』核心框架,深度学习&机器学习高性能单机、分布式训练和跨平台部署)

English | 简体中文 Welcome to the PaddlePaddle GitHub. PaddlePaddle, as the only independent R&D deep learning platform in China, has been officially open

19.4k Jan 04, 2023
Official Code Release for "TIP-Adapter: Training-free clIP-Adapter for Better Vision-Language Modeling"

Official Code Release for "TIP-Adapter: Training-free clIP-Adapter for Better Vision-Language Modeling" Pipeline of Tip-Adapter Tip-Adapter can provid

peng gao 187 Dec 28, 2022
a minimal terminal with python 😎😉

Meterm a terminal with python 😎 How to use Clone Project: $ git clone https://github.com/motahharm/meterm.git Run: in Terminal: meterm.exe Or pip ins

Motahhar.Mokfi 5 Jan 28, 2022
Yolov5 deepsort inference,使用YOLOv5+Deepsort实现车辆行人追踪和计数,代码封装成一个Detector类,更容易嵌入到自己的项目中

使用YOLOv5+Deepsort实现车辆行人追踪和计数,代码封装成一个Detector类,更容易嵌入到自己的项目中。

813 Dec 31, 2022
[ArXiv 2021] One-Shot Generative Domain Adaptation

GenDA - One-Shot Generative Domain Adaptation One-Shot Generative Domain Adaptation Ceyuan Yang*, Yujun Shen*, Zhiyi Zhang, Yinghao Xu, Jiapeng Zhu, Z

GenForce: May Generative Force Be with You 46 Dec 19, 2022