Segmentation-Aware Convolutional Networks Using Local Attention Masks

[Project Page] [Paper]

Segmentation-aware convolution filters are invariant to backgrounds. We achieve this in three steps: (i) compute segmentation cues for each pixel (i.e., “embeddings”), (ii) create a foreground mask for each patch, and (iii) combine the masks with convolution, so that the filters only process the local foreground in each image patch.

Installation

For prerequisites, refer to DeepLabV2. Our setup follows theirs almost exactly.

Once you have the prerequisites, run make all -j4 from within caffe/ to compile the code with 4 cores.

Learning embeddings with dedicated loss

  • Use Convolution layers to create dense embeddings.
  • Use Im2dist to compute dense distance comparisons in an embedding map.
  • Use Im2parity to compute dense label comparisons in a label map.
  • Use DistLoss (with parameters alpha and beta) to set up a contrastive side loss on the distances.

See scripts/segaware/config/embs for a full example.
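The roles of the layers listed above can be illustrated outside Caffe. The NumPy sketch below is not the repository's implementation: the function names only mirror the layer names for readability, and the L1 distance, the edge padding, the 3x3 neighbourhood, the hinge form of the loss, and the default values of alpha and beta are all assumptions made for illustration.

```python
import numpy as np

def neighbor_offsets(k):
    """Offsets (dy, dx) of every position in a k x k patch, row-major."""
    r = k // 2
    return [(dy, dx) for dy in range(-r, r + 1) for dx in range(-r, r + 1)]

def im2dist(emb, k=3):
    """emb: (C, H, W) embeddings. Returns (k*k, H, W): the L1 distance between
    each pixel's embedding and each neighbour in its k x k patch.
    Assumption: borders are edge-padded."""
    C, H, W = emb.shape
    r = k // 2
    padded = np.pad(emb, ((0, 0), (r, r), (r, r)), mode="edge")
    out = np.empty((k * k, H, W), dtype=float)
    for i, (dy, dx) in enumerate(neighbor_offsets(k)):
        shifted = padded[:, r + dy:r + dy + H, r + dx:r + dx + W]
        out[i] = np.abs(emb - shifted).sum(axis=0)
    return out

def im2parity(labels, k=3):
    """labels: (H, W) integer label map. Returns (k*k, H, W) booleans that are
    True where the neighbour has the same label as the patch centre."""
    H, W = labels.shape
    r = k // 2
    padded = np.pad(labels, r, mode="edge")
    out = np.empty((k * k, H, W), dtype=bool)
    for i, (dy, dx) in enumerate(neighbor_offsets(k)):
        out[i] = labels == padded[r + dy:r + dy + H, r + dx:r + dx + W]
    return out

def dist_loss(dist, same, alpha=0.5, beta=2.0):
    """Contrastive hinge loss (assumed form): same-label pairs are penalised
    when their distance exceeds alpha, different-label pairs when it falls
    below beta. Returns the mean over all pairs."""
    pull = np.maximum(dist - alpha, 0.0)
    push = np.maximum(beta - dist, 0.0)
    return float(np.where(same, pull, push).mean())
```

With embeddings that are identical inside each labelled region and more than beta apart across regions, this loss is zero; collapsing all embeddings to one value leaves only the penalty on different-label pairs.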

Setting up a segmentation-aware convolution layer

  • Use Im2col on the input, to arrange pixel/feature patches into columns.
  • Use Im2dist on the embeddings, to get their distances into columns.
  • Use Exp on the distances, with scale: -1, to map them into (0,1] (a distance of zero gives a mask value of 1).
  • Tile the exponentiated distances, with a factor equal to the depth (i.e., channels) of the original convolution features.
  • Use Eltwise to multiply the Tile result with the Im2col result.
  • Use Convolution with bottom_is_im2col: true to matrix-multiply the convolution weights with the Eltwise output.

See scripts/segaware/config/vgg for an example in which every convolution layer in the VGG16 architecture is made segmentation-aware.
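The six steps above can be traced in a few lines of NumPy. This is a minimal sketch under stated assumptions, not the Caffe code in this repository: `patches` and `segaware_conv` are hypothetical helper names, the embedding distance is assumed to be L1, features are zero-padded and embeddings edge-padded, and stride and dilation are fixed at 1.

```python
import numpy as np

def patches(x, k, mode):
    """Gather every k x k neighbourhood of x (C, H, W) into (C, k*k, H, W).
    `mode` is passed to np.pad ("constant" pads with zeros, "edge" repeats
    the border)."""
    C, H, W = x.shape
    r = k // 2
    p = np.pad(x, ((0, 0), (r, r), (r, r)), mode=mode)
    return np.stack(
        [p[:, dy:dy + H, dx:dx + W] for dy in range(k) for dx in range(k)],
        axis=1,
    )

def segaware_conv(feat, emb, weights, k=3):
    """feat: (C, H, W) features; emb: (E, H, W) embeddings;
    weights: (O, C, k, k) filters. Returns (O, H, W)."""
    C, H, W = feat.shape
    # 1. Im2col: arrange feature patches into columns.
    cols = patches(feat, k, "constant")                    # (C, k*k, H, W)
    # 2. Im2dist: distance from each patch centre to each neighbour.
    dist = np.abs(patches(emb, k, "edge") - emb[:, None]).sum(axis=0)
    # 3. Exp with scale -1: small distance gives a mask value near 1.
    mask = np.exp(-dist)                                   # (k*k, H, W)
    # 4. Tile: repeat the mask once per feature channel.
    tiled = np.broadcast_to(mask, (C,) + mask.shape)
    # 5. Eltwise product: suppress neighbours that look like background.
    masked = cols * tiled
    # 6. Convolution on the columns: a plain matrix multiply.
    w = weights.reshape(weights.shape[0], C * k * k)
    return (w @ masked.reshape(C * k * k, H * W)).reshape(-1, H, W)
```

When the embeddings are constant the mask is all ones and the function reduces to an ordinary zero-padded convolution. When two regions have very different embeddings, changing the features in one region leaves the outputs in the other essentially unchanged, which is the background invariance described in the overview.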

Using a segmentation-aware CRF

  • Use the NormConvMeanfield layer. As input, give it two copies of the unary potentials (produced by a Split layer), some embeddings, and a meshgrid-like input (produced by a DummyData layer with data_filler { type: "xy" }).

See scripts/segaware/config/res for an example in which a segmentation-aware CRF is added to a resnet architecture.

Replicating the segmentation results presented in our paper

  • Download pretrained model weights here, and put that file into scripts/segaware/model/res/.
  • From scripts, run ./test_res.sh. This will produce .mat files in scripts/segaware/features/res/voc_test/mycrf/.
  • From scripts, run ./gen_preds.sh. This will produce colorized .png results in scripts/segaware/results/res/voc_test/mycrf/none/results/VOC2012/Segmentation/comp6_test_cls. An example input-output pair is shown below:

  • If you zip these results and submit them to the official PASCAL VOC test server, you will get 79.83900% IOU.

If you run this set of steps for the validation set, you can run ./eval.sh to evaluate your results on the PASCAL VOC validation set. If you change the model, you may want to run ./edit_env.sh to update the evaluation instructions.

Citation

@inproceedings{harley_segaware,
  title = {Segmentation-Aware Convolutional Networks Using Local Attention Masks},
  author = {Adam W Harley and Konstantinos G. Derpanis and Iasonas Kokkinos},
  booktitle = {IEEE International Conference on Computer Vision (ICCV)},
  year = {2017},
}

Help

Feel free to open issues on here! Also, I'm pretty good with email: [email protected]
