2021-MICCAI-Progressively Normalized Self-Attention Network for Video Polyp Segmentation

Overview

2021-MICCAI-Progressively Normalized Self-Attention Network for Video Polyp Segmentation

Authors: Ge-Peng Ji*, Yu-Cheng Chou*, Deng-Ping Fan, Geng Chen, Huazhu Fu, Debesh Jha, & Ling Shao.

This repository provides code for paper"Progressively Normalized Self-Attention Network for Video Polyp Segmentation" published at the MICCAI-2021 conference (arXiv Version | 中文版). If you have any questions about our paper, feel free to contact me. And if you like our PNS-Net or evaluation toolbox for your personal research, please cite this paper (BibTeX).

Features

  • Hyper Real-time Speed: Our method, named Progressively Normalized Self-Attention Network (PNS-Net), can efficiently learn representations from polyp videos with real-time speed (~140fps) on a single NVIDIA RTX 2080 GPU without any post-processing techniques (e.g., Dense-CRF).
  • Plug-and-Play Module: The proposed core module, termed Normalized Self-attention (NS), utilizes channel split,query-dependent, and normalization rules to reduce the computational cost and improve the accuracy, respectively. Note that this module can be flexibly plugged into any framework customed.
  • Cutting-edge Performance: Experiments on three challenging video polyp segmentation (VPS) datasets demonstrate that the proposed PNS-Net achieves state-of-the-art performance.
  • One-key Evaluation Toolbox: We release the first one-key evaluation toolbox in the VPS field.

1.1. 🔥 NEWS 🔥 :

  • [2021/06/25] 🔥 Our paper have been elected to be honred a MICCAI Student Travel Award.
  • [2021/06/19] 🔥 A short introduction of our paper is available on my YouTube channel (2min).
  • [2021/06/18] Release the inference code! The whole project will be available at the time of MICCAI-2021.
  • [2021/06/18] The Chinese translation of our paper is coming, please enjoy it [pdf].
  • [2021/05/27] Uploading the training/testing dataset, snapshot, and benchmarking results.
  • [2021/05/14] Our work is provisionally accepted at MICCAI 2021. Many thanks to my collaborator Yu-Cheng Chou and supervisor Prof. Deng-Ping Fan.
  • [2021/03/10] Create repository.

1.2. Table of Contents

Table of contents generated with markdown-toc

1.3. State-of-the-art Approaches

  1. "PraNet: Parallel Reverse Attention Network for Polyp Segmentation" MICCAI, 2020. doi: https://arxiv.org/pdf/2006.11392.pdf
  2. "Adaptive context selection for polyp segmentation" MICCAI, 2020. doi: https://link.springer.com/chapter/10.1007/978-3-030-59725-2_25
  3. "Resunet++: An advanced architecture for medical image segmentation" IEEE ISM, 2019 doi: https://arxiv.org/pdf/1911.07067.pdf
  4. "Unet++: A nested u-net architecture for medical image segmentation" IEEE TMI, 2019 doi: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7329239/
  5. "U-Net: Convolutional networks for biomed- ical image segmentation" MICCAI, 2015. doi: https://arxiv.org/pdf/1505.04597.pdf

2. Overview

2.1. Introduction

Existing video polyp segmentation (VPS) models typically employ convolutional neural networks (CNNs) to extract features. However, due to their limited receptive fields, CNNs can not fully exploit the global temporal and spatial information in successive video frames, resulting in false-positive segmentation results. In this paper, we propose the novel PNS-Net (Progressively Normalized Self-attention Network), which can efficiently learn representations from polyp videos with real-time speed (~140fps) on a single RTX 2080 GPU and no post-processing.

Our PNS-Net is based solely on a basic normalized self-attention block, dispensing with recurrence and CNNs entirely. Experiments on challenging VPS datasets demonstrate that the proposed PNS-Net achieves state-of-the-art performance. We also conduct extensive experiments to study the effectiveness of the channel split, soft-attention, and progressive learning strategy. We find that our PNS-Net works well under different settings, making it a promising solution to the VPS task.

2.2. Framework Overview


Figure 1: Overview of the proposed PNS-Net, including the normalized self-attention block (see § 2.1) with a stacked (×R) learning strategy. See § 2 in the paper for details.

2.3. Qualitative Results


Figure 2: Qualitative Results.

3. Proposed Baseline

3.1. Training/Testing

The training and testing experiments are conducted using PyTorch with a single GeForce RTX 2080 GPU of 8 GB Memory.

  1. Configuring your environment (Prerequisites):

    Note that PNS-Net is only tested on Ubuntu OS with the following environments. It may work on other operating systems as well but we do not guarantee that it will.

    • Creating a virtual environment in terminal:

    conda create -n PNSNet python=3.6.

    • Installing necessary packages PyTorch 1.1:
    conda create -n PNSNet python=3.6
    conda activate PNSNet
    conda install pytorch=1.1.0 torchvision -c pytorch
    pip install tensorboardX tqdm Pillow==6.2.2
    pip install git+https://github.com/pytorch/[email protected]
    • Our core design is built on CUDA OP with torchlib. Please ensure the base CUDA toolkit version is 10.x (not at conda env), and then build the NS Block:
    cd ./lib/PNS
    python setup.py build develop
  2. Downloading necessary data:

  3. Training Configuration:

    • First, run python MyTrain_Pretrain.py in the terminal for pretraining, and then, run python MyTrain_finetune.py for finetuning.

    • Just enjoy it! Finish it and the snapshot would save in ./snapshot/PNS-Net/*.

  4. Testing Configuration:

    • After you download all the pre-trained model and testing dataset, just run MyTest_finetune.py to generate the final prediction map in ./res.

    • Just enjoy it!

    • The prediction results of all competitors and our PNS-Net can be found at Google Drive (7MB).

3.2 Evaluating your trained model:

One-key evaluation is written in MATLAB code (link), please follow this the instructions in ./eval/main_VPS.m and just run it to generate the evaluation results in ./eval-Result/.

4. Citation

Please cite our paper if you find the work useful:

@inproceedings{ji2021pnsnet,
  title={Progressively Normalized Self-Attention Network for Video Polyp Segmentation},
  author={Ji, Ge-Peng and Chou, Yu-Cheng and Fan, Deng-Ping and Chen, Geng and Jha, Debesh and Fu, Huazhu and Shao, Ling},
  booktitle={MICCAI},
  year={2021}
}

5. TODO LIST

If you want to improve the usability or any piece of advice, please feel free to contact me directly (E-mail).

  • Support NVIDIA APEX training.

  • Support different backbones ( VGGNet, ResNet, ResNeXt, iResNet, and ResNeSt etc.)

  • Support distributed training.

  • Support lightweight architecture and real-time inference, like MobileNet, SqueezeNet.

  • Support distributed training

  • Add more comprehensive competitors.

6. FAQ

  1. If the image cannot be loaded on the page (mostly in the domestic network situations).

    Solution Link


7. Acknowledgements

This code is built on SINetV2 (PyTorch) and PyramidCSA (PyTorch). We thank the authors for sharing the codes.

back to top

Owner
Ge-Peng Ji (Daniel)
Computer Vision & Medical Imaging
Ge-Peng Ji (Daniel)
Does Pretraining for Summarization Reuqire Knowledge Transfer?

Pretraining summarization models using a corpus of nonsense

Approximately Correct Machine Intelligence (ACMI) Lab 12 Dec 19, 2022
MultiLexNorm 2021 competition system from ÚFAL

ÚFAL at MultiLexNorm 2021: Improving Multilingual Lexical Normalization by Fine-tuning ByT5 David Samuel & Milan Straka Charles University Faculty of

ÚFAL 13 Jun 28, 2022
Personalized Transfer of User Preferences for Cross-domain Recommendation (PTUPCDR)

Personalized Transfer of User Preferences for Cross-domain Recommendation (PTUPCDR) This is the official implementation of our paper Personalized Tran

Yongchun Zhu 81 Dec 29, 2022
CausaLM: Causal Model Explanation Through Counterfactual Language Models

CausaLM: Causal Model Explanation Through Counterfactual Language Models Authors: Amir Feder, Nadav Oved, Uri Shalit, Roi Reichart Abstract: Understan

Amir Feder 39 Jul 10, 2022
Pytorch cuda extension of grid_sample1d

Grid Sample 1d pytorch cuda extension of grid sample 1d. Since pytorch only supports grid sample 2d/3d, I extend the 1d version for efficiency. The fo

lyricpoem 24 Dec 03, 2022
Pytorch implementation for "Implicit Semantic Response Alignment for Partial Domain Adaptation"

Implicit-Semantic-Response-Alignment Pytorch implementation for "Implicit Semantic Response Alignment for Partial Domain Adaptation" Prerequisites pyt

4 Dec 19, 2022
CS583: Deep Learning

CS583: Deep Learning

Shusen Wang 2.6k Dec 30, 2022
DenseNet Implementation in Keras with ImageNet Pretrained Models

DenseNet-Keras with ImageNet Pretrained Models This is an Keras implementation of DenseNet with ImageNet pretrained weights. The weights are converted

Felix Yu 568 Oct 31, 2022
PyTorch code for ICPR 2020 paper Future Urban Scene Generation Through Vehicle Synthesis

Future urban scene generation through vehicle synthesis This repository contains Pytorch code for the ICPR2020 paper "Future Urban Scene Generation Th

Alessandro Simoni 4 Oct 11, 2021
Official repository of Semantic Image Matting

Semantic Image Matting This is the official repository of Semantic Image Matting (CVPR2021). Overview Natural image matting separates the foreground f

192 Dec 29, 2022
This codebase is the official implementation of Test-Time Classifier Adjustment Module for Model-Agnostic Domain Generalization (NeurIPS2021, Spotlight)

Test-Time Classifier Adjustment Module for Model-Agnostic Domain Generalization This codebase is the official implementation of Test-Time Classifier A

47 Dec 28, 2022
Implementation of Graph Transformer in Pytorch, for potential use in replicating Alphafold2

Graph Transformer - Pytorch Implementation of Graph Transformer in Pytorch, for potential use in replicating Alphafold2. This was recently used by bot

Phil Wang 97 Dec 28, 2022
CyTran: Cycle-Consistent Transformers for Non-Contrast to Contrast CT Translation

CyTran: Cycle-Consistent Transformers for Non-Contrast to Contrast CT Translation We propose a novel approach to translate unpaired contrast computed

Nicolae Catalin Ristea 13 Jan 02, 2023
ROSITA: Enhancing Vision-and-Language Semantic Alignments via Cross- and Intra-modal Knowledge Integration

ROSITA News & Updates (24/08/2021) Release the demo to perform fine-grained semantic alignments using the pretrained ROSITA model. (15/08/2021) Releas

Vision and Language Group@ MIL 48 Dec 23, 2022
Scientific Computation Methods in C and Python (Open for Hacktoberfest 2021)

Sci - cpy README is a stub. Do expand it. Objective This repository is meant to be a ready reference for scientific computation methods. Do ⭐ it if yo

Sandip Dutta 7 Oct 12, 2022
Lipstick ain't enough: Beyond Color-Matching for In-the-Wild Makeup Transfer (CVPR 2021)

Table of Content Introduction Datasets Getting Started Requirements Usage Example Training & Evaluation CPM: Color-Pattern Makeup Transfer CPM is a ho

VinAI Research 248 Dec 13, 2022
Deep Learning applied to Integral data analysis

DeepIntegralCompton Deep Learning applied to Integral data analysis Module installation Move to the root directory of the project and execute : pip in

Thomas Vuillaume 1 Dec 10, 2021
Evaluating saliency methods on artificial data with different background types

Evaluating saliency methods on artificial data with different background types This repository contains the relevant code for the MedNeurips 2021 subm

2 Jul 05, 2022
PyGAD, a Python 3 library for building the genetic algorithm and training machine learning algorithms (Keras & PyTorch).

PyGAD: Genetic Algorithm in Python PyGAD is an open-source easy-to-use Python 3 library for building the genetic algorithm and optimizing machine lear

Ahmed Gad 1.1k Dec 26, 2022
Code of TIP2021 Paper《SFace: Sigmoid-Constrained Hypersphere Loss for Robust Face Recognition》. We provide both MxNet and Pytorch versions.

SFace Code of TIP2021 Paper 《SFace: Sigmoid-Constrained Hypersphere Loss for Robust Face Recognition》. We provide both MxNet, PyTorch and Jittor versi

Zhong Yaoyao 47 Nov 25, 2022