2021-MICCAI-Progressively Normalized Self-Attention Network for Video Polyp Segmentation

Overview

2021-MICCAI-Progressively Normalized Self-Attention Network for Video Polyp Segmentation

Authors: Ge-Peng Ji*, Yu-Cheng Chou*, Deng-Ping Fan, Geng Chen, Huazhu Fu, Debesh Jha, & Ling Shao.

This repository provides code for paper"Progressively Normalized Self-Attention Network for Video Polyp Segmentation" published at the MICCAI-2021 conference (arXiv Version | 中文版). If you have any questions about our paper, feel free to contact me. And if you like our PNS-Net or evaluation toolbox for your personal research, please cite this paper (BibTeX).

Features

  • Hyper Real-time Speed: Our method, named Progressively Normalized Self-Attention Network (PNS-Net), can efficiently learn representations from polyp videos with real-time speed (~140fps) on a single NVIDIA RTX 2080 GPU without any post-processing techniques (e.g., Dense-CRF).
  • Plug-and-Play Module: The proposed core module, termed Normalized Self-attention (NS), utilizes channel split,query-dependent, and normalization rules to reduce the computational cost and improve the accuracy, respectively. Note that this module can be flexibly plugged into any framework customed.
  • Cutting-edge Performance: Experiments on three challenging video polyp segmentation (VPS) datasets demonstrate that the proposed PNS-Net achieves state-of-the-art performance.
  • One-key Evaluation Toolbox: We release the first one-key evaluation toolbox in the VPS field.

1.1. 🔥 NEWS 🔥 :

  • [2021/06/25] 🔥 Our paper have been elected to be honred a MICCAI Student Travel Award.
  • [2021/06/19] 🔥 A short introduction of our paper is available on my YouTube channel (2min).
  • [2021/06/18] Release the inference code! The whole project will be available at the time of MICCAI-2021.
  • [2021/06/18] The Chinese translation of our paper is coming, please enjoy it [pdf].
  • [2021/05/27] Uploading the training/testing dataset, snapshot, and benchmarking results.
  • [2021/05/14] Our work is provisionally accepted at MICCAI 2021. Many thanks to my collaborator Yu-Cheng Chou and supervisor Prof. Deng-Ping Fan.
  • [2021/03/10] Create repository.

1.2. Table of Contents

Table of contents generated with markdown-toc

1.3. State-of-the-art Approaches

  1. "PraNet: Parallel Reverse Attention Network for Polyp Segmentation" MICCAI, 2020. doi: https://arxiv.org/pdf/2006.11392.pdf
  2. "Adaptive context selection for polyp segmentation" MICCAI, 2020. doi: https://link.springer.com/chapter/10.1007/978-3-030-59725-2_25
  3. "Resunet++: An advanced architecture for medical image segmentation" IEEE ISM, 2019 doi: https://arxiv.org/pdf/1911.07067.pdf
  4. "Unet++: A nested u-net architecture for medical image segmentation" IEEE TMI, 2019 doi: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7329239/
  5. "U-Net: Convolutional networks for biomed- ical image segmentation" MICCAI, 2015. doi: https://arxiv.org/pdf/1505.04597.pdf

2. Overview

2.1. Introduction

Existing video polyp segmentation (VPS) models typically employ convolutional neural networks (CNNs) to extract features. However, due to their limited receptive fields, CNNs can not fully exploit the global temporal and spatial information in successive video frames, resulting in false-positive segmentation results. In this paper, we propose the novel PNS-Net (Progressively Normalized Self-attention Network), which can efficiently learn representations from polyp videos with real-time speed (~140fps) on a single RTX 2080 GPU and no post-processing.

Our PNS-Net is based solely on a basic normalized self-attention block, dispensing with recurrence and CNNs entirely. Experiments on challenging VPS datasets demonstrate that the proposed PNS-Net achieves state-of-the-art performance. We also conduct extensive experiments to study the effectiveness of the channel split, soft-attention, and progressive learning strategy. We find that our PNS-Net works well under different settings, making it a promising solution to the VPS task.

2.2. Framework Overview


Figure 1: Overview of the proposed PNS-Net, including the normalized self-attention block (see § 2.1) with a stacked (×R) learning strategy. See § 2 in the paper for details.

2.3. Qualitative Results


Figure 2: Qualitative Results.

3. Proposed Baseline

3.1. Training/Testing

The training and testing experiments are conducted using PyTorch with a single GeForce RTX 2080 GPU of 8 GB Memory.

  1. Configuring your environment (Prerequisites):

    Note that PNS-Net is only tested on Ubuntu OS with the following environments. It may work on other operating systems as well but we do not guarantee that it will.

    • Creating a virtual environment in terminal:

    conda create -n PNSNet python=3.6.

    • Installing necessary packages PyTorch 1.1:
    conda create -n PNSNet python=3.6
    conda activate PNSNet
    conda install pytorch=1.1.0 torchvision -c pytorch
    pip install tensorboardX tqdm Pillow==6.2.2
    pip install git+https://github.com/pytorch/[email protected]
    • Our core design is built on CUDA OP with torchlib. Please ensure the base CUDA toolkit version is 10.x (not at conda env), and then build the NS Block:
    cd ./lib/PNS
    python setup.py build develop
  2. Downloading necessary data:

  3. Training Configuration:

    • First, run python MyTrain_Pretrain.py in the terminal for pretraining, and then, run python MyTrain_finetune.py for finetuning.

    • Just enjoy it! Finish it and the snapshot would save in ./snapshot/PNS-Net/*.

  4. Testing Configuration:

    • After you download all the pre-trained model and testing dataset, just run MyTest_finetune.py to generate the final prediction map in ./res.

    • Just enjoy it!

    • The prediction results of all competitors and our PNS-Net can be found at Google Drive (7MB).

3.2 Evaluating your trained model:

One-key evaluation is written in MATLAB code (link), please follow this the instructions in ./eval/main_VPS.m and just run it to generate the evaluation results in ./eval-Result/.

4. Citation

Please cite our paper if you find the work useful:

@inproceedings{ji2021pnsnet,
  title={Progressively Normalized Self-Attention Network for Video Polyp Segmentation},
  author={Ji, Ge-Peng and Chou, Yu-Cheng and Fan, Deng-Ping and Chen, Geng and Jha, Debesh and Fu, Huazhu and Shao, Ling},
  booktitle={MICCAI},
  year={2021}
}

5. TODO LIST

If you want to improve the usability or any piece of advice, please feel free to contact me directly (E-mail).

  • Support NVIDIA APEX training.

  • Support different backbones ( VGGNet, ResNet, ResNeXt, iResNet, and ResNeSt etc.)

  • Support distributed training.

  • Support lightweight architecture and real-time inference, like MobileNet, SqueezeNet.

  • Support distributed training

  • Add more comprehensive competitors.

6. FAQ

  1. If the image cannot be loaded on the page (mostly in the domestic network situations).

    Solution Link


7. Acknowledgements

This code is built on SINetV2 (PyTorch) and PyramidCSA (PyTorch). We thank the authors for sharing the codes.

back to top

Owner
Ge-Peng Ji (Daniel)
Computer Vision & Medical Imaging
Ge-Peng Ji (Daniel)
Official implementation of the paper Momentum Capsule Networks (MoCapsNet)

Momentum Capsule Network Official implementation of the paper Momentum Capsule Networks (MoCapsNet). Abstract Capsule networks are a class of neural n

8 Oct 20, 2022
Group-Free 3D Object Detection via Transformers

Group-Free 3D Object Detection via Transformers By Ze Liu, Zheng Zhang, Yue Cao, Han Hu, Xin Tong. This repo is the official implementation of "Group-

Ze Liu 213 Dec 07, 2022
This is the official Pytorch-version code of FlatGCN (Flattened Graph Convolutional Networks for Recommendation).

FlatGCN This is the official Pytorch-version code of FlatGCN (Flattened Graph Convolutional Networks for Recommendation, submitted to ICASSP2022). Req

Dreamer 2 Aug 09, 2022
Code release for "BoxeR: Box-Attention for 2D and 3D Transformers"

BoxeR By Duy-Kien Nguyen, Jihong Ju, Olaf Booij, Martin R. Oswald, Cees Snoek. This repository is an official implementation of the paper BoxeR: Box-A

Nguyen Duy Kien 111 Dec 07, 2022
[ICME 2021 Oral] CORE-Text: Improving Scene Text Detection with Contrastive Relational Reasoning

CORE-Text: Improving Scene Text Detection with Contrastive Relational Reasoning This repository is the official PyTorch implementation of CORE-Text, a

Jingyang Lin 18 Aug 11, 2022
This project is based on RIFE and aims to make RIFE more practical for users by adding various features and design new models

CPM 项目描述 CPM(Chinese Pretrained Models)模型是北京智源人工智能研究院和清华大学发布的中文大规模预训练模型。官方发布了三种规模的模型,参数量分别为109M、334M、2.6B,用户需申请与通过审核,方可下载。 由于原项目需要考虑大模型的训练和使用,需要安装较为复杂

hzwer 190 Jan 08, 2023
Learning Correspondence from the Cycle-consistency of Time (CVPR 2019)

TimeCycle Code for Learning Correspondence from the Cycle-consistency of Time (CVPR 2019, Oral). The code is developed based on the PyTorch framework,

Xiaolong Wang 706 Nov 29, 2022
Development of IP code based on VIPs and AADM

Sparse Implicit Processes In this repository we include the two different versions of the SIP code developed for the article Sparse Implicit Processes

1 Aug 22, 2022
CTF challenges from redpwnCTF 2021

redpwnCTF 2021 Challenges This repository contains challenges from redpwnCTF 2021 in the rCDS format; challenge information is in the challenge.yaml f

redpwn 27 Dec 07, 2022
Cognate Detection Repository

Cognate Detection Repository Details This repository contains the data for two publications: Challenge Dataset of Cognates and False Friend Pairs from

Diptesh Kanojia 1 Apr 26, 2022
ARKitScenes - A Diverse Real-World Dataset for 3D Indoor Scene Understanding Using Mobile RGB-D Data

ARKitScenes This repo accompanies the research paper, ARKitScenes - A Diverse Real-World Dataset for 3D Indoor Scene Understanding Using Mobile RGB-D

Apple 371 Jan 05, 2023
Official pytorch implementation of Active Learning for deep object detection via probabilistic modeling (ICCV 2021)

Active Learning for Deep Object Detection via Probabilistic Modeling This repository is the official PyTorch implementation of Active Learning for Dee

NVIDIA Research Projects 130 Jan 06, 2023
Cluster-GCN: An Efficient Algorithm for Training Deep and Large Graph Convolutional Networks

Cluster-GCN: An Efficient Algorithm for Training Deep and Large Graph Convolutional Networks This repository contains a TensorFlow implementation of "

Jingwei Zheng 5 Jan 08, 2023
Instance-Dependent Partial Label Learning

Instance-Dependent Partial Label Learning Installation pip install -r requirements.txt Run the Demo benchmark-random mnist python -u main.py --gpu 0 -

17 Dec 29, 2022
code for our ECCV 2020 paper "A Balanced and Uncertainty-aware Approach for Partial Domain Adaptation"

Code for our ECCV (2020) paper A Balanced and Uncertainty-aware Approach for Partial Domain Adaptation. Prerequisites: python == 3.6.8 pytorch ==1.1.0

32 Nov 27, 2022
Official implementation of Meta-StyleSpeech and StyleSpeech

Meta-StyleSpeech : Multi-Speaker Adaptive Text-to-Speech Generation Dongchan Min, Dong Bok Lee, Eunho Yang, and Sung Ju Hwang This is an official code

min95 168 Dec 28, 2022
Trading Gym is an open source project for the development of reinforcement learning algorithms in the context of trading.

Trading Gym Trading Gym is an open-source project for the development of reinforcement learning algorithms in the context of trading. It is currently

Dimitry Foures 535 Nov 15, 2022
Modular Probabilistic Programming on MXNet

MXFusion | | | | Tutorials | Documentation | Contribution Guide MXFusion is a modular deep probabilistic programming library. With MXFusion Modules yo

Amazon 100 Dec 10, 2022
Official code for "Maximum Likelihood Training of Score-Based Diffusion Models", NeurIPS 2021 (spotlight)

Maximum Likelihood Training of Score-Based Diffusion Models This repo contains the official implementation for the paper Maximum Likelihood Training o

Yang Song 84 Dec 12, 2022
Sample and Computation Redistribution for Efficient Face Detection

Introduction SCRFD is an efficient high accuracy face detection approach which initially described in Arxiv. Performance Precision, flops and infer ti

Sajjad Aemmi 13 Mar 05, 2022