Polyp-PVT: Polyp Segmentation with Pyramid Vision Transformers (arXiv2021)

Last update: Jan 05, 2023

Polyp-PVT

by Bo Dong, Wenhai Wang, Deng-Ping Fan, Jinpeng Li, Huazhu Fu, & Ling Shao.

This repo is the official implementation of "Polyp-PVT: Polyp Segmentation with Pyramid Vision Transformers".

1. Introduction

Polyp-PVT was first described in an arXiv preprint (arXiv:2108.06932).

Most polyp segmentation methods use CNNs as their backbone, which leads to two key issues when exchanging information between the encoder and decoder: 1) accounting for the differing contributions of features at different levels; and 2) designing an effective mechanism for fusing these features. Unlike existing CNN-based methods, we adopt a transformer encoder, which learns more powerful and robust representations. In addition, considering the influence of image acquisition and the elusive properties of polyps, we introduce three novel modules: a cascaded fusion module (CFM), a camouflage identification module (CIM), and a similarity aggregation module (SAM). Among these, the CFM is used to collect the semantic and location information of polyps from high-level features, while the CIM is applied to capture polyp information disguised in low-level features. With the help of the SAM, we extend the pixel features of the polyp area with high-level semantic position information to the entire polyp area, thereby effectively fusing cross-level features. The proposed model, named Polyp-PVT, effectively suppresses noise in the features and significantly improves their expressive capabilities.

Polyp-PVT achieves strong performance on image-level polyp segmentation (0.808 mean Dice and 0.727 mean IoU on ColonDB) and video polyp segmentation (0.880 mean Dice and 0.802 mean IoU on CVC-300-TV), surpassing previous models by a large margin.

2. Framework Overview

3. Results

3.1 Image-level Polyp Segmentation

3.2 Image-level Polyp Segmentation Compared Results:

We also provide results for several baseline methods. You can download them from Google Drive/Baidu Drive [code:nhhv]; the archive includes our results and those of the compared models.

3.3 Video Polyp Segmentation

3.4 Video Polyp Segmentation Compared Results:

We also provide results for several baseline methods. You can download them from Google Drive/Baidu Drive [code:33ie]; the archive includes our results and those of the compared models.

4. Usage:

4.1 Recommended environment:

Python 3.8
PyTorch 1.7.1
torchvision 0.8.2
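
Before training, it can help to confirm what is actually installed. The following is a minimal sketch, not part of the repository: it only reads package metadata through the standard library, and the helper names `installed_versions` and `report` are hypothetical. The package names `torch` and `torchvision` are assumed to be the usual PyPI distribution names.

```python
from importlib import metadata
import sys

# Versions recommended in section 4.1 (package names are assumed PyPI names).
RECOMMENDED = {"torch": "1.7.1", "torchvision": "0.8.2"}


def installed_versions(packages):
    """Return {package: installed version, or None if it is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def report(recommended, found):
    """Build one status line per package, comparing against the recommendation."""
    lines = []
    for name, wanted in recommended.items():
        have = found.get(name)
        if have is None:
            status = "not installed"
        elif have.split("+")[0] == wanted:
            # Strip local build tags such as "+cu110" before comparing.
            status = "matches"
        else:
            status = "differs (may still work)"
        lines.append(f"{name}: recommended {wanted}, found {have} -> {status}")
    return lines


if __name__ == "__main__":
    print("python:", sys.version.split()[0], "(recommended 3.8)")
    for line in report(RECOMMENDED, installed_versions(RECOMMENDED)):
        print(line)
```

A version that differs from the recommendation is reported rather than rejected, since the list above is a recommendation and other versions may work.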

4.2 Data preparation:

Download the training and testing datasets from Google Drive/Baidu Drive [code:dr1h] and move them into ./dataset/.

4.3 Pretrained model:

Download the pretrained model from Google Drive/Baidu Drive [code:w4vk] and put it in the './pretrained_pth' folder; it is used for initialization.
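
A quick way to confirm that the downloads from sections 4.2 and 4.3 landed in the right place is sketched below. It is an illustration only: the helper `missing_dirs` is hypothetical, and it checks just that the two folders named above exist and are non-empty, not what they contain.

```python
from pathlib import Path

# Folder names taken from sections 4.2 and 4.3; their contents are not inspected.
EXPECTED_DIRS = ("dataset", "pretrained_pth")


def missing_dirs(repo_root, expected=EXPECTED_DIRS):
    """Return the expected folders that are absent or empty under repo_root."""
    root = Path(repo_root)
    missing = []
    for name in expected:
        folder = root / name
        if not folder.is_dir() or not any(folder.iterdir()):
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Run from the repository root, i.e. the Polyp-PVT directory.
    problems = missing_dirs(".")
    if problems:
        print("Missing or empty:", ", ".join(problems))
    else:
        print("dataset/ and pretrained_pth/ are in place.")
```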

4.4 Training:

Clone the repository:

git clone https://github.com/DengPingFan/Polyp-PVT.git
cd Polyp-PVT 
bash train.sh

4.5 Testing:

cd Polyp-PVT 
bash test.sh

4.6 Evaluating your trained model:

Matlab: Please refer to the work of MICCAI2020 (link).

Python: Please refer to the work of ACMMM2021 (link).

Please note that we use the Matlab version to evaluate in our paper.
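
For readers unfamiliar with the two headline metrics, the sketch below shows the standard definitions of Dice and IoU on binary masks, written in plain Python. It is not the evaluation code used in the paper: the toolboxes referenced above may binarize predictions and average scores differently, so numbers from this sketch should not be compared directly with the reported results. The function names are hypothetical.

```python
def dice_and_iou(pred, gt):
    """Dice and IoU for two binary masks given as equally sized nested lists of 0/1."""
    inter = pred_sum = gt_sum = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            inter += p & g
            pred_sum += p
            gt_sum += g
    union = pred_sum + gt_sum - inter
    # Two empty masks agree perfectly, so score them as 1.0 instead of dividing by zero.
    dice = 2 * inter / (pred_sum + gt_sum) if pred_sum + gt_sum else 1.0
    iou = inter / union if union else 1.0
    return dice, iou


def mean_scores(pairs):
    """Average Dice and IoU over a non-empty list of (pred, gt) mask pairs."""
    scores = [dice_and_iou(p, g) for p, g in pairs]
    n = len(scores)
    return sum(d for d, _ in scores) / n, sum(i for _, i in scores) / n


# Toy example: 3 predicted pixels, 3 ground-truth pixels, 2 of them shared.
pred = [[1, 1, 0],
        [0, 1, 0]]
gt = [[1, 0, 0],
      [0, 1, 1]]
print(dice_and_iou(pred, gt))
```

In the toy example the intersection is 2 pixels and the union is 4, so Dice is 2*2/(3+3) and IoU is 2/4.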

4.7 Well trained model:

You can download the trained model from Google Drive/Baidu Drive [code:9rpy] and put it in the './model_pth' directory.

4.8 Pre-computed maps:

Google Drive/Baidu Drive [code:x3jc]

5. Citation:

@article{dong2021PolypPVT,
  title={Polyp-PVT: Polyp Segmentation with Pyramid Vision Transformers},
  author={Dong, Bo and Wang, Wenhai and Fan, Deng-Ping and Li, Jinpeng and Fu, Huazhu and Shao, Ling},
  journal={arXiv preprint arXiv:2108.06932},
  year={2021}
}

6. Acknowledgement

We are very grateful for the excellent works PraNet, EAGRNet, and MSEG, which provided the basis for our framework.

7. FAQ:

If you have suggestions for improving usability, or any other advice, please feel free to contact me directly ([email protected]).

8. License

The source code is free for research and education use only. Any commercial use requires formal permission first.


Owner

Deng-Ping Fan
