TensorFlow-based implementation of "Pyramid Scene Parsing Network".

Last update: Dec 20, 2022

Overview

PSPNet_tensorflow

Important

The code works well for inference. However, the training code is for reference only and is probably suitable only for fine-tuning. To train from scratch, you first need to implement a synchronized batch normalization (Sync BN) layer to support large batch-size training, as described in the paper. It seems that this repo has reproduced it; you can take a look at it.

Introduction

This is an implementation of PSPNet in TensorFlow for semantic segmentation on the cityscapes dataset. The weights were first converted from the original code using the caffe-tensorflow framework.

Update:

News (2018.11.08 updated):

Now you can try PSPNet on your own images online using the ModelDepot live demo!

2018/01/24:

Support evaluation code for ade20k dataset

2018/01/19:

Support inference on the ade20k dataset using the pspnet50 model (weights converted from the original author's release).
Use tf.matmul to decode labels, which speeds up inference.
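The idea of decoding labels with a matrix product can be sketched as follows. This is a minimal illustration in NumPy rather than the repository's actual tf.matmul code: the label map is one-hot encoded and multiplied by a color palette, so each pixel picks up its palette row in one operation. The three-color PALETTE and the decode_labels name are assumptions made for this example.

```python
import numpy as np

# Hypothetical 3-class RGB palette; the real cityscapes/ade20k palettes
# are not listed in this README.
PALETTE = np.array(
    [[128, 64, 128], [244, 35, 232], [70, 70, 70]], dtype=np.float32
)


def decode_labels(label_map, palette=PALETTE):
    """Turn an (H, W) integer label map into an (H, W, 3) uint8 color image."""
    num_classes = palette.shape[0]
    h, w = label_map.shape
    # One-hot encode every pixel: shape (H*W, num_classes).
    onehot = np.eye(num_classes, dtype=np.float32)[label_map.reshape(-1)]
    # A single matrix product selects the palette row of each pixel.
    colors = onehot @ palette  # shape (H*W, 3)
    return colors.reshape(h, w, 3).astype(np.uint8)


labels = np.array([[0, 1], [2, 0]])
image = decode_labels(labels)
```

The product replaces a per-pixel lookup loop with one dense operation, which is the kind of change that tends to run faster on a GPU.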

2017/11/06:

Support different input sizes by padding the input image to (720, 720) when it is smaller than that, then cropping the result back to the original size at the end.
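A minimal sketch of this pad-then-crop step, written in NumPy. Only the (720, 720) target comes from the note above. The zero padding, its placement at the bottom and right, and the function names pad_to_target and crop_to_original are assumptions for illustration.

```python
import numpy as np

TARGET = (720, 720)  # target size from the update note


def pad_to_target(img, target=TARGET):
    """Zero-pad an (H, W, C) image at the bottom/right up to the target size.

    Padding position and fill value are assumptions of this sketch.
    """
    h, w = img.shape[:2]
    pad_h = max(target[0] - h, 0)
    pad_w = max(target[1] - w, 0)
    padded = np.pad(img, ((0, pad_h), (0, pad_w), (0, 0)), mode="constant")
    return padded, (h, w)


def crop_to_original(pred, original_shape):
    """Cut a padded prediction back to the original height and width."""
    h, w = original_shape
    return pred[:h, :w]


img = np.ones((500, 600, 3), dtype=np.float32)
padded, shape = pad_to_target(img)
pred = padded.sum(axis=-1)  # stand-in for the per-pixel network output
result = crop_to_original(pred, shape)
```

Because the padding is only added on one side of each axis, the crop is a plain slice and the output lines up pixel for pixel with the input.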

2017/10/27:

Changed the BN layer from tf.nn.batch_normalization to tf.layers.batch_normalization in order to support the training phase. Also updated the initial model in Google Drive.

Install

Download the checkpoint to restore from Google Drive and put it into the model directory. Note: select the checkpoint that corresponds to your dataset.

Inference

To get result on your own images, use the following command:

python inference.py --img-path=./input/test.png --dataset cityscapes

Inference time: ~0.6s

Options:

--dataset cityscapes or ade20k
--flipped-eval 
--checkpoints /PATH/TO/CHECKPOINT_DIR
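For readers who want to see how such options fit together, here is a hypothetical argparse sketch. It is not copied from inference.py. The flag names come from this README, while the help texts, the required settings, and the "./model" default are assumptions.

```python
import argparse


def build_parser():
    """Build a parser for the options listed above (illustrative only)."""
    parser = argparse.ArgumentParser(description="PSPNet inference sketch")
    parser.add_argument("--img-path", required=True,
                        help="Path to the input image")
    parser.add_argument("--dataset", required=True,
                        choices=["cityscapes", "ade20k"],
                        help="Dataset the checkpoint was trained on")
    parser.add_argument("--flipped-eval", action="store_true",
                        help="Also run on the mirrored image")
    # The default directory below is an assumption, not the script's value.
    parser.add_argument("--checkpoints", default="./model",
                        help="Directory that holds the checkpoint")
    return parser


args = build_parser().parse_args(
    ["--img-path=./input/test.png", "--dataset", "cityscapes"]
)
```

Restricting --dataset with choices makes a typo fail at parse time rather than after the model has been loaded.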

Evaluation

Cityscapes

Evaluated with a single-scale model on the cityscapes validation dataset.

Method	Accuracy
Without flip	76.99%
Flip	77.23%

ade20k

Method	Accuracy
Without flip	40.00%
Flip	40.67%
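The tables do not state which metric "Accuracy" refers to. Segmentation benchmarks commonly report mean intersection over union (mIoU), so, assuming that is the metric meant here, the sketch below shows how it can be computed from a confusion matrix. The mean_iou function is a hypothetical helper and not the code in evaluate.py.

```python
import numpy as np


def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes that occur in pred or target."""
    pred = np.asarray(pred).reshape(-1)
    target = np.asarray(target).reshape(-1)
    # Confusion matrix: rows are ground truth, columns are predictions.
    index = target * num_classes + pred
    cm = np.bincount(index, minlength=num_classes ** 2)
    cm = cm.reshape(num_classes, num_classes)
    true_pos = np.diag(cm)
    # Union = predicted as class + labelled as class - counted twice.
    union = cm.sum(axis=0) + cm.sum(axis=1) - true_pos
    valid = union > 0  # skip classes absent from both maps
    return float((true_pos[valid] / union[valid]).mean())


target = np.array([[0, 0], [1, 1]])
pred = np.array([[0, 1], [1, 1]])
score = mean_iou(pred, target, num_classes=2)
```

In this toy case class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12.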

To reproduce the evaluation results, follow these steps:

Download the Cityscapes dataset or the ADE20k dataset first.
Change data_dir to your dataset path in evaluate.py:

'data_dir': '/Path/to/dataset'

Run the following command:

python evaluate.py --dataset cityscapes

List of Args:

--dataset - ade20k or cityscapes
--flipped-eval  - Using flipped evaluation method
--measure-time  - Calculate inference time
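Flipped evaluation usually means running the network on the image and on its horizontal mirror, then averaging the two score maps. The NumPy sketch below illustrates that idea under this assumption. The flipped_predict function and the toy_predict stand-in are hypothetical and do not reproduce the repository's code.

```python
import numpy as np


def flipped_predict(predict_fn, image):
    """Average class scores over an image and its horizontal mirror."""
    scores = predict_fn(image)                   # (H, W, num_classes)
    mirrored_scores = predict_fn(image[:, ::-1])  # predict on mirrored input
    # Mirror the second score map back so both are pixel-aligned.
    averaged = (scores + mirrored_scores[:, ::-1]) / 2.0
    return averaged.argmax(axis=-1)


def toy_predict(image):
    """Stand-in for the network: two class scores based on brightness."""
    gray = image.mean(axis=-1)
    return np.stack([gray, 1.0 - gray], axis=-1)


image = np.zeros((2, 4, 3))
image[:, :2] = 1.0  # bright left half, dark right half
labels = flipped_predict(toy_predict, image)
```

This doubles the inference cost per image, which is consistent with the small accuracy gain in the "Flip" rows of the tables above.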

Image Result

[Figures: input image and output image pairs for cityscapes, ade20k, and real-world examples.]

Citation

@inproceedings{zhao2017pspnet,
  author = {Hengshuang Zhao and
            Jianping Shi and
            Xiaojuan Qi and
            Xiaogang Wang and
            Jiaya Jia},
  title = {Pyramid Scene Parsing Network},
  booktitle = {Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year = {2017}
}

Scene Parsing through ADE20K Dataset. B. Zhou, H. Zhao, X. Puig, S. Fidler, A. Barriuso and A. Torralba. Computer Vision and Pattern Recognition (CVPR), 2017. (http://people.csail.mit.edu/bzhou/publication/scene-parse-camera-ready.pdf)

@inproceedings{zhou2017scene,
    title={Scene Parsing through ADE20K Dataset},
    author={Zhou, Bolei and Zhao, Hang and Puig, Xavier and Fidler, Sanja and Barriuso, Adela and Torralba, Antonio},
    booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
    year={2017}
}

Semantic Understanding of Scenes through ADE20K Dataset. B. Zhou, H. Zhao, X. Puig, S. Fidler, A. Barriuso and A. Torralba. arXiv:1608.05442. (https://arxiv.org/pdf/1608.05442.pdf)

@article{zhou2016semantic,
  title={Semantic understanding of scenes through the ade20k dataset},
  author={Zhou, Bolei and Zhao, Hang and Puig, Xavier and Fidler, Sanja and Barriuso, Adela and Torralba, Antonio},
  journal={arXiv preprint arXiv:1608.05442},
  year={2016}
}

Owner

HsuanKung Yang
