Pyramid Scene Parsing Network, CVPR2017.

Related tags

Deep LearningPSPNet
Overview

Pyramid Scene Parsing Network

by Hengshuang Zhao, Jianping Shi, Xiaojuan Qi, Xiaogang Wang, Jiaya Jia, details are in project page.

Introduction

This repository is for 'Pyramid Scene Parsing Network', which ranked 1st place in ImageNet Scene Parsing Challenge 2016. The code is modified from Caffe version of DeepLab v2 and yjxiong for evaluation. We merge the batch normalization layer named 'bn_layer' in the former one into the later one while keep the original 'batch_norm_layer' in the later one unchanged for compatibility. The difference is that 'bn_layer' contains four parameters as 'slope,bias,mean,variance' while 'batch_norm_layer' contains two parameters as 'mean,variance'. Several evaluation code is borrowed from MIT Scene Parsing.

PyTorch Version

Highly optimized PyTorch codebases available for semantic segmentation in repo: semseg, including full training and testing codes for PSPNet and PSANet.

Installation

For installation, please follow the instructions of Caffe and DeepLab v2. To enable cuDNN for GPU acceleration, cuDNN v4 is needed. If you meet error related with 'matio', please download and install matio as required in 'DeepLab v2'.

The code has been tested successfully on Ubuntu 14.04 and 12.04 with CUDA 7.0.

Usage

  1. Clone the repository:

    git clone https://github.com/hszhao/PSPNet.git
  2. Build Caffe and matcaffe:

    cd $PSPNET_ROOT
    cp Makefile.config.example Makefile.config
    vim Makefile.config
    make -j8 && make matcaffe
  3. Evaluation:

    • Evaluation code is in folder 'evaluation'.
    • Download trained models and put them in folder 'evaluation/model':
    • Modify the related paths in 'eval_all.m':
      • Mainly variables 'data_root' and 'eval_list', and your image list for evaluation should be similarity to that in folder 'evaluation/samplelist' if you use this evaluation code structure.
      • Matlab 'parfor' evaluation is used and the default GPUs are with ID [0:3]. Modify variable 'gpu_id_array' if needed. We assume that number of images can be divided by number of GPUs; if not, you can just pad your image list or switch to single GPU evaluation by set 'gpu_id_array' be length of one, and change 'parfor' to 'for' loop.
    cd evaluation
    vim eval_all.m
    • Run the evaluation scripts:
    ./run.sh
    
  4. Results:

    Prediction results will show in folder 'evaluation/mc_result' and the expected scores are:

    (single scale testing denotes as 'ss' and multiple scale testing denotes as 'ms')

    • PSPNet50 on ADE20K valset (mIoU/pAcc): 41.68/80.04 (ss) and 42.78/80.76 (ms)
    • PSPNet101 on VOC2012 testset (mIoU): 85.41 (ms)
    • PSPNet101 on cityscapes valset (mIoU/pAcc): 79.70/96.38 (ss) and 80.91/96.59 (ms)
  5. Demo video:

    Video processed by PSPNet101 on cityscapes dataset:

    Merge with colormap on side: Video1

    Alpha blending with value as 0.5: Video2

Citation

If PSPNet is useful for your research, please consider citing:

@inproceedings{zhao2017pspnet,
  title={Pyramid Scene Parsing Network},
  author={Zhao, Hengshuang and Shi, Jianping and Qi, Xiaojuan and Wang, Xiaogang and Jia, Jiaya},
  booktitle={CVPR},
  year={2017}
}

Questions

Please contact '[email protected]'

Codes for [NeurIPS'21] You are caught stealing my winning lottery ticket! Making a lottery ticket claim its ownership.

You are caught stealing my winning lottery ticket! Making a lottery ticket claim its ownership Codes for [NeurIPS'21] You are caught stealing my winni

VITA 8 Nov 01, 2022
A Factor Model for Persistence in Investment Manager Performance

Factor-Model-Manager-Performance A Factor Model for Persistence in Investment Manager Performance I apply methods and processes similar to those used

Omid Arhami 1 Dec 01, 2021
CUDA Python Low-level Bindings

CUDA Python Low-level Bindings

NVIDIA Corporation 529 Jan 03, 2023
The `rtdl` library + The official implementation of the paper

The `rtdl` library + The official implementation of the paper "Revisiting Deep Learning Models for Tabular Data"

Yandex Research 510 Dec 30, 2022
Food Drinks and groceries Images Multi Lingual (FooDI-ML) dataset.

Food Drinks and groceries Images Multi Lingual (FooDI-ML) dataset.

41 Jan 04, 2023
MetaDrive: Composing Diverse Scenarios for Generalizable Reinforcement Learning

MetaDrive: Composing Diverse Driving Scenarios for Generalizable RL [ Documentation | Demo Video ] MetaDrive is a driving simulator with the following

DeciForce: Crossroads of Machine Perception and Autonomy 276 Jan 04, 2023
ST++: Make Self-training Work Better for Semi-supervised Semantic Segmentation

ST++ This is the official PyTorch implementation of our paper: ST++: Make Self-training Work Better for Semi-supervised Semantic Segmentation. Lihe Ya

Lihe Yang 147 Jan 03, 2023
TSP: Temporally-Sensitive Pretraining of Video Encoders for Localization Tasks

TSP: Temporally-Sensitive Pretraining of Video Encoders for Localization Tasks [Paper] [Project Website] This repository holds the source code, pretra

Humam Alwassel 83 Dec 21, 2022
PyTorch implementation for the visual prior component (i.e. perception module) of the Visually Grounded Physics Learner [Li et al., 2020].

VGPL-Visual-Prior PyTorch implementation for the visual prior component (i.e. perception module) of the Visually Grounded Physics Learner (VGPL). Give

Toru 8 Dec 29, 2022
Official implementation of cosformer-attention in cosFormer: Rethinking Softmax in Attention

cosFormer Official implementation of cosformer-attention in cosFormer: Rethinking Softmax in Attention Update log 2022/2/28 Add core code License This

120 Dec 15, 2022
The repo contains the code of the ACL2020 paper `Dice Loss for Data-imbalanced NLP Tasks`

Dice Loss for NLP Tasks This repository contains code for Dice Loss for Data-imbalanced NLP Tasks at ACL2020. Setup Install Package Dependencies The c

223 Dec 17, 2022
Code and data of the EMNLP 2021 paper "Mind the Style of Text! Adversarial and Backdoor Attacks Based on Text Style Transfer"

StyleAttack Code and data of the EMNLP 2021 paper "Mind the Style of Text! Adversarial and Backdoor Attacks Based on Text Style Transfer" Prepare Pois

THUNLP 19 Nov 20, 2022
ML From Scratch

ML from Scratch MACHINE LEARNING TOPICS COVERED - FROM SCRATCH Linear Regression Logistic Regression K Means Clustering K Nearest Neighbours Decision

Tanishq Gautam 66 Nov 02, 2022
Various operations like path tracking, counting, etc by using yolov5

Object-tracing-with-YOLOv5 Various operations like path tracking, counting, etc by using yolov5

Pawan Valluri 5 Nov 28, 2022
Generative Adversarial Text-to-Image Synthesis

###Generative Adversarial Text-to-Image Synthesis Scott Reed, Zeynep Akata, Xinchen Yan, Lajanugen Logeswaran, Bernt Schiele, Honglak Lee This is the

Scott Ellison Reed 883 Dec 31, 2022
ManipulaTHOR, a framework that facilitates visual manipulation of objects using a robotic arm

ManipulaTHOR: A Framework for Visual Object Manipulation Kiana Ehsani, Winson Han, Alvaro Herrasti, Eli VanderBilt, Luca Weihs, Eric Kolve, Aniruddha

AI2 65 Dec 30, 2022
Unofficial implementation of "Coordinate Attention for Efficient Mobile Network Design"

Unofficial implementation of "Coordinate Attention for Efficient Mobile Network Design". CoordAttention tensorflow slim

Billy 9 Aug 22, 2022
null

DeformingThings4D dataset Video | Paper DeformingThings4D is an synthetic dataset containing 1,972 animation sequences spanning 31 categories of human

208 Jan 03, 2023
A trusty face recognition research platform developed by Tencent Youtu Lab

Introduction TFace: A trusty face recognition research platform developed by Tencent Youtu Lab. It provides a high-performance distributed training fr

Tencent 956 Jan 01, 2023
PyTorch implementation for ComboGAN

ComboGAN This is our ongoing PyTorch implementation for ComboGAN. Code was written by Asha Anoosheh (built upon CycleGAN) [ComboGAN Paper] If you use

Asha Anoosheh 139 Dec 20, 2022