Semantic Segmentation in Pytorch

Related tags

Deep Learningsemseg
Overview

PyTorch Semantic Segmentation

Introduction

This repository is a PyTorch implementation for semantic segmentation / scene parsing. The code is easy to use for training and testing on various datasets. The codebase mainly uses ResNet50/101/152 as backbone and can be easily adapted to other basic classification structures. Implemented networks including PSPNet and PSANet, which ranked 1st places in ImageNet Scene Parsing Challenge 2016 @ECCV16, LSUN Semantic Segmentation Challenge 2017 @CVPR17 and WAD Drivable Area Segmentation Challenge 2018 @CVPR18. Sample experimented datasets are ADE20K, PASCAL VOC 2012 and Cityscapes.

Update

  • 2020.05.15: Branch master, use official nn.SyncBatchNorm, only multiprocessing training is supported, tested with pytorch 1.4.0.
  • 2019.05.29: Branch 1.0.0, both multithreading training (nn.DataParallel) and multiprocessing training (nn.parallel.DistributedDataParallel) (recommended) are supported. And the later one is much faster. Use syncbn from EncNet and apex, tested with pytorch 1.0.0.

Usage

  1. Highlight:

  2. Requirement:

    • Hardware: 4-8 GPUs (better with >=11G GPU memory)
    • Software: PyTorch>=1.1.0, Python3, tensorboardX,
  3. Clone the repository:

    git clone https://github.com/hszhao/semseg.git
  4. Train:

    • Download related datasets and symlink the paths to them as follows (you can alternatively modify the relevant paths specified in folder config):

      cd semseg
      mkdir -p dataset
      ln -s /path_to_ade20k_dataset dataset/ade20k
      
    • Download ImageNet pre-trained models and put them under folder initmodel for weight initialization. Remember to use the right dataset format detailed in FAQ.md.

    • Specify the gpu used in config then do training:

      sh tool/train.sh ade20k pspnet50
    • If you are using SLURM for nodes manager, uncomment lines in train.sh and then do training:

      sbatch tool/train.sh ade20k pspnet50
  5. Test:

    • Download trained segmentation models and put them under folder specified in config or modify the specified paths.

    • For full testing (get listed performance):

      sh tool/test.sh ade20k pspnet50
    • Quick demo on one image:

      PYTHONPATH=./ python tool/demo.py --config=config/ade20k/ade20k_pspnet50.yaml --image=figure/demo/ADE_val_00001515.jpg TEST.scales '[1.0]'
  6. Visualization: tensorboardX incorporated for better visualization.

    tensorboard --logdir=exp/ade20k
  7. Other:

    • Resources: GoogleDrive LINK contains shared models, visual predictions and data lists.
    • Models: ImageNet pre-trained models and trained segmentation models can be accessed. Note that our ImageNet pretrained models are slightly different from original ResNet implementation in the beginning part.
    • Predictions: Visual predictions of several models can be accessed.
    • Datasets: attributes (names and colors) are in folder dataset and some sample lists can be accessed.
    • Some FAQs: FAQ.md.
    • Former video predictions: high accuracy -- PSPNet, PSANet; high efficiency -- ICNet.

Performance

Description: mIoU/mAcc/aAcc stands for mean IoU, mean accuracy of each class and all pixel accuracy respectively. ss denotes single scale testing and ms indicates multi-scale testing. Training time is measured on a sever with 8 GeForce RTX 2080 Ti. General parameters cross different datasets are listed below:

  • Train Parameters: sync_bn(True), scale_min(0.5), scale_max(2.0), rotate_min(-10), rotate_max(10), zoom_factor(8), ignore_label(255), aux_weight(0.4), batch_size(16), base_lr(1e-2), power(0.9), momentum(0.9), weight_decay(1e-4).
  • Test Parameters: ignore_label(255), scales(single: [1.0], multiple: [0.5 0.75 1.0 1.25 1.5 1.75]).
  1. ADE20K: Train Parameters: classes(150), train_h(473/465-PSP/A), train_w(473/465-PSP/A), epochs(100). Test Parameters: classes(150), test_h(473/465-PSP/A), test_w(473/465-PSP/A), base_size(512).

    • Setting: train on train (20210 images) set and test on val (2000 images) set.
    Network mIoU/mAcc/aAcc(ss) mIoU/mAcc/pAcc(ms) Training Time
    PSPNet50 0.4189/0.5227/0.8039. 0.4284/0.5266/0.8106. 14h
    PSANet50 0.4229/0.5307/0.8032. 0.4305/0.5312/0.8101. 14h
    PSPNet101 0.4310/0.5375/0.8107. 0.4415/0.5426/0.8172. 20h
    PSANet101 0.4337/0.5385/0.8102. 0.4414/0.5392/0.8170. 20h
  2. PSACAL VOC 2012: Train Parameters: classes(21), train_h(473/465-PSP/A), train_w(473/465-PSP/A), epochs(50). Test Parameters: classes(21), test_h(473/465-PSP/A), test_w(473/465-PSP/A), base_size(512).

    • Setting: train on train_aug (10582 images) set and test on val (1449 images) set.
    Network mIoU/mAcc/aAcc(ss) mIoU/mAcc/pAcc(ms) Training Time
    PSPNet50 0.7705/0.8513/0.9489. 0.7802/0.8580/0.9513. 3.3h
    PSANet50 0.7725/0.8569/0.9491. 0.7787/0.8606/0.9508. 3.3h
    PSPNet101 0.7907/0.8636/0.9534. 0.7963/0.8677/0.9550. 5h
    PSANet101 0.7870/0.8642/0.9528. 0.7966/0.8696/0.9549. 5h
  3. Cityscapes: Train Parameters: classes(19), train_h(713/709-PSP/A), train_w(713/709-PSP/A), epochs(200). Test Parameters: classes(19), test_h(713/709-PSP/A), test_w(713/709-PSP/A), base_size(2048).

    • Setting: train on fine_train (2975 images) set and test on fine_val (500 images) set.
    Network mIoU/mAcc/aAcc(ss) mIoU/mAcc/pAcc(ms) Training Time
    PSPNet50 0.7730/0.8431/0.9597. 0.7838/0.8486/0.9617. 7h
    PSANet50 0.7745/0.8461/0.9600. 0.7818/0.8487/0.9622. 7.5h
    PSPNet101 0.7863/0.8577/0.9614. 0.7929/0.8591/0.9638. 10h
    PSANet101 0.7842/0.8599/0.9621. 0.7940/0.8631/0.9644. 10.5h

Citation

If you find the code or trained models useful, please consider citing:

@misc{semseg2019,
  author={Zhao, Hengshuang},
  title={semseg},
  howpublished={\url{https://github.com/hszhao/semseg}},
  year={2019}
}
@inproceedings{zhao2017pspnet,
  title={Pyramid Scene Parsing Network},
  author={Zhao, Hengshuang and Shi, Jianping and Qi, Xiaojuan and Wang, Xiaogang and Jia, Jiaya},
  booktitle={CVPR},
  year={2017}
}
@inproceedings{zhao2018psanet,
  title={{PSANet}: Point-wise Spatial Attention Network for Scene Parsing},
  author={Zhao, Hengshuang and Zhang, Yi and Liu, Shu and Shi, Jianping and Loy, Chen Change and Lin, Dahua and Jia, Jiaya},
  booktitle={ECCV},
  year={2018}
}

Question

Some FAQ.md collected. You are welcome to send pull requests or give some advices. Contact information: hengshuangzhao at gmail.com.

Owner
Hengshuang Zhao
Hengshuang Zhao
Seach Losses of our paper 'Loss Function Discovery for Object Detection via Convergence-Simulation Driven Search', accepted by ICLR 2021.

CSE-Autoloss Designing proper loss functions for vision tasks has been a long-standing research direction to advance the capability of existing models

Peidong Liu(刘沛东) 54 Dec 17, 2022
TFOD-MASKRCNN - Tensorflow MaskRCNN With Python

Tensorflow- MaskRCNN Steps git clone https://github.com/amalaj7/TFOD-MASKRCNN.gi

Amal Ajay 2 Jan 18, 2022
Implementation of ProteinBERT in Pytorch

ProteinBERT - Pytorch (wip) Implementation of ProteinBERT in Pytorch. Original Repository Install $ pip install protein-bert-pytorch Usage import torc

Phil Wang 92 Dec 25, 2022
Easily Process a Batch of Cox Models

ezcox: Easily Process a Batch of Cox Models The goal of ezcox is to operate a batch of univariate or multivariate Cox models and return tidy result. ⏬

Shixiang Wang 15 May 23, 2022
PyTorch Implementation of "Non-Autoregressive Neural Machine Translation"

Non-Autoregressive Transformer Code release for Non-Autoregressive Neural Machine Translation by Jiatao Gu, James Bradbury, Caiming Xiong, Victor O.K.

Salesforce 261 Nov 12, 2022
Epidemiology analysis package

zEpid zEpid is an epidemiology analysis package, providing easy to use tools for epidemiologists coding in Python 3.5+. The purpose of this library is

Paul Zivich 111 Jan 08, 2023
MT3: Multi-Task Multitrack Music Transcription

MT3: Multi-Task Multitrack Music Transcription MT3 is a multi-instrument automatic music transcription model that uses the T5X framework. This is not

Magenta 867 Dec 29, 2022
[NeurIPS 2019] Learning Imbalanced Datasets with Label-Distribution-Aware Margin Loss

Learning Imbalanced Datasets with Label-Distribution-Aware Margin Loss Kaidi Cao, Colin Wei, Adrien Gaidon, Nikos Arechiga, Tengyu Ma This is the offi

Kaidi Cao 528 Jan 01, 2023
Code for "Human Pose Regression with Residual Log-likelihood Estimation", ICCV 2021 Oral

Human Pose Regression with Residual Log-likelihood Estimation [Paper] [arXiv] [Project Page] Human Pose Regression with Residual Log-likelihood Estima

JeffLi 347 Dec 24, 2022
Extension to fastai for volumetric medical data

FAIMED 3D use fastai to quickly train fully three-dimensional models on radiological data Classification from faimed3d.all import * Load data in vari

Keno 26 Aug 22, 2022
CLIP2Video: Mastering Video-Text Retrieval via Image CLIP

CLIP2Video: Mastering Video-Text Retrieval via Image CLIP The implementation of paper CLIP2Video: Mastering Video-Text Retrieval via Image CLIP. CLIP2

168 Dec 29, 2022
PyTorch implementation of EigenGAN

PyTorch Implementation of EigenGAN Train python train.py [image_folder_path] --name [experiment name] Test python test.py [ckpt path] --traverse FFH

62 Nov 12, 2022
Blender Python - Node-based multi-line text and image flowchart

MindMapper v0.8 Node-based text and image flowchart for Blender Mindmap with shortcuts visible: Mindmap with shortcuts hidden: Notes This was requeste

SpectralVectors 58 Oct 08, 2022
Users can free try their models on SIDD dataset based on this code

SIDD benchmark 1 Train python train.py If you want to train your network, just modify the yaml in the options folder. 2 Validation python validation.p

Yuzhi ZHAO 2 May 20, 2022
This repo is a PyTorch implementation for Paper "Unsupervised Learning for Cuboid Shape Abstraction via Joint Segmentation from Point Clouds"

Unsupervised Learning for Cuboid Shape Abstraction via Joint Segmentation from Point Clouds This repository is a PyTorch implementation for paper: Uns

Kaizhi Yang 42 Dec 09, 2022
using yolox+deepsort for object-tracker

YOLOX_deepsort_tracker yolox+deepsort实现目标跟踪 最新的yolox尝尝鲜~~(yolox正处在频繁更新阶段,因此直接链接yolox仓库作为子模块) Install Clone the repository recursively: git clone --rec

245 Dec 26, 2022
FridaHookAppTool - Frida Hook App Tool With Python

FridaHookAppTool(以下是Hook mpaas框架的例子) mpaas移动开发框架ios端抓包hook脚本 使用方法:链接数据线,开启burp设置

13 Nov 30, 2022
Code for ICLR2018 paper: Improving GAN Training via Binarized Representation Entropy (BRE) Regularization - Y. Cao · W Ding · Y.C. Lui · R. Huang

code for "Improving GAN Training via Binarized Representation Entropy (BRE) Regularization" (ICLR2018 paper) paper: https://arxiv.org/abs/1805.03644 G

21 Oct 12, 2020
A Gura parser implementation for Python

Gura Python parser This repository contains the implementation of a Gura (compliant with version 1.0.0) format parser in Python. Installation pip inst

Gura Config Lang 19 Jan 25, 2022
Solutions of Reinforcement Learning 2nd Edition

Solutions of Reinforcement Learning, An Introduction

YIFAN WANG 1.4k Dec 30, 2022