Training PSPNet in Tensorflow. Reproduce the performance from the paper.

Overview

Training Reproduce of PSPNet.

(Updated 2021/04/09. Authors of PSPNet have provided a Pytorch implementation for PSPNet and their new work with supporting Sync Batch Norm, see https://github.com/hszhao/semseg.)

(Updated 2019/02/26. A major change of code structure. For the version before, checkout v0.9 https://github.com/holyseven/PSPNet-TF-Reproduce/tree/v0.9.)

This is an implementation of PSPNet (from training to test) in pure Tensorflow library (tested on TF1.12, Python 3).

  • Supported Backbones: ResNet-V1-50, ResNet-V1-101 and other ResNet-V1s can be easily added.
  • Supported Databases: ADE20K, SBD (Augmented Pascal VOC) and Cityscapes.
  • Supported Modes: training, validation and inference with multi-scale inputs.
  • More things: L2-SP regularization and sync batch normalization implementation.

L2-SP Regularization

L2-SP regularization is a variant of L2 regularization. Instead of the origin like L2 does, L2-SP sets the pre-trained model as reference, just like (w - w0)^2, where w0 is the pre-trained model. Simple but effective. More details about L2-SP can be found in the paper and the code.

If you find the L2-SP useful for your research (not limited in image segmentation), please consider citing our work:

@inproceedings{li2018explicit,
  author    = {Li, Xuhong and Grandvalet, Yves and Davoine, Franck},
  title     = {Explicit Inductive Bias for Transfer Learning with Convolutional Networks},
  booktitle={International Conference on Machine Learning (ICML)},
   pages     = {2830--2839},
  year      = {2018}
}

Sync Batch Norm

When concerning image segmentation, batch size is usually limited. Small batch size will make the gradients instable and harm the performance, especially for batch normalization layers. Multi-GPU settings by default does not help because the statistics in batch normalization layer are computed independently within each GPU. More discussion can be found here and here.

This repo resolves this problem in pure python and pure Tensorflow by simply using a list as input. The main idea is located in model/utils_mg.py

I do not know if this is the first implementation of sync batch norm in Tensorflow, but there is already an implementation in PyTorch and some applications.

Update: There is other implementation that uses NCCL to gather statistics across GPUs, see in tensorpack. However, TF1.1 does not support gradients passing by nccl_all_reduce. Plus, ppc64le with tf1.10, cuda9.0 and nccl1.3.5 was not able to run this code. No idea why, and do not want to spend a lot of time on this. Maybe nccl2 can solve this.

Results

Numerical Results

  • Random scaling for all
  • Random rotation for SBD
  • SS/MS on validation set
  • Welcome to correct and fill in the table
Backbones L2 L2-SP
Cityscapes (train set: 3K) ResNet-50 76.9/? 77.9/?
ResNet-101 77.9/? 78.6/?
Cityscapes (coarse + train set: 20K + 3K) ResNet-50
ResNet-101 80.0/80.9 80.1/81.2*
SBD ResNet-50 76.5/? 76.6/?
ResNet-101 77.5/79.2 78.5/79.9
ADE20K ResNet-50 41.92/43.09
ResNet-101 42.80/?

*This model gets 80.3 without post-processing methods on Cityscapes test set (1525).

Qualitative Results on Cityscapes

Devil Details

Training and Evaluation

Download the databases with the links: ADE20K, SBD (Augmented Pascal VOC) and Cityscapes.

Prepare the database for Cityscapes by generating *labelTrainIds.png images with createTrainIdLabelImgs, and then change the code in database/reader.py or move undersired images to other directory.

Download pretrained models.

cd z_pretrained_weights
sh download_resnet_v1_101.sh

A script of training resnet-50 on ADE20K, getting around 41.92 mIoU scores (with single-scale test):

python ./run.py --network 'resnet_v1_50' --visible_gpus '0,1' --reader_method 'queue' --lrn_rate 0.01 --weight_decay_mode 0 --weight_decay_rate 0.0001 --weight_decay_rate2 0.001 --database 'ADE' --subsets_for_training 'train' --batch_size 8 --train_image_size 480 --snapshot 30000 --train_max_iter 90000 --test_image_size 480 --random_rotate 0 --fine_tune_filename './z_pretrained_weights/resnet_v1_50.ckpt'

Test and Infer

Test with multi-scale (set batch_size as large as you can to speed up).

python predict.py --visible_gpus '0' --network 'resnet_v1_101' --database 'ADE' --weights_ckpt './log/ADE/PSP-resnet_v1_101-gpu_num2-batch_size8-lrn_rate0.01-random_scale1-random_rotate1-480-60000-train-1-0.0001-0.001-0-0-1-1/snapshot/model.ckpt-60000' --test_subset 'val' --test_image_size 480 --batch_size 8 --ms 1 --mirror 1

Infer one image (with multi-scale).

python demo_infer.py --database 'Cityscapes' --network 'resnet_v1_101' --weights_ckpt './log/Cityscapes/old/model.ckpt-50000' --test_image_size 864 --batch_size 4 --ms 1

Uncertainties for Training Details:

  1. (Cityscapes only) Whether finely labeled data in the first training stage should be involved?
  2. (Cityscapes only) Whether the (base) learning rate should be reduced in the second training stage?
  3. Whether logits should be resized to original size before computing the loss?
  4. Whether new layers should receive larger learning rate?
  5. About weired padding behavior of tf.image.resize_images(). Whether the align_corners=True should be set?
  6. What is optimal hyperparameter of decay for statistics of batch normalization layers? (0.9, 0.95, 0.9997)
  7. may be more but not sure how much these little changes can effect the results ...
  8. Welcome to discuss !

Change Log

26 Febuary, 2019

  • Code structure: on-the-fly evaluation during training.
  • Code structure: wrapping of the model.
  • Add tf.data support, but with queue-based reader is faster.
  • print results using python utils.py in experiment_manager dir.
  • The default environment is Python 3 and TF1.12. OpenCV is needed for predicting and demo_infer.
  • The previous version becomes a branch of this repo named as v0.9.

External links

Pyramid Scene Parsing Network paper and official github.

Owner
Li Xuhong
Researcher at Baidu Research, focus on interpretable deep learning and transfer learning.
Li Xuhong
Code for "FGR: Frustum-Aware Geometric Reasoning for Weakly Supervised 3D Vehicle Detection", ICRA 2021

FGR This repository contains the python implementation for paper "FGR: Frustum-Aware Geometric Reasoning for Weakly Supervised 3D Vehicle Detection"(I

Yi Wei 31 Dec 08, 2022
Improving Query Representations for DenseRetrieval with Pseudo Relevance Feedback:A Reproducibility Study.

APR The repo for the paper Improving Query Representations for DenseRetrieval with Pseudo Relevance Feedback:A Reproducibility Study. Environment setu

ielab 8 Nov 26, 2022
(IEEE TIP 2021) Regularized Densely-connected Pyramid Network for Salient Instance Segmentation

RDPNet IEEE TIP 2021: Regularized Densely-connected Pyramid Network for Salient Instance Segmentation PyTorch training and testing code are available.

Yu-Huan Wu 41 Oct 21, 2022
Using Machine Learning to Create High-Res Fine Art

BIG.art: Using Machine Learning to Create High-Res Fine Art How to use GLIDE and BSRGAN to create ultra-high-resolution paintings with fine details By

Robert A. Gonsalves 13 Nov 27, 2022
An AI made using artificial intelligence (AI) and machine learning algorithms (ML) .

DTech.AIML An AI made using artificial intelligence (AI) and machine learning algorithms (ML) . This is created by help of some members in my team and

1 Jan 06, 2022
Adversarial Texture Optimization from RGB-D Scans (CVPR 2020).

AdversarialTexture Adversarial Texture Optimization from RGB-D Scans (CVPR 2020). Scanning Data Download Please refer to data directory for details. B

Jingwei Huang 153 Nov 28, 2022
Security evaluation module with onnx, pytorch, and SecML.

🚀 🐼 🔥 PandaVision Integrate and automate security evaluations with onnx, pytorch, and SecML! Installation Starting the server without Docker If you

Maura Pintor 11 Apr 12, 2022
This is the official pytorch implementation of AutoDebias, an automatic debiasing method for recommendation.

AutoDebias This is the official pytorch implementation of AutoDebias, a debiasing method for recommendation system. AutoDebias is proposed in the pape

Dong Hande 77 Nov 25, 2022
Extremely simple and fast extreme multi-class and multi-label classifiers.

napkinXC napkinXC is an extremely simple and fast library for extreme multi-class and multi-label classification, that focus of implementing various m

Marek Wydmuch 43 Nov 14, 2022
CityLearn Challenge Multi-Agent Reinforcement Learning for Intelligent Energy Management, 2020, PikaPika team

Citylearn Challenge This is the PyTorch implementation for PikaPika team, CityLearn Challenge Multi-Agent Reinforcement Learning for Intelligent Energ

bigAIdream projects 10 Oct 10, 2022
Source code for the BMVC-2021 paper "SimReg: Regression as a Simple Yet Effective Tool for Self-supervised Knowledge Distillation".

SimReg: A Simple Regression Based Framework for Self-supervised Knowledge Distillation Source code for the paper "SimReg: Regression as a Simple Yet E

9 Oct 15, 2022
A fast model to compute optical flow between two input images.

DCVNet: Dilated Cost Volumes for Fast Optical Flow This repository contains our implementation of the paper: @InProceedings{jiang2021dcvnet, title={

Huaizu Jiang 8 Sep 27, 2021
Rethinking Portrait Matting with Privacy Preserving

Rethinking Portrait Matting with Privacy Preserving This is the official repository of the paper Rethinking Portrait Matting with Privacy Preserving.

184 Jan 03, 2023
Data-Uncertainty Guided Multi-Phase Learning for Semi-supervised Object Detection

An official implementation of paper Data-Uncertainty Guided Multi-Phase Learning for Semi-supervised Object Detection

11 Nov 23, 2022
FlexConv: Continuous Kernel Convolutions with Differentiable Kernel Sizes

FlexConv: Continuous Kernel Convolutions with Differentiable Kernel Sizes This repository contains the source code accompanying the paper: FlexConv: C

Robert-Jan Bruintjes 96 Dec 12, 2022
Code for our CVPR2021 paper coordinate attention

Coordinate Attention for Efficient Mobile Network Design (preprint) This repository is a PyTorch implementation of our coordinate attention (will appe

Qibin (Andrew) Hou 726 Jan 05, 2023
Open-source codebase for EfficientZero, from "Mastering Atari Games with Limited Data" at NeurIPS 2021.

EfficientZero (NeurIPS 2021) Open-source codebase for EfficientZero, from "Mastering Atari Games with Limited Data" at NeurIPS 2021. Environments Effi

Weirui Ye 671 Jan 03, 2023
Supplementary code for SIGGRAPH 2021 paper: Discovering Diverse Athletic Jumping Strategies

SIGGRAPH 2021: Discovering Diverse Athletic Jumping Strategies project page paper demo video Prerequisites Important Notes We suspect there are bugs i

54 Dec 06, 2022
An LSTM based GAN for Human motion synthesis

GAN-motion-Prediction An LSTM based GAN for motion synthesis has a few issues reading H3.6M data from A.Jain et al , will fix soon. Prediction of the

Amogh Adishesha 9 Jun 17, 2022
Implementation of average- and worst-case robust flatness measures for adversarial training.

Relating Adversarially Robust Generalization to Flat Minima This repository contains code corresponding to the MLSys'21 paper: D. Stutz, M. Hein, B. S

David Stutz 13 Nov 27, 2022