A simple consistency training framework for semi-supervised image semantic segmentation

Overview

PseudoSeg: Designing Pseudo Labels for Semantic Segmentation

PseudoSeg is a simple consistency training framework for semi-supervised image semantic segmentation, which has a simple and novel re-design of pseudo-labeling to generate well-calibrated structured pseudo labels for training with unlabeled or weakly-labeled data. It is implemented by Yuliang Zou (research intern) in 2020 Summer.

This is not an official Google product.

Instruction

Installation

  • Use a virtual environment
virtualenv -p python3 --system-site-packages env
source env/bin/activate
  • Install packages
pip install -r requirements.txt

Dataset

Create a dataset folder under the ROOT directory, then download the pre-created tfrecords for voc12 and coco, and extract them in dataset folder. You may also want to check the filenames for each split under data_splits folder.

Training

NOTE:

  • We train all our models using 16 V100 GPUs.
  • The ImageNet pre-trained models can be download here.
  • For VOC12, ${SPLIT} can be 2_clean, 4_clean, 8_clean, 16_clean_3 (representing 1/2, 1/4, 1/8, and 1/16 splits), NUM_ITERATIONS should be set to 30000.
  • For COCO, ${SPLIT} can be 32_all, 64_all, 128_all, 256_all, 512_all (representing 1/32, 1/64, 1/128, 1/256, and 1/512 splits), NUM_ITERATIONS should be set to 200000.

Supervised baseline

python train_sup.py \
  --logtostderr \
  --train_split="${SPLIT}" \
  --model_variant="xception_65" \
  --atrous_rates=6 \
  --atrous_rates=12 \
  --atrous_rates=18 \
  --output_stride=16 \
  --decoder_output_stride=4 \
  --train_crop_size="513,513" \
  --num_clones=16 \
  --train_batch_size=64 \
  --training_number_of_steps="${NUM_ITERATIONS}" \
  --fine_tune_batch_norm=true \
  --tf_initial_checkpoint="${INIT_FOLDER}/xception_65/model.ckpt" \
  --train_logdir="${TRAIN_LOGDIR}" \
  --dataset_dir="${DATASET}"

PseudoSeg (w/ unlabeled data)

python train_wss.py \
  --logtostderr \
  --train_split="${SPLIT}" \
  --train_split_cls="train_aug" \
  --model_variant="xception_65" \
  --atrous_rates=6 \
  --atrous_rates=12 \
  --atrous_rates=18 \
  --output_stride=16 \
  --decoder_output_stride=4 \
  --train_crop_size="513,513" \
  --num_clones=16 \
  --train_batch_size=64 \
  --training_number_of_steps="${NUM_ITERATIONS}" \
  --fine_tune_batch_norm=true \
  --tf_initial_checkpoint="${INIT_FOLDER}/xception_65/model.ckpt" \
  --train_logdir="${TRAIN_LOGDIR}" \
  --dataset_dir="${DATASET}"

PseudoSeg (w/ image-level labeled data)

python train_wss.py \
  --logtostderr \
  --train_split="${SPLIT}" \
  --train_split_cls="train_aug" \
  --model_variant="xception_65" \
  --atrous_rates=6 \
  --atrous_rates=12 \
  --atrous_rates=18 \
  --output_stride=16 \
  --decoder_output_stride=4 \
  --train_crop_size="513,513" \
  --num_clones=16 \
  --train_batch_size=64 \
  --training_number_of_steps="${NUM_ITERATIONS}" \
  --fine_tune_batch_norm=true \
  --tf_initial_checkpoint="${INIT_FOLDER}/xception_65/model.ckpt" \
  --train_logdir="${TRAIN_LOGDIR}" \
  --dataset_dir="${DATASET}" \
  --weakly=true

Evaluation

NOTE: ${EVAL_CROP_SIZE} should be 513,513 for VOC12, 641,641 for COCO.

python eval.py \
  --logtostderr \
  --eval_split="val" \
  --model_variant="xception_65" \
  --atrous_rates=6 \
  --atrous_rates=12 \
  --atrous_rates=18 \
  --output_stride=16 \
  --decoder_output_stride=4 \
  --eval_crop_size="${EVAL_CROP_SIZE}" \
  --checkpoint_dir="${TRAIN_LOGDIR}" \
  --eval_logdir="${EVAL_LOGDIR}" \
  --dataset_dir="${DATASET}" \
  --max_number_of_evaluations=1

Visualization

NOTE: ${VIS_CROP_SIZE} should be 513,513 for VOC12, 641,641 for COCO.

python vis.py \
  --logtostderr \
  --vis_split="val" \
  --model_variant="xception_65" \
  --atrous_rates=6 \
  --atrous_rates=12 \
  --atrous_rates=18 \
  --output_stride=16 \
  --decoder_output_stride=4 \
  --vis_crop_size="${VIS_CROP_SIZE}" \
  --checkpoint_dir="${CKPT}" \
  --vis_logdir="${VIS_LOGDIR}" \
  --dataset_dir="${PASCAL_DATASET}" \
  --also_save_raw_predictions=true

Citation

If you use this work for your research, please cite our paper.

@article{zou2020pseudoseg,
  title={PseudoSeg: Designing Pseudo Labels for Semantic Segmentation},
  author={Zou, Yuliang and Zhang, Zizhao and Zhang, Han and Li, Chun-Liang and Bian, Xiao and Huang, Jia-Bin and Pfister, Tomas},
  journal={International Conference on Learning Representations (ICLR)},
  year={2021}
}
Owner
Google Interns
Google Interns
BabelCalib: A Universal Approach to Calibrating Central Cameras. In ICCV (2021)

BabelCalib: A Universal Approach to Calibrating Central Cameras This repository contains the MATLAB implementation of the BabelCalib calibration frame

Yaroslava Lochman 55 Dec 30, 2022
City-seeds - A random generator of cultural characteristics intended to spark ideas and help draw threads

City Seeds This is a random generator of cultural characteristics intended to sp

Aydin O'Leary 2 Mar 12, 2022
StyleGAN - Official TensorFlow Implementation

StyleGAN — Official TensorFlow Implementation Picture: These people are not real – they were produced by our generator that allows control over differ

NVIDIA Research Projects 13.1k Jan 09, 2023
Codebase to experiment with a hybrid Transformer that combines conditional sequence generation with regression

Regression Transformer Codebase to experiment with a hybrid Transformer that combines conditional sequence generation with regression . Development se

International Business Machines 27 Jan 05, 2023
[CVPR2021] Invertible Image Signal Processing

Invertible Image Signal Processing This repository includes official codes for "Invertible Image Signal Processing (CVPR2021)". Figure: Our framework

Yazhou XING 281 Dec 31, 2022
Deep learning toolbox based on PyTorch for hyperspectral data classification.

Deep learning toolbox based on PyTorch for hyperspectral data classification.

Nicolas 304 Dec 28, 2022
Software that can generate photos from paintings, turn horses into zebras, perform style transfer, and more.

CycleGAN PyTorch | project page | paper Torch implementation for learning an image-to-image translation (i.e. pix2pix) without input-output pairs, for

Jun-Yan Zhu 11.5k Dec 30, 2022
A PaddlePaddle implementation of STGCN with a few modifications in the model architecture in order to forecast traffic jam.

About This repository contains the code of a PaddlePaddle implementation of STGCN based on the paper Spatio-Temporal Graph Convolutional Networks: A D

Tianjian Li 1 Jan 11, 2022
Deepparse is a state-of-the-art library for parsing multinational street addresses using deep learning

Here is deepparse. Deepparse is a state-of-the-art library for parsing multinational street addresses using deep learning. Use deepparse to Use the pr

GRAAL/GRAIL 192 Dec 20, 2022
[NeurIPS 2021] Towards Better Understanding of Training Certifiably Robust Models against Adversarial Examples | ⛰️⚠️

Towards Better Understanding of Training Certifiably Robust Models against Adversarial Examples This repository is the official implementation of "Tow

Sungyoon Lee 4 Jul 12, 2022
Sharing of contents on mitochondrial encounter networks

mito-network-sharing Sharing of contents on mitochondrial encounter networks Required: R with igraph, brainGraph, ggplot2, and XML libraries; igraph l

Stochastic Biology Group 0 Oct 01, 2021
Large-Scale Unsupervised Object Discovery

Large-Scale Unsupervised Object Discovery Huy V. Vo, Elena Sizikova, Cordelia Schmid, Patrick Pérez, Jean Ponce [PDF] We propose a novel ranking-based

17 Sep 19, 2022
iPOKE: Poking a Still Image for Controlled Stochastic Video Synthesis

iPOKE: Poking a Still Image for Controlled Stochastic Video Synthesis iPOKE: Poking a Still Image for Controlled Stochastic Video Synthesis Andreas Bl

CompVis Heidelberg 36 Dec 25, 2022
This repository contains code from the paper "TTS-GAN: A Transformer-based Time-Series Generative Adversarial Network"

TTS-GAN: A Transformer-based Time-Series Generative Adversarial Network This repository contains code from the paper "TTS-GAN: A Transformer-based Tim

Intelligent Multimodal Computing and Sensing Laboratory (IMICS Lab) - Texas State University 108 Dec 29, 2022
Export CenterPoint PonintPillars ONNX Model For TensorRT

CenterPoint-PonintPillars Pytroch model convert to ONNX and TensorRT Welcome to CenterPoint! This project is fork from tianweiy/CenterPoint. I impleme

CarkusL 149 Dec 13, 2022
Amazon Forest Computer Vision: Satellite Image tagging code using PyTorch / Keras with lots of PyTorch tricks

Amazon Forest Computer Vision Satellite Image tagging code using PyTorch / Keras Here is a sample of images we had to work with Source: https://www.ka

Mamy Ratsimbazafy 359 Jan 05, 2023
YOLOv7 - Framework Beyond Detection

🔥🔥🔥🔥 YOLO with Transformers and Instance Segmentation, with TensorRT acceleration! 🔥🔥🔥

JinTian 3k Jan 01, 2023
Evaluating deep transfer learning for whole-brain cognitive decoding

Evaluating deep transfer learning for whole-brain cognitive decoding This README file contains the following sections: Project description Repository

Armin Thomas 5 Oct 31, 2022
The official re-implementation of the Neurips 2021 paper, "Targeted Neural Dynamical Modeling".

Targeted Neural Dynamical Modeling Note: This is a re-implementation (in Tensorflow2) of the original TNDM model. We do not plan to further update the

6 Oct 05, 2022
Vision Deep-Learning using Tensorflow, Keras.

Welcome! I am a computer vision deep learning developer working in Korea. This is my blog, and you can see everything I've studied here. https://www.n

kimminjun 6 Dec 14, 2022