Official PyTorch Implementation of Hypercorrelation Squeeze for Few-Shot Segmentation, arXiv 2021

Related tags

Deep Learninghsnet
Overview

PWC PWC PWC PWC PWC PWC PWC PWC

Hypercorrelation Squeeze for Few-Shot Segmentation

This is the implementation of the paper "Hypercorrelation Squeeze for Few-Shot Segmentation" by Juhong Min, Dahyun Kang, and Minsu Cho. Implemented on Python 3.7 and Pytorch 1.5.1.

For more information, check out project [website] and the paper on [arXiv].

Requirements

  • Python 3.7
  • PyTorch 1.5.1
  • cuda 10.1
  • tensorboard 1.14

Conda environment settings:

conda create -n hsnet python=3.7
conda activate hsnet

conda install pytorch=1.5.1 torchvision cudatoolkit=10.1 -c pytorch
conda install -c conda-forge tensorflow
pip install tensorboardX

Preparing Few-Shot Segmentation Datasets

Download following datasets:

1. PASCAL-5i

Download PASCAL VOC2012 devkit (train/val data):

wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar

Download PASCAL VOC2012 SDS extended mask annotations from our [Google Drive].

2. COCO-20i

Download COCO2014 train/val images and annotations:

wget http://images.cocodataset.org/zips/train2014.zip
wget http://images.cocodataset.org/zips/val2014.zip
wget http://images.cocodataset.org/annotations/annotations_trainval2014.zip

Download COCO2014 train/val annotations from our Google Drive: [train2014.zip], [val2014.zip]. (and locate both train2014/ and val2014/ under annotations/ directory).

3. FSS-1000

Download FSS-1000 images and annotations from our [Google Drive].

Create a directory '../Datasets_HSN' for the above three few-shot segmentation datasets and appropriately place each dataset to have following directory structure:

../                         # parent directory
├── ./                      # current (project) directory
│   ├── common/             # (dir.) helper functions
│   ├── data/               # (dir.) dataloaders and splits for each FSSS dataset
│   ├── model/              # (dir.) implementation of Hypercorrelation Squeeze Network model 
│   ├── README.md           # intstruction for reproduction
│   ├── train.py            # code for training HSNet
│   └── test.py             # code for testing HSNet
└── Datasets_HSN/
    ├── VOC2012/            # PASCAL VOC2012 devkit
    │   ├── Annotations/
    │   ├── ImageSets/
    │   ├── ...
    │   └── SegmentationClassAug/
    ├── COCO2014/           
    │   ├── annotations/
    │   │   ├── train2014/  # (dir.) training masks (from Google Drive) 
    │   │   ├── val2014/    # (dir.) validation masks (from Google Drive)
    │   │   └── ..some json files..
    │   ├── train2014/
    │   └── val2014/
    └── FSS-1000/           # (dir.) contains 1000 object classes
        ├── abacus/   
        ├── ...
        └── zucchini/

Training

1. PASCAL-5i

python train.py --backbone {vgg16, resnet50, resnet101} 
                --fold {0, 1, 2, 3} 
                --benchmark pascal
                --lr 1e-3
                --bsz 20
                --load "path_to_trained_model/best_model.pt"
                --logpath "your_experiment_name"
  • Training takes approx. 2 days until convergence (trained with four 2080 Ti GPUs).

2. COCO-20i

python train.py --backbone {resnet50, resnet101} 
                --fold {0, 1, 2, 3} 
                --benchmark coco 
                --lr 1e-3
                --bsz 40
                --load "path_to_trained_model/best_model.pt"
                --logpath "your_experiment_name"
  • Training takes approx. 1 week until convergence (trained four Titan RTX GPUs).

3. FSS-1000

python train.py --backbone {vgg16, resnet50, resnet101} 
                --benchmark fss 
                --lr 1e-3
                --bsz 20
                --load "path_to_trained_model/best_model.pt"
                --logpath "your_experiment_name"
  • Training takes approx. 3 days until convergence (trained with four 2080 Ti GPUs).

Babysitting training:

Use tensorboard to babysit training progress:

  • For each experiment, a directory that logs training progress will be automatically generated under logs/ directory.
  • From terminal, run 'tensorboard --logdir logs/' to monitor the training progress.
  • Choose the best model when the validation (mIoU) curve starts to saturate.

Testing

1. PASCAL-5i

Pretrained models with tensorboard logs are available on our [Google Drive].

python test.py --backbone {vgg16, resnet50, resnet101} 
               --fold {0, 1, 2, 3} 
               --benchmark pascal
               --nshot {1, 5} 
               --load "path_to_trained_model/best_model.pt"

2. COCO-20i

Pretrained models with tensorboard logs are available on our [Google Drive].

python test.py --backbone {resnet50, resnet101} 
               --fold {0, 1, 2, 3} 
               --benchmark coco 
               --nshot {1, 5} 
               --load "path_to_trained_model/best_model.pt"

3. FSS-1000

Pretrained models with tensorboard logs are available on our [Google Drive].

python test.py --backbone {vgg16, resnet50, resnet101} 
               --benchmark fss 
               --nshot {1, 5} 
               --load "path_to_trained_model/best_model.pt"

4. Evaluation without support feature masking on PASCAL-5i

  • To reproduce the results in Tab.1 of our main paper, COMMENT OUT line 51 in hsnet.py: support_feats = self.mask_feature(support_feats, support_mask.clone())

Pretrained models with tensorboard logs are available on our [Google Drive].

python test.py --backbone resnet101 
               --fold {0, 1, 2, 3} 
               --benchmark pascal
               --nshot {1, 5} 
               --load "path_to_trained_model/best_model.pt"

Visualization

  • To visualize mask predictions, add command line argument --visualize: (prediction results will be saved under vis/ directory)
  python test.py '...other arguments...' --visualize  

Example qualitative results (1-shot):

BibTeX

If you use this code for your research, please consider citing:

@article{min2021hypercorrelation, 
   title={Hypercorrelation Squeeze for Few-Shot Segmentation},
   author={Juhong Min and Dahyun Kang and Minsu Cho},
   journal={arXiv preprint arXiv:2104.01538},
   year={2021}
}
Owner
Juhong Min
research interest in computer vision
Juhong Min
Scripts for training an AI to play the endless runner Subway Surfers using a supervised machine learning approach by imitation and a convolutional neural network (CNN) for image classification

About subwAI subwAI - a project for training an AI to play the endless runner Subway Surfers using a supervised machine learning approach by imitation

82 Jan 01, 2023
Evaluating AlexNet features at various depths

Linear Separability Evaluation This repo provides the scripts to test a learned AlexNet's feature representation performance at the five different con

Yuki M. Asano 32 Dec 30, 2022
Context-Aware Image Matting for Simultaneous Foreground and Alpha Estimation

Context-Aware Image Matting for Simultaneous Foreground and Alpha Estimation This is the inference codes of Context-Aware Image Matting for Simultaneo

Qiqi Hou 125 Oct 22, 2022
A "gym" style toolkit for building lightweight Neural Architecture Search systems

A "gym" style toolkit for building lightweight Neural Architecture Search systems

Jack Turner 12 Nov 05, 2022
Fbone (Flask bone) is a Flask (Python microframework) starter/template/bootstrap/boilerplate application.

Fbone (Flask bone) is a Flask (Python microframework) starter/template/bootstrap/boilerplate application.

Wilson 1.7k Dec 30, 2022
Simultaneous Detection and Segmentation

Simultaneous Detection and Segmentation This is code for the ECCV Paper: Simultaneous Detection and Segmentation Bharath Hariharan, Pablo Arbelaez,

Bharath Hariharan 96 Jul 20, 2022
Simple torch.nn.module implementation of Alias-Free-GAN style filter and resample

Alias-Free-Torch Simple torch module implementation of Alias-Free GAN. This repository including Alias-Free GAN style lowpass sinc filter @filter.py A

이준혁(Junhyeok Lee) 64 Dec 22, 2022
PyTorch Implementation of CvT: Introducing Convolutions to Vision Transformers

CvT: Introducing Convolutions to Vision Transformers Pytorch implementation of CvT: Introducing Convolutions to Vision Transformers Usage: img = torch

Rishikesh (ऋषिकेश) 193 Jan 03, 2023
Bayesian algorithm execution (BAX)

Bayesian Algorithm Execution (BAX) Code for the paper: Bayesian Algorithm Execution: Estimating Computable Properties of Black-box Functions Using Mut

Willie Neiswanger 38 Dec 08, 2022
[PNAS2021] The neural architecture of language: Integrative modeling converges on predictive processing

The neural architecture of language: Integrative modeling converges on predictive processing Code accompanying the paper The neural architecture of la

Martin Schrimpf 36 Dec 01, 2022
Deep Learning and Reinforcement Learning Library for Scientists and Engineers 🔥

TensorLayer is a novel TensorFlow-based deep learning and reinforcement learning library designed for researchers and engineers. It provides an extens

TensorLayer Community 7.1k Dec 29, 2022
CSKG is a commonsense knowledge graph that combines seven popular sources into a consolidated representation

CSKG: The CommonSense Knowledge Graph CSKG is a commonsense knowledge graph that combines seven popular sources into a consolidated representation: AT

USC ISI I2 85 Dec 12, 2022
BARF: Bundle-Adjusting Neural Radiance Fields 🤮 (ICCV 2021 oral)

BARF 🤮 : Bundle-Adjusting Neural Radiance Fields Chen-Hsuan Lin, Wei-Chiu Ma, Antonio Torralba, and Simon Lucey IEEE International Conference on Comp

Chen-Hsuan Lin 539 Dec 28, 2022
ICRA 2021 "Towards Precise and Efficient Image Guided Depth Completion"

PENet: Precise and Efficient Depth Completion This repo is the PyTorch implementation of our paper to appear in ICRA2021 on "Towards Precise and Effic

232 Dec 25, 2022
Code for the Image similarity challenge.

ISC 2021 This repository contains code for the Image Similarity Challenge 2021. Getting started The docs subdirectory has step-by-step instructions on

Facebook Research 173 Dec 12, 2022
Learning Generative Models of Textured 3D Meshes from Real-World Images, ICCV 2021

Learning Generative Models of Textured 3D Meshes from Real-World Images This is the reference implementation of "Learning Generative Models of Texture

Dario Pavllo 115 Jan 07, 2023
EEGEyeNet is benchmark to evaluate ET prediction based on EEG measurements with an increasing level of difficulty

Introduction EEGEyeNet EEGEyeNet is a benchmark to evaluate ET prediction based on EEG measurements with an increasing level of difficulty. Overview T

Ard Kastrati 23 Dec 22, 2022
A denoising diffusion probabilistic model (DDPM) tailored for conditional generation of protein distograms

Denoising Diffusion Probabilistic Model for Proteins Implementation of Denoising Diffusion Probabilistic Model in Pytorch. It is a new approach to gen

Phil Wang 108 Nov 23, 2022
Unsupervised CNN for Single View Depth Estimation: Geometry to the Rescue

Realtime Unsupervised Depth Estimation from an Image This is the caffe implementation of our paper "Unsupervised CNN for single view depth estimation:

Ravi Garg 227 Nov 28, 2022
The offcial repository for 'CharacterBERT and Self-Teaching for Improving the Robustness of Dense Retrievers on Queries with Typos', SIGIR2022

CharacterBERT-DR The offcial repository for CharacterBERT and Self-Teaching for Improving the Robustness of Dense Retrievers on Queries with Typos, Sh

ielab 11 Nov 15, 2022