Learning Dynamic Network Using a Reuse Gate Function in Semi-supervised Video Object Segmentation.

Overview

Training Script for Reuse-VOS

This code implementation of CVPR 2021 paper : Learning Dynamic Network Using a Reuse Gate Function in Semi-supervised Video Object Segmentation.

Hard case (Ours, FRTM)

sample ours hard (Ours)

sample FRTM hard (FRTM)

Easy case (Ours, FRTM)

sample ours easy(Ours)

sample FRTM easy(FRTM)

Requirement

python package

  • torch
  • python-opencv
  • skimage
  • easydict

GPU support

  • GPU Memory >= 11GB (RN18)
  • CUDA >= 10.0
  • pytorch >= 1.4.0

Datasets

DAVIS

To test the DAVIS validation split, download and unzip the 2017 480p trainval images and annotations here.

/path/DAVIS
|-- Annotations/
|-- ImageSets/
|-- JPEGImages/

YouTubeVOS

To test our validation split and the YouTubeVOS challenge 'valid' split, download YouTubeVOS 2018 and place it in this directory structure:

/path/ytvos2018
|-- train/
|-- train_all_frames/
|-- valid/
`-- valid_all_frames/

Release

DAVIS

model Backbone Training set J & F 17 J & F 16 link
G-FRTM (t=1) Resnet18 Youtube-VOS + DAVIS 71.7 80.9 Google Drive
G-FRTM (t=0.7) Resnet18 Youtube-VOS + DAVIS 69.9 80.5 same pth
G-FRTM (t=1) Resnet101 Youtube-VOS + DAVIS 76.4 84.3 Google Drive
G-FRTM (t=0.7) Resnet101 Youtube-VOS + DAVIS 74.3 82.3 same pth

Youtube-VOS

model Backbone Training set G J-S J-Us F-S F-Us link
G-FRTM (t=1) Resnet18 Youtube-VOS 63.8 68.3 55.2 70.6 61.0 Google Drive
G-FRTM (t=0.8) Resnet18 Youtube-VOS 63.4 67.6 55.8 69.3 60.9 same pth
G-FRTM (t=0.7) Resnet18 Youtube-VOS 62.7 67.1 55.2 68.2 60.1 same pth

We initialize orignal-FRTM layers from official FRTM repository weight for Youtube-VOS benchmark. S = Seen, Us = Unseen

Target model cache

Here is the cache file we used for ResNet18 file

Run

Train

Open train.py and adjust the paths dict to your dataset locations, checkpoint and tensorboard output directories and the place to cache target model weights.

To train a network, run following command.

python train.py --name <session-name> --ftext resnet18 --dset all --dev cuda:0

--name is the name of save_dir name of current train --ftext is the name of the feature extractor, either resnet18 or resnet101. --dset is one of dv2017, ytvos2018 or all ("all" really means "both"). --dev is the name of the device to train on. --m1 is the margin1 for training reuse gate, and we use 1.0 for DAVIS benchmark and 0.5 for Youtube-VOS benchmark. --m2 is the margin2 for training reuse gate, and we use 0.

Replace "session-name" with whatever you like. Subdirectories with this name will be created under your checkpoint and tensorboard paths.

Eval

Open eval.py and adjust the paths dict to your dataset locations, checkpoint and tensorboard output directories and the place to cache target model weights.

To train a network, run following command.

python evaluate.py --ftext resnet18 --dset dv2017val --dev cuda:0

--ftext is the name of the feature extractor, either resnet18 or resnet101. --dset is one of dv2016val, dv2017val, yt2018jjval, yt2018val or yt2018valAll --dev is the name of the device to eval on. --TH Threshold for tau default= 0.7

The inference results will be saved at ${ROOT}/${result} . It is better to check multiple pth file for good accuracy.

Acknowledgement

This codebase borrows the code and structure from official FRTM repository. We are grateful to Facebook Inc. with valuable discussions.

Reference

The codebase is built based on following works

@misc{park2020learning,
      title={Learning Dynamic Network Using a Reuse Gate Function in Semi-supervised Video Object Segmentation}, 
      author={Hyojin Park and Jayeon Yoo and Seohyeong Jeong and Ganesh Venkatesh and Nojun Kwak},
      year={2020},
      eprint={2012.11655},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
Owner
HYOJINPARK
HYOJINPARK
Implementation of Auto-Conditioned Recurrent Networks for Extended Complex Human Motion Synthesis

acLSTM_motion This folder contains an implementation of acRNN for the CMU motion database written in Pytorch. See the following links for more backgro

Yi_Zhou 61 Sep 07, 2022
Adversarial-autoencoders - Tensorflow implementation of Adversarial Autoencoders

Adversarial Autoencoders (AAE) Tensorflow implementation of Adversarial Autoencoders (ICLR 2016) Similar to variational autoencoder (VAE), AAE imposes

Qian Ge 236 Nov 13, 2022
PySlowFast: video understanding codebase from FAIR for reproducing state-of-the-art video models.

PySlowFast PySlowFast is an open source video understanding codebase from FAIR that provides state-of-the-art video classification models with efficie

Meta Research 5.3k Jan 03, 2023
Official implementation for (Show, Attend and Distill: Knowledge Distillation via Attention-based Feature Matching, AAAI-2021)

Show, Attend and Distill: Knowledge Distillation via Attention-based Feature Matching Official pytorch implementation of "Show, Attend and Distill: Kn

Clova AI Research 80 Dec 16, 2022
A tensorflow implementation of Fully Convolutional Networks For Semantic Segmentation

##A tensorflow implementation of Fully Convolutional Networks For Semantic Segmentation. #USAGE To run the trained classifier on some images: python w

Alex Seewald 13 Nov 17, 2022
Implicit Graph Neural Networks

Implicit Graph Neural Networks This repository is the official PyTorch implementation of "Implicit Graph Neural Networks". Fangda Gu*, Heng Chang*, We

Heng Chang 48 Nov 29, 2022
Toolchain to build Yoshi's Island from source code

Project-Y Toolchain to build Yoshi's Island (J) V1.0 from source code, by MrL314 Last updated: September 17, 2021 Setup To begin, download this toolch

MrL314 19 Apr 18, 2022
Lung Pattern Classification for Interstitial Lung Diseases Using a Deep Convolutional Neural Network

ild-cnn This is supplementary material for the manuscript: "Lung Pattern Classification for Interstitial Lung Diseases Using a Deep Convolutional Neur

22 Nov 05, 2022
Official implement of Evo-ViT: Slow-Fast Token Evolution for Dynamic Vision Transformer

Evo-ViT: Slow-Fast Token Evolution for Dynamic Vision Transformer This repository contains the PyTorch code for Evo-ViT. This work proposes a slow-fas

YifanXu 53 Dec 05, 2022
TensorFlow implementation of ENet, trained on the Cityscapes dataset.

segmentation TensorFlow implementation of ENet (https://arxiv.org/pdf/1606.02147.pdf) based on the official Torch implementation (https://github.com/e

Fredrik Gustafsson 248 Dec 16, 2022
[CVPR2021] UAV-Human: A Large Benchmark for Human Behavior Understanding with Unmanned Aerial Vehicles

UAV-Human Official repository for CVPR2021: UAV-Human: A Large Benchmark for Human Behavior Understanding with Unmanned Aerial Vehicle Paper arXiv Res

129 Jan 04, 2023
Few-Shot-Intent-Detection includes popular challenging intent detection datasets with/without OOS queries and state-of-the-art baselines and results.

Few-Shot-Intent-Detection Few-Shot-Intent-Detection is a repository designed for few-shot intent detection with/without Out-of-Scope (OOS) intents. It

Jian-Guo Zhang 73 Dec 26, 2022
Towards Boosting the Accuracy of Non-Latin Scene Text Recognition

Convolutional Recurrent Neural Network + CTCLoss | STAR-Net Code for paper "Towards Boosting the Accuracy of Non-Latin Scene Text Recognition" Depende

Sanjana Gunna 7 Aug 07, 2022
Efficiently computes derivatives of numpy code.

Note: Autograd is still being maintained but is no longer actively developed. The main developers (Dougal Maclaurin, David Duvenaud, Matt Johnson, and

Formerly: Harvard Intelligent Probabilistic Systems Group -- Now at Princeton 6.1k Jan 08, 2023
BaseCls BaseCls 是一个基于 MegEngine 的预训练模型库,帮助大家挑选或训练出更适合自己科研或者业务的模型结构

BaseCls BaseCls 是一个基于 MegEngine 的预训练模型库,帮助大家挑选或训练出更适合自己科研或者业务的模型结构。 文档地址:https://basecls.readthedocs.io 安装 安装环境 BaseCls 需要 Python = 3.6。 BaseCls 依赖 M

MEGVII Research 28 Dec 23, 2022
Gray Zone Assessment

Gray Zone Assessment Get started Clone github repository git clone https://github.com/andreanne-lemay/gray_zone_assessment.git Build docker image dock

1 Jan 08, 2022
This repository contains the code for "SBEVNet: End-to-End Deep Stereo Layout Estimation" paper by Divam Gupta, Wei Pu, Trenton Tabor, Jeff Schneider

SBEVNet: End-to-End Deep Stereo Layout Estimation This repository contains the code for "SBEVNet: End-to-End Deep Stereo Layout Estimation" paper by D

Divam Gupta 19 Dec 17, 2022
Official PyTorch implementation of "Uncertainty-Based Offline Reinforcement Learning with Diversified Q-Ensemble" (NeurIPS'21)

Uncertainty-Based Offline Reinforcement Learning with Diversified Q-Ensemble This is the code for reproducing the results of the paper Uncertainty-Bas

43 Nov 23, 2022
Pytorch and Keras Implementations of Hyperspectral Image Classification -- Traditional to Deep Models: A Survey for Future Prospects.

The repository contains the implementations for Hyperspectral Image Classification -- Traditional to Deep Models: A Survey for Future Prospects. Model

Ankur Deria 115 Jan 06, 2023
MG-GCN: Scalable Multi-GPU GCN Training Framework

MG-GCN MG-GCN: multi-GPU GCN training framework. For more information, please read our paper. After cloning our repository, run git submodule update -

Translational Data Analytics (TDA) Lab @GaTech 6 Oct 24, 2022