Implementation of SwiftNet: Real-time Video Object Segmentation

Last update: Dec 14, 2022

SwiftNet

The official PyTorch implementation of SwiftNet: Real-time Video Object Segmentation, which was accepted at CVPR 2021.

Requirements

Python >= 3.6
PyTorch 1.5
Numpy
Pillow
opencv-python
scipy
tqdm

Training

The training pipeline of SwiftNet is similar to that of STM, which can be found in our reproduced STM training code.

Inference

Usage

python eval.py -g 0 -y 17 -s val -D 'path to davis'

Performance

Performance on the DAVIS-17 validation set.

backbone	J&F	J	F	FPS	weights
resnet-18	77.6	75.5	79.7	65	`link`
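
For readers unfamiliar with the columns: in the DAVIS benchmark, J is the region similarity (intersection-over-union between predicted and ground-truth masks), F is the boundary accuracy, and J&F is their mean. The sketch below is illustrative only and is not the evaluation code in this repository; the function names `jaccard` and `j_and_f` are hypothetical, and treating two empty masks as a perfect match is an assumption.

```python
import numpy as np


def jaccard(pred, gt):
    """Region similarity J: intersection-over-union of two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        # Assumption: two empty masks count as perfect agreement.
        return 1.0
    return float(np.logical_and(pred, gt).sum() / union)


def j_and_f(j, f):
    """J&F is the plain average of the J and F scores."""
    return (j + f) / 2.0


# Toy 2x3 masks: 2 pixels overlap, 4 pixels are covered by either mask.
pred = np.array([[1, 1, 0], [0, 1, 0]])
gt = np.array([[1, 0, 0], [0, 1, 1]])
toy_j = jaccard(pred, gt)

# Averaging the J and F values reported in the table.
table_jf = round(j_and_f(75.5, 79.7), 1)
```

Averaging the reported J (75.5) and F (79.7) this way gives the J&F value listed in the table.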

Note: FPS is measured on a single P100 GPU and excludes the time spent on image loading and evaluation.
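
A minimal sketch of how an FPS figure of this kind can be measured, assuming frames are loaded before the timer starts so that only the per-frame processing is counted. This is not the timing code used for the table: `measure_fps` and `dummy_step` are hypothetical names, and `dummy_step` stands in for the model's forward pass.

```python
import time


def measure_fps(step, frames, warmup=2):
    """Time `step` over preloaded frames and return frames per second."""
    # Warm-up calls are excluded so one-off setup cost does not skew the result.
    for frame in frames[:warmup]:
        step(frame)
    start = time.perf_counter()
    for frame in frames:
        step(frame)
    elapsed = time.perf_counter() - start
    return len(frames) / elapsed


def dummy_step(frame):
    # Hypothetical stand-in for running the network on one frame.
    return sum(frame)


# Frames are built up front, mirroring "image loading is not included".
frames = [list(range(1000)) for _ in range(50)]
fps = measure_fps(dummy_step, frames)
```

When timing a real PyTorch model on a GPU, CUDA kernels run asynchronously, so `torch.cuda.synchronize()` should be called before each clock reading; otherwise the measured time can understate the actual cost.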

Acknowledgement

This repository is partially based on the official STM repository.

Citation

If you find this repository helpful and want to cite SwiftNet in your own projects, please use the following citation info.

@inproceedings{wang2021swiftnet,
  title={SwiftNet: Real-time Video Object Segmentation},
  author={Wang, Haochen and Jiang, Xiaolong and Ren, Haibing and Hu, Yao and Bai, Song},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={1296--1305},
  year={2021}
}

Owner

haochen wang
