Official implementation of the ICCV 2021 paper "Joint Inductive and Transductive Learning for Video Object Segmentation"

Related tags

Deep LearningJOINT
Overview

JOINT

This is the official implementation of Joint Inductive and Transductive learning for Video Object Segmentation, to appear in ICCV 2021.

@inproceedings{joint_iccv_2021,
  title={Joint Inductive and Transductive Learning for Video Object Segmentation},
  author={Yunyao Mao, Ning Wang, Wengang Zhou, Houqiang Li},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
  month = {October},
  year={2021}
}

JOINT overview figure

Installation

Clone this repository

git clone https://github.com/maoyunyao/JOINT.git

Install dependencies

Please check the detailed installation instructions.

Training

The whole network is trained with 8 NVIDIA GTX 1080Ti GPUs

conda activate pytracking
cd ltr
python run_training.py joint joint_stage1  # stage 1
python run_training.py joint joint_stage2  # stage 2

Note: We initialize the backbone ResNet with pre-trained Mask-RCNN weights as in LWL. These weights can be obtained from here. Before training, you need to download and save these weights in env_settings().pretrained_networks directory.

Evaluation

conda activate pytracking
cd pytracking
python run_tracker.py joint joint_davis --dataset_name dv2017_val        # DAVIS 2017 Val
python run_tracker.py joint joint_ytvos --dataset_name yt2018_valid_all  # YouTube-VOS 2018 Val
python run_tracker.py joint joint_ytvos --dataset_name yt2019_valid_all  # YouTube-VOS 2019 Val

Note: Before evaluation, the pretrained networks (see model zoo) should be downloaded and saved into the directory set by "network_path" in "pytracking/evaluation/local.py". By default, it is set to pytracking/networks.

Model Zoo

Models

Model YouTube-VOS 2018 (Overall Score) YouTube-VOS 2019 (Overall Score) DAVIS 2017 val (J&F score) Links Raw Results
JOINT_ytvos 83.1 82.8 -- model results
JOINT_davis -- -- 83.5 model results

Acknowledgments

  • Our JOINT segmentation tracker is implemented based on pytracking. We sincerely thank the authors Martin Danelljan and Goutam Bhat for providing such a great framework.
  • We adopt the few-shot learner proposed in LWL as the Induction branch.
Owner
Yunyao
A postgraduate student in University of Science and Technology of China
Yunyao
ALBERT-pytorch-implementation - ALBERT pytorch implementation

ALBERT-pytorch-implementation developing... 모델의 개념이해를 돕기 위한 구현물로 현재 변수명을 상세히 적었고

BG Kim 3 Oct 06, 2022
A Haskell kernel for IPython.

IHaskell You can now try IHaskell directly in your browser at CoCalc or mybinder.org. Alternatively, watch a talk and demo showing off IHaskell featur

Andrew Gibiansky 2.4k Dec 29, 2022
BABEL: Bodies, Action and Behavior with English Labels [CVPR 2021]

BABEL is a large dataset with language labels describing the actions being performed in mocap sequences. BABEL labels about 43 hours of mocap sequences from AMASS [1] with action labels.

113 Dec 28, 2022
Implementation of the paper NAST: Non-Autoregressive Spatial-Temporal Transformer for Time Series Forecasting.

Non-AR Spatial-Temporal Transformer Introduction Implementation of the paper NAST: Non-Autoregressive Spatial-Temporal Transformer for Time Series For

Chen Kai 66 Nov 28, 2022
An open-source Deep Learning Engine for Healthcare that aims to treat & prevent major diseases

AlphaCare Background AlphaCare is a work-in-progress, open-source Deep Learning Engine for Healthcare that aims to treat and prevent major diseases. T

Siraj Raval 44 Nov 05, 2022
Official Implementation of HRDA: Context-Aware High-Resolution Domain-Adaptive Semantic Segmentation

HRDA: Context-Aware High-Resolution Domain-Adaptive Semantic Segmentation by Lukas Hoyer, Dengxin Dai, and Luc Van Gool [Arxiv] [Paper] Overview Unsup

Lukas Hoyer 149 Dec 28, 2022
Black-Box-Tuning - Black-Box Tuning for Language-Model-as-a-Service

Black-Box-Tuning Source code for paper "Black-Box Tuning for Language-Model-as-a

Tianxiang Sun 149 Jan 04, 2023
A project for developing transformer-based models for clinical relation extraction

Clinical Relation Extration with Transformers Aim This package is developed for researchers easily to use state-of-the-art transformers models for ext

uf-hobi-informatics-lab 101 Dec 19, 2022
GPU-accelerated Image Processing library using OpenCL

pyclesperanto pyclesperanto is a python package for clEsperanto - a multi-language framework for GPU-accelerated image processing. clEsperanto uses Op

17 Dec 25, 2022
Repository for "Improving evidential deep learning via multi-task learning," published in AAAI2022

Improving evidential deep learning via multi task learning It is a repository of AAAI2022 paper, “Improving evidential deep learning via multi-task le

deargen 11 Nov 19, 2022
Simple API for UCI Machine Learning Dataset Repository (search, download, analyze)

A simple API for working with University of California, Irvine (UCI) Machine Learning (ML) repository Table of Contents Introduction About Page of the

Tirthajyoti Sarkar 223 Dec 05, 2022
Official codebase for ICLR oral paper Unsupervised Vision-Language Grammar Induction with Shared Structure Modeling

CLIORA This is the official codebase for ICLR oral paper: Unsupervised Vision-Language Grammar Induction with Shared Structure Modeling. We introduce

Bo Wan 32 Dec 23, 2022
Web service for facial landmark detection, head pose estimation, facial action unit recognition, and eye-gaze estimation based on OpenFace 2.0

OpenGaze: Web Service for OpenFace Facial Behaviour Analysis Toolkit Overview OpenFace is a fantastic tool intended for computer vision and machine le

Sayom Shakib 4 Nov 03, 2022
PyVideoAI: Action Recognition Framework

This reposity contains official implementation of: Capturing Temporal Information in a Single Frame: Channel Sampling Strategies for Action Recognitio

Kiyoon Kim 22 Dec 29, 2022
Auto-Encoding Score Distribution Regression for Action Quality Assessment

DAE-AQA It is an open source program reference to paper Auto-Encoding Score Distribution Regression for Action Quality Assessment. 1.Introduction DAE

13 Nov 16, 2022
Joint detection and tracking model named DEFT, or ``Detection Embeddings for Tracking.

DEFT: Detection Embeddings for Tracking DEFT: Detection Embeddings for Tracking, Mohamed Chaabane, Peter Zhang, J. Ross Beveridge, Stephen O'Hara

Mohamed Chaabane 253 Dec 18, 2022
MoCoGAN: Decomposing Motion and Content for Video Generation

MoCoGAN: Decomposing Motion and Content for Video Generation This repository contains an implementation and further details of MoCoGAN: Decomposing Mo

Sergey Tulyakov 514 Dec 18, 2022
[ICCV 2021 Oral] Just Ask: Learning to Answer Questions from Millions of Narrated Videos

Just Ask: Learning to Answer Questions from Millions of Narrated Videos Webpage • Demo • Paper This repository provides the code for our paper, includ

Antoine Yang 87 Jan 05, 2023
Method for facial emotion recognition compitition of Xunfei and Datawhale .

人脸情绪识别挑战赛-第3名-W03KFgNOc-源代码、模型以及说明文档 队名:W03KFgNOc 排名:3 正确率: 0.75564 队员:yyMoming,xkwang,RichardoMu。 比赛链接:人脸情绪识别挑战赛 文章地址:link emotion 该项目分别训练八个模型并生成csv文

6 Oct 17, 2022
Interpretation of T cell states using reference single-cell atlases

Interpretation of T cell states using reference single-cell atlases ProjecTILs is a computational method to project scRNA-seq data into reference sing

Cancer Systems Immunology Lab 139 Jan 03, 2023