QueryInst: Parallelly Supervised Mask Query for Instance Segmentation

Overview

QueryInst: Parallelly Supervised Mask Query for Instance Segmentation

  • TL;DR: QueryInst is a simple and effective query based instance segmentation method driven by parallel supervision on dynamic mask heads, which outperforms previous arts in terms of both accuracy and speed.

QueryInst: Parallelly Supervised Mask Query for Instance Segmentation,

by Yuxin Fang*, Shusheng Yang*, Xinggang Wang†, Yu Li, Chen Fang, Ying Shan, Bin Feng, Wenyu Liu.

(*) equal contribution, (†) corresponding author.

arXiv technical report (arXiv 2105.01928)

QueryInst

  • This repo serves as the official implementation for QueryInst, based on mmdetection and built upon Sparse R-CNN & DETR. Implantations based on Detectron2 will be released in the near future.

  • This project is under active development, we will extend QueryInst to a wide range of instance-level recognition tasks.

Updates

[06/05/2021] 🌟 QueryInst training and inference code has been released!

Getting Started

python setup.py develop
  • Prepare datasets:
mkdir data && cd data
ln -s /path/to/coco coco
  • Training QueryInst with single GPU:
python tools/train.py configs/queryinst/queryinst_r50_fpn_1x_coco.py
  • Training QueryInst with multi GPUs:
./tools/dist_train.sh configs/queryinst/queryinst_r50_fpn_1x_coco.py 8
  • Test QueryInst on COCO val set with single GPU:
python tools/test.py configs/queryinst/queryinst_r50_fpn_1x_coco.py PATH/TO/CKPT.pth --eval bbox segm
  • Test QueryInst on COCO val set with multi GPUs:
./tools/dist_test.sh configs/queryinst/queryinst_r50_fpn_1x_coco.py PATH/TO/CKPT.pth 8 --eval bbox segm

Main Results on COCO val

Configs Aug. Weights Box AP Mask AP
QueryInst_R50_3x_300_queries 480 ~ 800, w/ Crop - 46.9 41.4
QueryInst_R101_3x_300_queries 480 ~ 800, w/ Crop - 48.0 42.4
QueryInst_X101-DCN_3x_300_queries 480 ~ 800, w/ Crop - 50.3 44.2

Citation

If you find our paper and code useful in your research, please consider giving a star ⭐ and citation ?? :

@article{QueryInst,
  title={QueryInst: Parallelly Supervised Mask Query for Instance Segmentation},
  author={Fang, Yuxin and Yang, Shusheng and Wang, Xinggang and Li, Yu and Fang, Chen and Shan, Ying and Feng, Bin and Liu, Wenyu},
  journal={arXiv preprint arXiv:2105.01928},
  year={2021}
}

TODO

  • QueryInst training and inference code.
  • QueryInst based on Detectron2 toolbox will be released in the near future.
  • QueryInst configurations for Cityscapes and YouTube-VIS.
  • QueryInst pretrain weights.
Owner
Hust Visual Learning Team
Hust Visual Learning Team belongs to the Artificial Intelligence Research Institute in the School of EIC in HUST
Hust Visual Learning Team
Background-Click Supervision for Temporal Action Localization

Background-Click Supervision for Temporal Action Localization This repository is the official implementation of BackTAL. In this work, we study the te

LeYang 221 Oct 09, 2022
Code of 3D Shape Variational Autoencoder Latent Disentanglement via Mini-Batch Feature Swapping for Bodies and Faces

3D Shape Variational Autoencoder Latent Disentanglement via Mini-Batch Feature Swapping for Bodies and Faces Installation After cloning the repo open

37 Dec 03, 2022
The first dataset on shadow generation for the foreground object in real-world scenes.

Object-Shadow-Generation-Dataset-DESOBA Object Shadow Generation is to deal with the shadow inconsistency between the foreground object and the backgr

BCMI 105 Dec 30, 2022
Multilingual Image Captioning

Multilingual Image Captioning Authors: Bhavitvya Malik, Gunjan Chhablani Demo Link: https://huggingface.co/spaces/flax-community/multilingual-image-ca

Gunjan Chhablani 32 Nov 25, 2022
Learning Visual Words for Weakly-Supervised Semantic Segmentation

[IJCAI 2021] Learning Visual Words for Weakly-Supervised Semantic Segmentation Implementation of IJCAI 2021 paper Learning Visual Words for Weakly-Sup

Lixiang Ru 24 Oct 05, 2022
[PNAS2021] The neural architecture of language: Integrative modeling converges on predictive processing

The neural architecture of language: Integrative modeling converges on predictive processing Code accompanying the paper The neural architecture of la

Martin Schrimpf 36 Dec 01, 2022
All public open-source implementations of convnets benchmarks

convnet-benchmarks Easy benchmarking of all public open-source implementations of convnets. A summary is provided in the section below. Machine: 6-cor

Soumith Chintala 2.7k Dec 30, 2022
Hummingbird compiles trained ML models into tensor computation for faster inference.

Hummingbird Introduction Hummingbird is a library for compiling trained traditional ML models into tensor computations. Hummingbird allows users to se

Microsoft 3.1k Dec 30, 2022
BOVText: A Large-Scale, Multidimensional Multilingual Dataset for Video Text Spotting

BOVText: A Large-Scale, Bilingual Open World Dataset for Video Text Spotting Updated on December 10, 2021 (Release all dataset(2021 videos)) Updated o

weijiawu 47 Dec 26, 2022
Individual Tree Crown classification on WorldView-2 Images using Autoencoder -- Group 9 Weak learners - Final Project (Machine Learning 2020 Course)

Created by Olga Sutyrina, Sarah Elemili, Abduragim Shtanchaev and Artur Bille Individual Tree Crown classification on WorldView-2 Images using Autoenc

2 Dec 08, 2022
render sprites into your desktop environment as shaped windows using GTK

spritegtk render static or animated sprites into your desktop environment as dynamic shaped windows using GTK requires pycairo and PYGobject: pip inst

hermit 20 Oct 27, 2022
CPU inference engine that delivers unprecedented performance for sparse models

The DeepSparse Engine is a CPU runtime that delivers unprecedented performance by taking advantage of natural sparsity within neural networks to reduce compute required as well as accelerate memory b

Neural Magic 1.2k Jan 09, 2023
TensorFlow-based implementation of "Pyramid Scene Parsing Network".

PSPNet_tensorflow Important Code is fine for inference. However, the training code is just for reference and might be only used for fine-tuning. If yo

HsuanKung Yang 323 Dec 20, 2022
Python Algorithm Interview Book Review

파이썬 μ•Œκ³ λ¦¬μ¦˜ 인터뷰 μ±… 리뷰 리뷰 IT λŒ€κΈ°μ—…μ— λ“€μ–΄κ°€κ³  싢은 λͺ©ν‘œκ°€ μžˆλ‹€. λ‚΄κ°€ κΏˆκΏ”μ˜¨ νšŒμ‚¬μ—μ„œ μΌν•˜λŠ” μ‚¬λžŒλ“€μ˜ λͺ¨μŠ΅μ„ 보면 λ©‹μžˆλ‹€κ³  생각이 λ“€κ³  λ‚˜μ˜ λͺ©ν‘œμ— λŒ€ν•œ 열망이 κ°•ν•΄μ§€λŠ” 것 κ°™λ‹€. 미래의 핡심 사업 쀑 ν•˜λ‚˜μΈ SW 뢀뢄을 이끌고 λ°œμ „μ‹œν‚€λŠ” μš°λ¦¬λ‚˜λΌμ˜ I

SharkBSJ 1 Dec 14, 2021
Code for: Gradient-based Hierarchical Clustering using Continuous Representations of Trees in Hyperbolic Space. Nicholas Monath, Manzil Zaheer, Daniel Silva, Andrew McCallum, Amr Ahmed. KDD 2019.

gHHC Code for: Gradient-based Hierarchical Clustering using Continuous Representations of Trees in Hyperbolic Space. Nicholas Monath, Manzil Zaheer, D

Nicholas Monath 35 Nov 16, 2022
(CVPR2021) Kaleido-BERT: Vision-Language Pre-training on Fashion Domain

Kaleido-BERT: Vision-Language Pre-training on Fashion Domain Mingchen Zhuge*, Dehong Gao*, Deng-Ping Fan#, Linbo Jin, Ben Chen, Haoming Zhou, Minghui

248 Dec 04, 2022
InsCLR: Improving Instance Retrieval with Self-Supervision

InsCLR: Improving Instance Retrieval with Self-Supervision This is an official PyTorch implementation of the InsCLR paper. Download Dataset Dataset Im

Zelu Deng 25 Aug 30, 2022
Training BERT with Compute/Time (Academic) Budget

Training BERT with Compute/Time (Academic) Budget This repository contains scripts for pre-training and finetuning BERT-like models with limited time

Intel Labs 263 Jan 07, 2023
Research Artifact of USENIX Security 2022 Paper: Automated Side Channel Analysis of Media Software with Manifold Learning

Automated Side Channel Analysis of Media Software with Manifold Learning Official implementation of USENIX Security 2022 paper: Automated Side Channel

Yuanyuan Yuan 175 Jan 07, 2023
Koopman operator identification library in Python

pykoop pykoop is a Koopman operator identification library written in Python. It allows the user to specify Koopman lifting functions and regressors i

DECAR Systems Group 34 Jan 04, 2023