Open-source code for Generic Grouping Network (GGN, CVPR 2022)

Overview

Open-World Instance Segmentation: Exploiting Pseudo Ground Truth From Learned Pairwise Affinity

Pytorch implementation for "Open-World Instance Segmentation: Exploiting Pseudo Ground Truth From Learned Pairwise Affinity" (CVPR 2022, link TBD) by Weiyao Wang, Matt Feiszli, Heng Wang, Jitendra Malik, and Du Tran. We propose a framework for open-world instance segmentation, Generic Grouping Network (GGN), which exploits pseudo Ground Truth training strategy. On the same backbone, GGN produces impressive AR gains compared to closed-world training on cross-category generalization (+11% VOC to Non-VOC) and cross-dataset generalization (+5.2% COCO to UVO).

What is it? Open-world instance segmentation requires a model to group pixels into object instances without a pre-defined taxonomy, that is, both "seen" categories (those present during training) and "unseen" categories (not seen during training). There is generally a large performance gap between the seen and unseen domains. For example, a baseline Mask R-CNN miss 15 annotated masks in the example below. Without additional training data or annotations, Mask R-CNN trained with GGN framework produces 9 more segments correctly, being much closer to ground truth annotations.

How we do it? Our approach first learns a pairwise affinity predictor that captures correctly if two pixels belong to same instance or not. We demonstrate such pairwise affinity representation generalizes well to unseen domains. We then use a grouping module (e.g. MCG) to extract and rank segments from predicted PA. We can run this on any image dataset without using annotations; we extract highest ranked segments as "pseudo ground truth" candidate masks. This is a large and category-agnostic set; we add it to our (much smaller) datasets of curated annotations to train a detector.


About the code. This repo is built based on mmdetection with the addition of OLN backbone (concurrent work). The repo is tested under Python 3.7, PyTorch 1.7.0, Cuda 11.0, and mmcv==1.2.5. We thank authors of OLN for releasing their work to facilitate research.

Model zoo

Below we release PA predictor models, pseudo-GT generated by PA predictors and GGN trained with both annotated-GT and pseudo-GT. We also release some of the processed annotations from LVIS to conduct cross-category generalization experiments.

Training Eval url Baseline AR GGN AR Top-K Pseudo
Person, COCO Non-Person, COCO PA/Pseudo/GGN 4.9 20.9 3
VOC, COCO Non-VOC, COCO PA/Pseudo/Pseudo-OLN/ GGN/GGN-OLN 19.9 28.7 (33.7 with OLN) 3
COCO, LVIS Non-COCO, LVIS PA/Pseudo/GGN 16.5 20.4 1
Non-COCO, LVIS COCO PA/Pseudo/GGN 21.7 23.6 1
COCO UVO PA/Pseudo/GGN 40.1 43.4 3
COCO, random init ImageNet PA/Pseudo/GGN 10

We remark using large-scale pre-training in the last row as initialization and finetune GGN on COCO with pseudo-GT on COCO gives further improvement (45.3 on UVO), with model.

Installation

This repo is built based on mmdetection.

You can use following commands to create conda env with related dependencies.

conda create -n ggn python=3.7 -y
conda activate ggn
conda install pytorch=1.7.0 torchvision cudatoolkit=11.0 -c pytorch -y
pip install mmcv-full
pip install -r requirements.txt
pip install -v -e .

Please also refer to get_started.md for more details of installation.

Next you will need to build the library for our grouping module:

cd pa_lib/cython_lib
python3 setup.py build_ext --inplace

Data Preparation

Download and extract COCO 2017 train and val images with annotations from http://cocodataset.org. We expect the directory structure to be the following:

path/to/coco/
  annotations/  # annotation json files
  train2017/    # train images
  val2017/      # val images

Our work also uses LVIS, UVO and ADE20K. To use ADE20K, please convert them into COCO-style annotations.

Training of pairwise affinity predictor

bash tools/dist_train.sh configs/pairwise_affinity/pa_train.py ${NUM_GPUS} --work-dir ${WORK_DIR}

Test PA

We provide a tool tools/test_pa.py to directly evaluate PA performance (e.g. on PA prediction and on grouped masks).

python tools/test_pa.py configs/pairwise_affinity/pa_train.py ${WORK_DIR}/latest.pth --eval pa --eval-proposals --test-partition nonvoc

Extracting pseudo-GT masks

We first begin by extracting masks. Example config pa_extract.py extracts pseudo-GT masks from PA trained on VOC subsets of COCO. use-gt-masks flag asks the pipeline to compute maximum IoU an extracted masks has with the GT. It is recommended to split the dataset into multiple shards to run extractions. On original image resolution and Nvidia V100 machine, it takes about 4.8s per image to run the full pipeline (compute PA, run grouping, ranking then compute IoU with annotated GT) without globalization and trained ranker or 10s with globalization and trained ranker.

python tools/extract_pa_masks.py configs/pairwise_affinity/pa_extract.py ${PA_MODEL_PATH} --out ${OUT_DIR}/masks.json --use-gt-masks 1

The extracted masks will be stored in JSON with the following format

[
  [segm1, segm2,..., segm20] ## Result of an image
  ...
]

We refer to tools/merge_annotations.py for reference on formatting the extracted masks as a new COCO-style annotation file. We remark that tools/interpolate_extracted_masks.py may be necessary if not running extraction on original image resolution.

Training of GGN

Please specify additional_ann_file with the extracted pseudo-GT in previous step in class_agn_mask_rcnn_pa.py.

bash tools/dist_train.sh configs/mask_rcnn/class_agn_mask_rcnn_pa.py ${NUM_GPUS}

class_agn_mask_rcnn_gn_online.py is used to train ImageNet extracted masks since there are too many annotations and we cannot store everything in a single json file without OOM. We will need to break it into per-image annotations in the format of "{image_id}.json".

Testing

python tools/test.py configs/mask_rcnn/class_agn_mask_rcnn.py ${WORK_DIR}/latest.pth --eval segm

To cite this work

@article{wang2022ggn,
  title={Open-World Instance Segmentation: Exploiting Pseudo Ground Truth From Learned Pairwise Affinity},
  author={Wang, Weiyao and Feiszli, Matt and Wang, Heng and Malik, Jitendra and Tran, Du},
  journal={CVPR},
  year={2022}
}

License

This project is under the CC-BY-NC 4.0 license. See LICENSE for details.

Owner
Meta Research
Meta Research
AirPose: Multi-View Fusion Network for Aerial 3D Human Pose and Shape Estimation

AirPose AirPose: Multi-View Fusion Network for Aerial 3D Human Pose and Shape Estimation Check the teaser video This repository contains the code of A

Robot Perception Group 41 Dec 05, 2022
Quadruped-command-tracking-controller - Quadruped command tracking controller (flat terrain)

Quadruped command tracking controller (flat terrain) Prepare Install RAISIM link

Yunho Kim 4 Oct 20, 2022
Supplementary code for the paper "Meta-Solver for Neural Ordinary Differential Equations" https://arxiv.org/abs/2103.08561

Meta-Solver for Neural Ordinary Differential Equations Towards robust neural ODEs using parametrized solvers. Main idea Each Runge-Kutta (RK) solver w

Julia Gusak 25 Aug 12, 2021
[ACL-IJCNLP 2021] "EarlyBERT: Efficient BERT Training via Early-bird Lottery Tickets"

EarlyBERT This is the official implementation for the paper in ACL-IJCNLP 2021 "EarlyBERT: Efficient BERT Training via Early-bird Lottery Tickets" by

VITA 13 May 11, 2022
CVPR 2021: "Generating Diverse Structure for Image Inpainting With Hierarchical VQ-VAE"

Diverse Structure Inpainting ArXiv | Papar | Supplementary Material | BibTex This repository is for the CVPR 2021 paper, "Generating Diverse Structure

152 Nov 04, 2022
Evaluation toolkit of the informative tracking benchmark comprising 9 scenarios, 180 diverse videos, and new challenges.

Informative-tracking-benchmark Informative tracking benchmark (ITB) higher diversity. It contains 9 representative scenarios and 180 diverse videos. m

Xin Li 15 Nov 26, 2022
PyTorch implementation of 1712.06087 "Zero-Shot" Super-Resolution using Deep Internal Learning

Unofficial PyTorch implementation of "Zero-Shot" Super-Resolution using Deep Internal Learning Unofficial Implementation of 1712.06087 "Zero-Shot" Sup

Jacob Gildenblat 196 Nov 27, 2022
This is an unofficial implementation of the paper “Student-Teacher Feature Pyramid Matching for Unsupervised Anomaly Detection”.

This is an unofficial implementation of the paper “Student-Teacher Feature Pyramid Matching for Unsupervised Anomaly Detection”.

haifeng xia 32 Oct 26, 2022
Position detection system of mobile robot in the warehouse enviroment

Autonomous-Forklift-System About | GUI | Tests | Starting | License | Author | 🎯 About An application that run the autonomous forklift paletization a

Kamil Goś 1 Nov 24, 2021
Deep Learning for humans

Keras: Deep Learning for Python Under Construction In the near future, this repository will be used once again for developing the Keras codebase. For

Keras 57k Jan 09, 2023
State-to-Distribution (STD) Model

State-to-Distribution (STD) Model In this repository we provide exemplary code on how to construct and evaluate a state-to-distribution (STD) model fo

<a href=[email protected]"> 2 Apr 07, 2022
Unofficial PyTorch Implementation of "Augmenting Convolutional networks with attention-based aggregation"

Pytorch Implementation of Augmenting Convolutional networks with attention-based aggregation This is the unofficial PyTorch Implementation of "Augment

DK 20 Sep 09, 2022
Scalable Multi-Agent Reinforcement Learning

Scalable Multi-Agent Reinforcement Learning 1. Featured algorithms: Value Function Factorization with Variable Agent Sub-Teams (VAST) [1] 2. Implement

3 Aug 02, 2022
Unsupervised Representation Learning by Invariance Propagation

Unsupervised Learning by Invariance Propagation This repository is the official implementation of Unsupervised Learning by Invariance Propagation. Pre

FengWang 15 Jul 06, 2022
Dense Unsupervised Learning for Video Segmentation (NeurIPS*2021)

Dense Unsupervised Learning for Video Segmentation This repository contains the official implementation of our paper: Dense Unsupervised Learning for

Visual Inference Lab @TU Darmstadt 173 Dec 26, 2022
Machine learning, in numpy

numpy-ml Ever wish you had an inefficient but somewhat legible collection of machine learning algorithms implemented exclusively in NumPy? No? Install

David Bourgin 11.6k Dec 30, 2022
This is an early in-development version of training CLIP models with hivemind.

A transformer that does not hog your GPU memory This is an early in-development codebase: if you want a stable and documented hivemind codebase, look

<a href=[email protected]"> 4 Nov 06, 2022
Mscp jamf - Build compliance in jamf

mscp_jamf Build compliance in Jamf. This will build the following xml pieces to

Bob Gendler 3 Jul 25, 2022
Builds a LoRa radio frequency fingerprint identification (RFFI) system based on deep learning techiniques

This project builds a LoRa radio frequency fingerprint identification (RFFI) system based on deep learning techiniques.

20 Dec 30, 2022
Human-Pose-and-Motion History

Human Pose and Motion Scientist Approach Eadweard Muybridge, The Galloping Horse Portfolio, 1887 Etienne-Jules Marey, Descent of Inclined Plane, Chron

Daito Manabe 47 Dec 16, 2022