Deep Occlusion-Aware Instance Segmentation with Overlapping BiLayers [CVPR 2021]

Overview

Deep Occlusion-Aware Instance Segmentation with Overlapping BiLayers [BCNet, CVPR 2021]

This is the official pytorch implementation of BCNet built on the open-source detectron2.

Deep Occlusion-Aware Instance Segmentation with Overlapping BiLayers
Lei Ke, Yu-Wing Tai, Chi-Keung Tang
CVPR 2021

  • Two-stage instance segmentation with state-of-the-art performance.
  • Image formation as composition of two overlapping layers.
  • Bilayer decoupling for the occluder and occludee.
  • Efficacy on both the FCOS and Faster R-CNN detectors.

Under construction. Our code and pretrained model will be fully released in two months.

Visualization of Occluded Objects

Qualitative instance segmentation results of our BCNet, using ResNet-101-FPN and Faster R-CNN detector. The bottom row visualizes squared heatmap of contour and mask predictions by the two GCN layers for the occluder and occludee in the same ROI region specified by the red bounding box, which also makes the final segmentation result of BCNet more explainable than previous methods.

Qualitative instance segmentation results of our BCNet, using ResNet-101-FPN and FCOS detector.

Results on COCO test-dev

(Check Table 8 of the paper for full results, all methods are trained on COCO train2017)

Detector Backbone Method mAP(mask)
Faster R-CNN ResNet-50 FPN Mask R-CNN 34.2
Faster R-CNN ResNet-50 FPN MS R-CNN 35.6
Faster R-CNN ResNet-50 FPN PointRend 36.3
Faster R-CNN ResNet-50 FPN PANet 36.6
Faster R-CNN ResNet-50 FPN BCNet 38.4
Faster R-CNN ResNet-101 FPN Mask R-CNN 36.1
Faster R-CNN ResNet-101 FPN BMask R-CNN 37.7
Faster R-CNN ResNet-101 FPN MS R-CNN 38.3
Faster R-CNN ResNet-101 FPN BCNet 39.8, [Pretrained Model]
FCOS ResNet-101 FPN SipMask 37.8
FCOS ResNet-101 FPN BlendMask 38.4
FCOS ResNet-101 FPN CenterMask 38.3
FCOS ResNet-101 FPN BCNet 39.6, [Pretrained Model]

Introduction

Segmenting highly-overlapping objects is challenging, because typically no distinction is made between real object contours and occlusion boundaries. Unlike previous two-stage instance segmentation methods, BCNet models image formation as composition of two overlapping layers, where the top GCN layer detects the occluding objects (occluder) and the bottom GCN layer infers partially occluded instance (occludee). The explicit modeling of occlusion relationship with bilayer structure naturally decouples the boundaries of both the occluding and occluded instances, and considers the interaction between them during mask regression. We validate the efficacy of bilayer decoupling on both one-stage and two-stage object detectors with different backbones and network layer choices. The network of BCNet is as follows:

Step-by-step Installation

conda create -n bcnet python=3.7 -y
source activate bcnet
 
conda install pytorch==1.4.0 torchvision==0.5.0 cudatoolkit=10.1 -c pytorch
 
# FCOS and coco api and visualization dependencies
pip install ninja yacs cython matplotlib tqdm
pip install opencv-python==4.4.0.40
 
export INSTALL_DIR=$PWD
 
# install pycocotools. Please make sure you have installed cython.
cd $INSTALL_DIR
git clone https://github.com/cocodataset/cocoapi.git
cd cocoapi/PythonAPI
python setup.py build_ext install
 
# install BCNet
cd $INSTALL_DIR
git clone https://github.com/lkeab/BCNet.git
cd BCNet/
python3 setup.py build develop
 
unset INSTALL_DIR

Dataset Preparation

Prepare for coco2017 dataset following this instruction. And use our converted mask annotations to replace original annotation file for bilayer decoupling training.

  mkdir -p datasets/coco
  ln -s /path_to_coco_dataset/annotations datasets/coco/annotations
  ln -s /path_to_coco_dataset/train2017 datasets/coco/train2017
  ln -s /path_to_coco_dataset/test2017 datasets/coco/test2017
  ln -s /path_to_coco_dataset/val2017 datasets/coco/val2017

Multi-GPU Training and evaluation on Validation set

bash all.sh

Or

CUDA_VISIBLE_DEVICES=0,1 python3 tools/train_net.py --num-gpus 2 \
	--config-file configs/fcos/fcos_imprv_R_50_FPN_1x.yaml 2>&1 | tee log/train_log.txt

Pretrained Models

TBD

  mkdir pretrained_models
  #And put the downloaded pretrained models in this directory.

Testing on Test-dev

TBD

bash eval.sh

Citations

If you find BCNet useful in your research, please star this repository and consider citing:

@inproceedings{ke2021bcnet,
    author = {Ke, Lei and Tai, Yu-Wing and Tang, Chi-Keung},
    title = {Deep Occlusion-Aware Instance Segmentation with Overlapping BiLayers},
    booktitle = {CVPR},
    year = {2021},
}   

License

BCNet is released under the MIT license. See LICENSE for additional details. Thanks to the Third Party Libs detectron2

Owner
Lei Ke
PhD student in Computer Vision, HKUST
Lei Ke
PaddleRobotics is an open-source algorithm library for robots based on Paddle, including open-source parts such as human-robot interaction, complex motion control, environment perception, SLAM positioning, and navigation.

简体中文 | English PaddleRobotics paddleRobotics是基于paddle的机器人开源算法库集,包括人机交互、复杂运动控制、环境感知、slam定位导航等开源算法部分。 人机交互 主动多模交互技术TFVT-HRI 主动多模交互技术是通过视觉、语音、触摸传感器等输入机器人

185 Dec 26, 2022
Official implementation of the ICML2021 paper "Elastic Graph Neural Networks"

ElasticGNN This repository includes the official implementation of ElasticGNN in the paper "Elastic Graph Neural Networks" [ICML 2021]. Xiaorui Liu, W

liuxiaorui 34 Dec 04, 2022
Structure Information is the Key: Self-Attention RoI Feature Extractor in 3D Object Detection

Structure Information is the Key: Self-Attention RoI Feature Extractor in 3D Object Detection abstract:Unlike 2D object detection where all RoI featur

DK. Zhang 2 Oct 07, 2022
CRLT: A Unified Contrastive Learning Toolkit for Unsupervised Text Representation Learning

CRLT: A Unified Contrastive Learning Toolkit for Unsupervised Text Representation Learning This repository contains the code and relevant instructions

XiaoMing 5 Aug 19, 2022
WPPNets: Unsupervised CNN Training with Wasserstein Patch Priors for Image Superresolution

WPPNets: Unsupervised CNN Training with Wasserstein Patch Priors for Image Superresolution This code belongs to the paper [1] available at https://arx

Fabian Altekrueger 5 Jun 02, 2022
The BCNet related data and inference model.

BCNet This repository includes the some source code and related dataset of paper BCNet: Learning Body and Cloth Shape from A Single Image, ECCV 2020,

81 Dec 12, 2022
(AAAI2020)Grapy-ML: Graph Pyramid Mutual Learning for Cross-dataset Human Parsing

Grapy-ML: Graph Pyramid Mutual Learning for Cross-dataset Human Parsing This repository contains pytorch source code for AAAI2020 oral paper: Grapy-ML

54 Aug 04, 2022
Medical Image Segmentation using Squeeze-and-Expansion Transformers

Medical Image Segmentation using Squeeze-and-Expansion Transformers Introduction This repository contains the code of the IJCAI'2021 paper 'Medical Im

askerlee 172 Dec 20, 2022
Py-FEAT: Python Facial Expression Analysis Toolbox

Py-FEAT is a suite for facial expressions (FEX) research written in Python. This package includes tools to detect faces, extract emotional facial expressions (e.g., happiness, sadness, anger), facial

Computational Social Affective Neuroscience Laboratory 147 Jan 06, 2023
🐦 Quickly annotate data from the comfort of your Jupyter notebook

🐦 pigeon - Quickly annotate data on Jupyter Pigeon is a simple widget that lets you quickly annotate a dataset of unlabeled examples from the comfort

Anastasis Germanidis 647 Jan 05, 2023
Fast, flexible and easy to use probabilistic modelling in Python.

Please consider citing the JMLR-MLOSS Manuscript if you've used pomegranate in your academic work! pomegranate is a package for building probabilistic

Jacob Schreiber 3k Dec 29, 2022
TorchGeo is a PyTorch domain library, similar to torchvision, that provides datasets, transforms, samplers, and pre-trained models specific to geospatial data.

TorchGeo is a PyTorch domain library, similar to torchvision, that provides datasets, transforms, samplers, and pre-trained models specific to geospatial data.

Microsoft 1.3k Dec 30, 2022
The end-to-end platform for building voice products at scale

Picovoice Made in Vancouver, Canada by Picovoice Picovoice is the end-to-end platform for building voice products on your terms. Unlike Alexa and Goog

Picovoice 318 Jan 07, 2023
Flow is a computational framework for deep RL and control experiments for traffic microsimulation.

Flow Flow is a computational framework for deep RL and control experiments for traffic microsimulation. See our website for more information on the ap

867 Jan 02, 2023
RTSeg: Real-time Semantic Segmentation Comparative Study

Real-time Semantic Segmentation Comparative Study The repository contains the official TensorFlow code used in our papers: RTSEG: REAL-TIME SEMANTIC S

Mennatullah Siam 592 Nov 18, 2022
A simple API wrapper for Discord interactions.

Your ultimate Discord interactions library for discord.py. About | Installation | Examples | Discord | PyPI About What is discord-py-interactions? dis

james 641 Jan 03, 2023
The source code and dataset for the RecGURU paper (WSDM 2022)

RecGURU About The Project Source code and baselines for the RecGURU paper "RecGURU: Adversarial Learning of Generalized User Representations for Cross

Chenglin Li 17 Jan 07, 2023
Self-Supervised Speech Pre-training and Representation Learning Toolkit.

What's New Sep 2021: We host a challenge in AAAI workshop: The 2nd Self-supervised Learning for Audio and Speech Processing! See SUPERB official site

s3prl 1.6k Jan 08, 2023
An Empirical Investigation of Model-to-Model Distribution Shifts in Trained Convolutional Filters

CNN-Filter-DB An Empirical Investigation of Model-to-Model Distribution Shifts in Trained Convolutional Filters Paul Gavrikov, Janis Keuper Paper: htt

Paul Gavrikov 18 Dec 30, 2022
Channel Pruning for Accelerating Very Deep Neural Networks (ICCV'17)

Channel Pruning for Accelerating Very Deep Neural Networks (ICCV'17)

Yihui He 1k Jan 03, 2023