Boundary-preserving Mask R-CNN (ECCV 2020)

Last update: Nov 28, 2022

Overview

BMaskR-CNN

This code is built on Detectron2.

Boundary-preserving Mask R-CNN
ECCV 2020
Tianheng Cheng, Xinggang Wang, Lichao Huang, Wenyu Liu

Video from Cam看世界 on YouTube.

Abstract

Tremendous efforts have been made to improve mask localization accuracy in instance segmentation. Modern instance segmentation methods relying on fully convolutional networks perform pixel-wise classification, which ignores object boundaries and shapes, leading to coarse and indistinct mask predictions and imprecise localization. To remedy these problems, we propose a conceptually simple yet effective Boundary-preserving Mask R-CNN (BMask R-CNN) that leverages object boundary information to improve mask localization accuracy. BMask R-CNN contains a boundary-preserving mask head in which object boundary and mask are mutually learned via feature fusion blocks. As a result, the predicted masks are better aligned with object boundaries. Without bells and whistles, BMask R-CNN outperforms Mask R-CNN by a considerable margin on the COCO dataset; on the Cityscapes dataset, where more accurate boundary ground truths are available, BMask R-CNN obtains remarkable improvements over Mask R-CNN. Moreover, it is not surprising that BMask R-CNN shows a larger improvement when the evaluation criterion requires better localization (e.g., AP₇₅).
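
To make the idea of "object boundary information" concrete, the following is a minimal sketch of one simple way to derive a boundary map from a binary instance mask: a foreground pixel counts as boundary if any of its four neighbours is background. This is an illustrative assumption, not necessarily how the authors build their boundary ground truths; the function name `mask_to_boundary` is hypothetical and is not part of this repository.

```python
def mask_to_boundary(mask):
    """Return a binary map marking foreground pixels that touch background.

    `mask` is a list of equal-length rows of 0/1 values. Pixels outside the
    image are treated as background, so objects touching the image border
    get a boundary there as well. (Illustrative helper, not repository code.)
    """
    height, width = len(mask), len(mask[0])

    def is_foreground(y, x):
        return 0 <= y < height and 0 <= x < width and mask[y][x] == 1

    boundary = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if mask[y][x] != 1:
                continue
            # 4-connected neighbourhood: up, down, left, right.
            neighbours = ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
            if not all(is_foreground(ny, nx) for ny, nx in neighbours):
                boundary[y][x] = 1
    return boundary


mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
for row in mask_to_boundary(mask):
    print(row)
```

For the 3x3 square above, every foreground pixel except the centre one is marked as boundary.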

Models

COCO

Method	Backbone	lr sched	AP	AP₅₀	AP₇₅	AP_s	AP_m	AP_l	download
Mask R-CNN	R50-FPN	1x	35.2	56.3	37.5	17.2	37.7	50.3	-
PointRend	R50-FPN	1x	36.2	56.6	38.6	17.1	38.8	52.5	-
BMask R-CNN	R50-FPN	1x	36.6	56.7	39.4	17.3	38.8	53.8	model
BMask R-CNN	R101-FPN	1x	38.0	58.6	40.9	17.6	40.6	56.8	model
Cascade Mask R-CNN	R50-FPN	1x	36.4	56.9	39.2	17.5	38.7	52.5	-
Cascade BMask R-CNN	R50-FPN	1x	37.5	57.3	40.7	17.5	39.8	55.1	model
Cascade BMask R-CNN	R101-FPN	1x	39.1	59.2	42.4	18.6	42.2	57.4	model
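
As a quick check on the R50-FPN rows above, the short sketch below computes the per-metric gain of BMask R-CNN over Mask R-CNN. The numbers are copied from the table; the variable names are only for illustration.

```python
# Values copied from the COCO table (R50-FPN, 1x schedule).
METRICS = ("AP", "AP50", "AP75", "APs", "APm", "APl")
mask_rcnn = (35.2, 56.3, 37.5, 17.2, 37.7, 50.3)
bmask_rcnn = (36.6, 56.7, 39.4, 17.3, 38.8, 53.8)

# Round to one decimal to match the table's precision.
gains = {
    name: round(ours - base, 1)
    for name, base, ours in zip(METRICS, mask_rcnn, bmask_rcnn)
}

for name, gain in gains.items():
    print(f"{name}: +{gain}")
```

The gain at AP₇₅ (+1.9) is noticeably larger than at AP₅₀ (+0.4), which matches the abstract's remark that the improvement grows when the criterion demands better localization.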

Cityscapes

Initialized from ImageNet pre-training.

Method	Backbone	lr sched	AP	download
PointRend	R50-FPN	1x	35.9	-
BMask R-CNN	R50-FPN	1x	36.2	model

Results

Left: AP curves of Mask R-CNN and BMask R-CNN under different mask IoU thresholds on the COCO val2017 set; the improvement becomes more significant as the IoU threshold increases. Right: Visualizations of Mask R-CNN and BMask R-CNN. BMask R-CNN outputs more precise boundaries and more accurate masks than Mask R-CNN.
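
To illustrate why higher IoU thresholds reward precise boundaries, here is a toy sketch that computes mask IoU for two predictions of the same square object, one shifted by 1 pixel and one by 2 pixels. The masks and helper names (`mask_iou`, `square_mask`) are made up for illustration and are not taken from the evaluation code.

```python
def mask_iou(pred, gt):
    """Intersection over union of two binary masks given as lists of rows."""
    inter = union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            inter += 1 if (p and g) else 0
            union += 1 if (p or g) else 0
    return inter / union if union else 0.0


def square_mask(size, top, left, side):
    """Binary size x size mask containing one filled side x side square."""
    return [
        [1 if top <= y < top + side and left <= x < left + side else 0
         for x in range(size)]
        for y in range(size)
    ]


gt = square_mask(20, top=5, left=5, side=10)
precise = square_mask(20, top=5, left=6, side=10)  # off by 1 pixel
coarse = square_mask(20, top=5, left=7, side=10)   # off by 2 pixels

for name, pred in (("precise", precise), ("coarse", coarse)):
    iou = mask_iou(pred, gt)
    print(f"{name}: IoU={iou:.3f}  passes@0.5={iou >= 0.5}  passes@0.75={iou >= 0.75}")
```

In this toy case both predictions count as correct at an IoU threshold of 0.5, but only the 1-pixel-off mask (IoU about 0.818) still counts at 0.75; the 2-pixel-off mask (IoU about 0.667) does not.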

Usage

Install Detectron2 following the official instructions.

Training

Specify a config file and train a model with 4 GPUs:

cd projects/BMaskR-CNN
python train_net.py --config-file configs/bmask_rcnn_R_50_FPN_1x.yaml --num-gpus 4
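
If fewer GPUs are available, a common practice with Detectron2 configs is to scale the total batch size and the base learning rate linearly with the GPU count. The sketch below builds such a command using the `SOLVER.IMS_PER_BATCH` and `SOLVER.BASE_LR` overrides. The helper `scaled_train_command` is hypothetical, and the reference batch size and learning rate in the example are placeholder assumptions: read the real values from the config file before using this.

```python
def scaled_train_command(num_gpus, ref_gpus, ref_batch, ref_lr, config_file):
    """Build a train_net.py command with linearly scaled batch size and LR.

    ref_gpus, ref_batch and ref_lr describe the setup the config was written
    for; they must be read from the config, they are not known here.
    """
    if ref_batch % ref_gpus != 0:
        raise ValueError("reference batch size must be divisible by ref_gpus")
    per_gpu = ref_batch // ref_gpus           # images per GPU stays constant
    batch = per_gpu * num_gpus                # new total batch size
    lr = ref_lr * batch / ref_batch           # linear learning-rate scaling
    return [
        "python", "train_net.py",
        "--config-file", config_file,
        "--num-gpus", str(num_gpus),
        "SOLVER.IMS_PER_BATCH", str(batch),
        "SOLVER.BASE_LR", f"{lr:g}",
    ]


# Placeholder reference values (assumed for illustration only).
command = scaled_train_command(
    num_gpus=2,
    ref_gpus=4,
    ref_batch=16,
    ref_lr=0.02,
    config_file="configs/bmask_rcnn_R_50_FPN_1x.yaml",
)
print(" ".join(command))
```

The returned list can be passed to `subprocess.run` from inside `projects/BMaskR-CNN`. Training with a smaller batch may also call for a longer schedule, which this sketch does not adjust.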

Evaluation

Specify a config file and evaluate a trained model:

cd projects/BMaskR-CNN
python train_net.py --config-file configs/bmask_rcnn_R_50_FPN_1x.yaml --num-gpus 4 --eval-only MODEL.WEIGHTS /path/to/model

Citation

@inproceedings{ChengWHL20,
  title={Boundary-preserving Mask R-CNN},
  author={Tianheng Cheng and Xinggang Wang and Lichao Huang and Wenyu Liu},
  booktitle={ECCV},
  year={2020}
}

Owner

Hust Visual Learning Team
