SOLO and SOLOv2 for instance segmentation, ECCV 2020 & NeurIPS 2020.

Overview

SOLO: Segmenting Objects by Locations

This project hosts the code for implementing the SOLO algorithms for instance segmentation.

SOLO: Segmenting Objects by Locations,
Xinlong Wang, Tao Kong, Chunhua Shen, Yuning Jiang, Lei Li
In: Proc. European Conference on Computer Vision (ECCV), 2020
arXiv preprint (arXiv 1912.04488)

SOLOv2: Dynamic and Fast Instance Segmentation,
Xinlong Wang, Rufeng Zhang, Tao Kong, Lei Li, Chunhua Shen
In: Proc. Advances in Neural Information Processing Systems (NeurIPS), 2020
arXiv preprint (arXiv 2003.10152)

highlights

Highlights

  • Totally box-free: SOLO is totally box-free thus not being restricted by (anchor) box locations and scales, and naturally benefits from the inherent advantages of FCNs.
  • Direct instance segmentation: Our method takes an image as input, directly outputs instance masks and corresponding class probabilities, in a fully convolutional, box-free and grouping-free paradigm.
  • High-quality mask prediction: SOLOv2 is able to predict fine and detailed masks, especially at object boundaries.
  • State-of-the-art performance: Our best single model based on ResNet-101 and deformable convolutions achieves 41.7% in AP on COCO test-dev (without multi-scale testing). A light-weight version of SOLOv2 executes at 31.3 FPS on a single V100 GPU and yields 37.1% AP.

Updates

  • SOLOv2 implemented on detectron2 is released at adet. (07/12/20)
  • Training speeds up (~1.7x faster) for all models. (03/12/20)
  • SOLOv2 is available. Code and trained models of SOLOv2 are released. (08/07/2020)
  • Light-weight models and R101-based models are available. (31/03/2020)
  • SOLOv1 is available. Code and trained models of SOLO and Decoupled SOLO are released. (28/03/2020)

Installation

This implementation is based on mmdetection(v1.0.0). Please refer to INSTALL.md for installation and dataset preparation.

Models

For your convenience, we provide the following trained models on COCO (more models are coming soon). If you need the models in PaddlePaddle framework, please refer to paddlepaddle/README.md.

Model Multi-scale training Testing time / im AP (minival) Link
SOLO_R50_1x No 77ms 32.9 download
SOLO_R50_3x Yes 77ms 35.8 download
SOLO_R101_3x Yes 86ms 37.1 download
Decoupled_SOLO_R50_1x No 85ms 33.9 download
Decoupled_SOLO_R50_3x Yes 85ms 36.4 download
Decoupled_SOLO_R101_3x Yes 92ms 37.9 download
SOLOv2_R50_1x No 54ms 34.8 download
SOLOv2_R50_3x Yes 54ms 37.5 download
SOLOv2_R101_3x Yes 66ms 39.1 download
SOLOv2_R101_DCN_3x Yes 97ms 41.4 download
SOLOv2_X101_DCN_3x Yes 169ms 42.4 download

Light-weight models:

Model Multi-scale training Testing time / im AP (minival) Link
Decoupled_SOLO_Light_R50_3x Yes 29ms 33.0 download
Decoupled_SOLO_Light_DCN_R50_3x Yes 36ms 35.0 download
SOLOv2_Light_448_R18_3x Yes 19ms 29.6 download
SOLOv2_Light_448_R34_3x Yes 20ms 32.0 download
SOLOv2_Light_448_R50_3x Yes 24ms 33.7 download
SOLOv2_Light_512_DCN_R50_3x Yes 34ms 36.4 download

Disclaimer:

  • Light-weight means light-weight backbone, head and smaller input size. Please refer to the corresponding config files for details.
  • This is a reimplementation and the numbers are slightly different from our original paper (within 0.3% in mask AP).

Usage

A quick demo

Once the installation is done, you can download the provided models and use inference_demo.py to run a quick demo.

Train with multiple GPUs

./tools/dist_train.sh ${CONFIG_FILE} ${GPU_NUM}

Example: 
./tools/dist_train.sh configs/solo/solo_r50_fpn_8gpu_1x.py  8

Train with single GPU

python tools/train.py ${CONFIG_FILE}

Example:
python tools/train.py configs/solo/solo_r50_fpn_8gpu_1x.py

Testing

# multi-gpu testing
./tools/dist_test.sh ${CONFIG_FILE} ${CHECKPOINT_FILE} ${GPU_NUM}  --show --out  ${OUTPUT_FILE} --eval segm

Example: 
./tools/dist_test.sh configs/solo/solo_r50_fpn_8gpu_1x.py SOLO_R50_1x.pth  8  --show --out results_solo.pkl --eval segm

# single-gpu testing
python tools/test_ins.py ${CONFIG_FILE} ${CHECKPOINT_FILE} --show --out  ${OUTPUT_FILE} --eval segm

Example: 
python tools/test_ins.py configs/solo/solo_r50_fpn_8gpu_1x.py  SOLO_R50_1x.pth --show --out  results_solo.pkl --eval segm

Visualization

python tools/test_ins_vis.py ${CONFIG_FILE} ${CHECKPOINT_FILE} --show --save_dir  ${SAVE_DIR}

Example: 
python tools/test_ins_vis.py configs/solo/solo_r50_fpn_8gpu_1x.py  SOLO_R50_1x.pth --show --save_dir  work_dirs/vis_solo

Contributing to the project

Any pull requests or issues are welcome.

Citations

Please consider citing our papers in your publications if the project helps your research. BibTeX reference is as follows.

@inproceedings{wang2020solo,
  title     =  {{SOLO}: Segmenting Objects by Locations},
  author    =  {Wang, Xinlong and Kong, Tao and Shen, Chunhua and Jiang, Yuning and Li, Lei},
  booktitle =  {Proc. Eur. Conf. Computer Vision (ECCV)},
  year      =  {2020}
}

@article{wang2020solov2,
  title={SOLOv2: Dynamic and Fast Instance Segmentation},
  author={Wang, Xinlong and Zhang, Rufeng and  Kong, Tao and Li, Lei and Shen, Chunhua},
  journal={Proc. Advances in Neural Information Processing Systems (NeurIPS)},
  year={2020}
}

License

For academic use, this project is licensed under the 2-clause BSD License - see the LICENSE file for details. For commercial use, please contact Xinlong Wang and Chunhua Shen.

Owner
Xinlong Wang
Xinlong Wang
Pytorch implementation of PCT: Point Cloud Transformer

PCT: Point Cloud Transformer This is a Pytorch implementation of PCT: Point Cloud Transformer.

Yi_Zhang 265 Dec 22, 2022
An end-to-end implementation of intent prediction with Metaflow and other cool tools

You Don't Need a Bigger Boat An end-to-end (Metaflow-based) implementation of an intent prediction flow for kids who can't MLOps good and wanna learn

Jacopo Tagliabue 614 Dec 31, 2022
Neural Geometric Level of Detail: Real-time Rendering with Implicit 3D Shapes (CVPR 2021 Oral)

Neural Geometric Level of Detail: Real-time Rendering with Implicit 3D Surfaces Official code release for NGLOD. For technical details, please refer t

659 Dec 27, 2022
A Python package for faster, safer, and simpler ML processes

Bender 🤖 A Python package for faster, safer, and simpler ML processes. Why use bender? Bender will make your machine learning processes, faster, safe

Otovo 6 Dec 13, 2022
The code for our paper submitted to RAL/IROS 2022: OverlapTransformer: An Efficient and Rotation-Invariant Transformer Network for LiDAR-Based Place Recognition.

OverlapTransformer The code for our paper submitted to RAL/IROS 2022: OverlapTransformer: An Efficient and Rotation-Invariant Transformer Network for

HAOMO.AI 136 Jan 03, 2023
PyTorch implementation of our Adam-NSCL algorithm from our CVPR2021 (oral) paper "Training Networks in Null Space for Continual Learning"

Adam-NSCL This is a PyTorch implementation of Adam-NSCL algorithm for continual learning from our CVPR2021 (oral) paper: Title: Training Networks in N

Shipeng Wang 34 Dec 21, 2022
Testing the Facial Emotion Recognition (FER) algorithm on animations

PegHeads-Tutorial-3 Testing the Facial Emotion Recognition (FER) algorithm on animations

PegHeads Inc 2 Jan 03, 2022
The full training script for Enformer (Tensorflow Sonnet) on TPU clusters

Enformer TPU training script (wip) The full training script for Enformer (Tensorflow Sonnet) on TPU clusters, in an effort to migrate the model to pyt

Phil Wang 10 Oct 19, 2022
Hand Gesture Volume Control | Open CV | Computer Vision

Gesture Volume Control Hand Gesture Volume Control | Open CV | Computer Vision Use gesture control to change the volume of a computer. First we look i

Jhenil Parihar 3 Jun 15, 2022
Augmenting Physical Models with Deep Networks for Complex Dynamics Forecasting

Official code of APHYNITY Augmenting Physical Models with Deep Networks for Complex Dynamics Forecasting (ICLR 2021, Oral) Yuan Yin*, Vincent Le Guen*

Yuan Yin 24 Oct 24, 2022
PyTorch implementation of "A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."

FullSubNet This Git repository for the official PyTorch implementation of "A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech E

郝翔 357 Jan 04, 2023
Official Implementation for the paper DeepFace-EMD: Re-ranking Using Patch-wise Earth Mover’s Distance Improves Out-Of-Distribution Face Identification

DeepFace-EMD: Re-ranking Using Patch-wise Earth Mover’s Distance Improves Out-Of-Distribution Face Identification Official Implementation for the pape

Anh M. Nguyen 36 Dec 28, 2022
[ICCV 2021 Oral] Mining Latent Classes for Few-shot Segmentation

Mining Latent Classes for Few-shot Segmentation Lihe Yang, Wei Zhuo, Lei Qi, Yinghuan Shi, Yang Gao. This codebase contains baseline of our paper Mini

Lihe Yang 66 Nov 29, 2022
Code for paper "Document-Level Argument Extraction by Conditional Generation". NAACL 21'

Argument Extraction by Generation Code for paper "Document-Level Argument Extraction by Conditional Generation". NAACL 21' Dependencies pytorch=1.6 tr

Zoey Li 87 Dec 26, 2022
Realtime segmentation with ENet, the fast and accurate segmentation net.

Enet This is a realtime segmentation net with almost 22 fps on GTX1080 ti, and the model size is very small with only 28M. This repo contains the infe

JinTian 14 Aug 30, 2022
[ICCV 2021] Group-aware Contrastive Regression for Action Quality Assessment

CoRe Created by Xumin Yu*, Yongming Rao*, Wenliang Zhao, Jiwen Lu, Jie Zhou This is the PyTorch implementation for ICCV paper Group-aware Contrastive

Xumin Yu 31 Dec 24, 2022
Source code for the paper "PLOME: Pre-training with Misspelled Knowledge for Chinese Spelling Correction" in ACL2021

PLOME:Pre-training with Misspelled Knowledge for Chinese Spelling Correction (ACL2021) This repository provides the code and data of the work in ACL20

197 Nov 26, 2022
PyTorch implementation of EigenGAN

PyTorch Implementation of EigenGAN Train python train.py [image_folder_path] --name [experiment name] Test python test.py [ckpt path] --traverse FFH

62 Nov 12, 2022
Repository for MuSiQue: Multi-hop Questions via Single-hop Question Composition

🎵 MuSiQue: Multi-hop Questions via Single-hop Question Composition This is the repository for our paper "MuSiQue: Multi-hop Questions via Single-hop

21 Jan 02, 2023
Agent-based model simulator for air quality and pandemic risk assessment in architectural spaces

Agent-based model simulation for air quality and pandemic risk assessment in architectural spaces. User Guide archABM is a fast and open source agent-

Vicomtech 10 Dec 05, 2022