A Simple and Versatile Framework for Object Detection and Instance Recognition

Overview

SimpleDet - A Simple and Versatile Framework for Object Detection and Instance Recognition

Major Features

  • FP16 training for memory saving and up to 2.5X acceleration
  • Highly scalable distributed training available out of box
  • Full coverage of state-of-the-art models including FasterRCNN, MaskRCNN, CascadeRCNN, RetinaNet, DCNv1/v2, TridentNet, NASFPN, EfficientNet, and Knowledge Distillation
  • Extensive feature set including large batch BN, loss synchronization, automatic BN fusion, soft NMS, multi-scale train/test
  • Modular design for coding-free exploration of new experiment settings
  • Extensive documentation, including an annotated config and a Finetuning Guide

Recent Updates

  • Add RPN test (2019.05.28)
  • Add NASFPN (2019.06.04)
  • Add new ResNetV1b baselines from GluonCV (2019.06.07)
  • Add Cascade R-CNN with FPN backbone (2019.06.11)
  • Speed up FPN up to 70% (2019.06.16)
  • Update NASFPN to include larger models (2019.07.01)
  • Automatic BN fusion for fixed BN training, saving up to 50% GPU memory (2019.07.04)
  • Speed up MaskRCNN by 80% (2019.07.23)
  • Update MaskRCNN baselines (2019.07.25)
  • Add EfficientNet and DCN (2019.08.06)
  • Add python wheel for easy local installation (2019.08.20)
  • Add FitNet based Knowledge Distill (2019.08.27)
  • Add SE and train from scratch (2019.08.30)
  • Add a lot of docs (2019.09.03)
  • Add support for INT8 training (contributed by Xiaotao Chen & Jingqiu Zhou) (2019.10.24)
  • Add support for FCOS (contributed by Zhen Wei) (2019.11)
  • Add support for Mask Scoring RCNN (contributed by Zehui Chen) (2019.12)
  • Add support for RepPoints (contributed by Bo Ke) (2020.02)
  • Add support for FreeAnchor (2020.03)
  • Add support for Feature Pyramid Grids & PAFPN (2020.06)
  • Add support for CrowdHuman Dataset (2020.06)
  • Add support for Double Pred (2020.06)
  • Add support for SEPC (contributed by Qiaofei Li) (2020.07)

Setup

All-in-one Script

We provide a setup script for installing SimpleDet and preparing the COCO dataset. If you use this script, you can skip to the Quick Start.

Install

We provide a conda installation here for Debian/Ubuntu systems. To use pre-built Docker or Singularity images, please refer to INSTALL.md for more information.

# install dependency
sudo apt update && sudo apt install -y git wget make python3-dev libglib2.0-0 libsm6 libxext6 libxrender-dev unzip

# create conda env
conda create -n simpledet python=3.7
conda activate simpledet

# fetch CUDA environment
conda install cudatoolkit=10.1

# install python dependency
pip install 'matplotlib<3.1' opencv-python pytz

# download and install pre-built wheel for CUDA 10.1
pip install https://1dv.aflat.top/mxnet_cu101-1.6.0b20191214-py2.py3-none-manylinux1_x86_64.whl

# install pycocotools
pip install 'git+https://github.com/RogerChern/cocoapi.git#subdirectory=PythonAPI'

# install mxnext, a wrapper around MXNet symbolic API
pip install 'git+https://github.com/RogerChern/mxnext#egg=mxnext'

# get simpledet
git clone https://github.com/tusimple/simpledet
cd simpledet
make

# test simpledet installation
mkdir -p experiments/faster_r50v1_fpn_1x
python detection_infer_speed.py --config config/faster_r50v1_fpn_1x.py --shape 800 1333

If the last line executes successfully, the average running speed of Faster R-CNN R-50 FPN will be reported, and you have successfully set up SimpleDet. You can now head to the next section to prepare your dataset.

Preparing Data

We provide a step-by-step preparation guide for the COCO dataset below.

cd simpledet

# make data dir
mkdir -p data/coco/images data/src

# skip this if you have the zip files
wget -c http://images.cocodataset.org/zips/train2017.zip -O data/src/train2017.zip
wget -c http://images.cocodataset.org/zips/val2017.zip -O data/src/val2017.zip
wget -c http://images.cocodataset.org/zips/test2017.zip -O data/src/test2017.zip
wget -c http://images.cocodataset.org/annotations/annotations_trainval2017.zip -O data/src/annotations_trainval2017.zip
wget -c http://images.cocodataset.org/annotations/image_info_test2017.zip -O data/src/image_info_test2017.zip

unzip data/src/train2017.zip -d data/coco/images
unzip data/src/val2017.zip -d data/coco/images
unzip data/src/test2017.zip -d data/coco/images
unzip data/src/annotations_trainval2017.zip -d data/coco
unzip data/src/image_info_test2017.zip -d data/coco

python utils/create_coco_roidb.py --dataset coco --dataset-split train2017
python utils/create_coco_roidb.py --dataset coco --dataset-split val2017
python utils/create_coco_roidb.py --dataset coco --dataset-split test-dev2017

For other datasets or your own data, please check DATASET.md for more details.
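Before building the roidb files, it can help to confirm that the archives unpacked where the commands above put them. The following is a minimal sketch, not part of SimpleDet: the helper name `missing_coco_paths` is hypothetical, and the sub-directory names (`train2017`, `val2017`, `test2017`, `annotations`) are an assumption based on the folder names inside the official COCO 2017 archives.

```python
import os

# Directories expected after the unzip commands above, relative to the
# simpledet checkout. The names are assumed from the layout of the
# official COCO 2017 archives.
EXPECTED_DIRS = (
    "data/coco/images/train2017",
    "data/coco/images/val2017",
    "data/coco/images/test2017",
    "data/coco/annotations",
)


def missing_coco_paths(root="."):
    """Return the expected COCO directories that do not exist under root."""
    return [p for p in EXPECTED_DIRS if not os.path.isdir(os.path.join(root, p))]


if __name__ == "__main__":
    missing = missing_coco_paths()
    if missing:
        print("missing:", ", ".join(missing))
    else:
        print("COCO layout looks complete")
```

Running the script from the simpledet directory lists any directory that still needs to be downloaded or unzipped.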

Quick Start

# train
python detection_train.py --config config/faster_r50v1_fpn_1x.py

# test
python detection_test.py --config config/faster_r50v1_fpn_1x.py

Finetune

Please check FINTUNE.md

Model Zoo

Please refer to MODEL_ZOO.md for available models

Distributed Training

Please refer to DISTRIBUTED.md

Project Organization

Code Structure

detection_train.py
detection_test.py
config/
    detection_config.py
core/
    detection_input.py
    detection_metric.py
    detection_module.py
models/
    FPN/
    tridentnet/
    maskrcnn/
    cascade_rcnn/
    retinanet/
mxnext/
symbol/
    builder.py

Config

Everything is configurable from the config file; all changes should be made out of source.

Experiments

One experiment is a directory in the experiments folder with the same name as the config file.

E.g. r50_fixbn_1x.py is the name of a config file

config/
    r50_fixbn_1x.py
experiments/
    r50_fixbn_1x/
        checkpoint.params
        log.txt
        coco_minival2014_result.json
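
The naming rule can be stated in a few lines of Python. This is a sketch of the convention only; `experiment_dir` is a hypothetical helper, not a SimpleDet function.

```python
import os


def experiment_dir(config_path, root="experiments"):
    """Map a config file path to its experiment directory.

    The directory carries the config file's name without the .py suffix.
    """
    name = os.path.splitext(os.path.basename(config_path))[0]
    return os.path.join(root, name)


print(experiment_dir("config/r50_fixbn_1x.py"))
```

Keeping the mapping purely name-based means a new experiment needs nothing more than a new config file.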

Models

The models directory contains SOTA models implemented in SimpleDet.

How is Faster R-CNN built

Faster R-CNN

SimpleDet supports many popular detection methods. Here we take Faster R-CNN as a typical example to show how a detector is built.

  • Preprocessing. The preprocessing methods of the detector are implemented through DetectionAugmentation.
    • Image/bbox-related preprocessing, such as Norm2DImage and Resize2DImageBbox.
    • Anchor generator AnchorTarget2D, which generates anchors and corresponding anchor targets for training RPN.
  • Network Structure. The training and testing symbols of the Faster R-CNN detector are defined in FasterRcnn. The key components are listed as follows:
    • Backbone. Backbone provides interfaces to build backbone networks, e.g. ResNet and ResNext.
    • Neck. Neck provides interfaces to build complementary feature extraction layers for backbone networks, e.g. FPNNeck builds the top-down pathway for Feature Pyramid Network.
    • RPN head. RpnHead builds the classification and regression layers that generate proposal outputs for RPN. It also provides an interface to generate sampled proposals for the subsequent R-CNN.
    • Roi Extractor. RoiExtractor extracts features for each roi (proposal) based on the R-CNN features generated by Backbone and Neck.
    • Bounding Box Head. BboxHead builds the R-CNN layers for proposal refinement.
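
The data flow between these components can be sketched in plain Python. This is a toy illustration, not SimpleDet's API: the component names mirror the list above, but every function signature and the string placeholders standing in for MXNet symbols are assumptions made for readability.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Detector:
    """Wires the five components in the order described above."""
    backbone: Callable       # image -> backbone features
    neck: Callable           # backbone features -> RPN / R-CNN features
    rpn_head: Callable       # features -> proposals
    roi_extractor: Callable  # (features, proposals) -> per-roi features
    bbox_head: Callable      # per-roi features -> refined boxes

    def forward(self, image):
        feat = self.backbone(image)
        feat = self.neck(feat)
        proposals = self.rpn_head(feat)
        roi_feat = self.roi_extractor(feat, proposals)
        return self.bbox_head(roi_feat)


# Placeholder components: each one wraps its input in a label so the
# resulting string records the order in which the stages ran.
faster_rcnn = Detector(
    backbone=lambda x: f"ResNet({x})",
    neck=lambda f: f"FPNNeck({f})",
    rpn_head=lambda f: f"RpnHead({f})",
    roi_extractor=lambda f, p: f"RoiExtractor({f}, {p})",
    bbox_head=lambda r: f"BboxHead({r})",
)

print(faster_rcnn.forward("image"))
```

Because each stage is passed in as a separate argument, replacing one component leaves the rest of the wiring untouched.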

How to build a custom detector

The flexibility of the SimpleDet framework makes it easy to build different detectors. We take TridentNet as an example to demonstrate how to build a custom detector based on the Faster R-CNN framework.

  • Preprocessing. Additional preprocessing methods can be provided by inheriting from DetectionAugmentation.
    • In TridentNet, a new TridentAnchorTarget2D is implemented to generate anchors for multiple branches and filter anchors for the scale-aware training scheme.
  • Network Structure. A new network structure can be constructed for a custom detector by modifying only the components that need to change.
    • For TridentNet, we build trident blocks in the Backbone according to the descriptions in the paper. We also provide a TridentRpnHead to generate filtered proposals in RPN to implement the scale-aware scheme. The other components are shared with the original Faster R-CNN.
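
The scale-aware idea behind TridentRpnHead can be illustrated without MXNet. The sketch below is a toy under stated assumptions, not the SimpleDet implementation: `box_scale` and `filter_by_scale` are hypothetical helpers, boxes are plain `(x1, y1, x2, y2)` tuples, the scale of a box is assumed to be the square root of its area, and the valid ranges are placeholder values chosen for the example. For each branch it keeps only the boxes whose scale falls inside that branch's range.

```python
import math


def box_scale(box):
    """Scale of a box (x1, y1, x2, y2): square root of its area."""
    x1, y1, x2, y2 = box
    return math.sqrt(max(x2 - x1, 0.0) * max(y2 - y1, 0.0))


def filter_by_scale(boxes, valid_range):
    """Keep boxes whose scale lies within [low, high] (inclusive)."""
    low, high = valid_range
    return [b for b in boxes if low <= box_scale(b) <= high]


# Placeholder ranges, one per branch, chosen for illustration only.
branch_ranges = [(0, 90), (30, 160), (90, math.inf)]
boxes = [(0, 0, 20, 20), (0, 0, 100, 100), (0, 0, 400, 400)]

for i, rng in enumerate(branch_ranges):
    print(i, filter_by_scale(boxes, rng))
```

With overlapping ranges, a mid-sized box can be assigned to more than one branch, as the 100 x 100 box is here.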

Contributors

Yuntao Chen, Chenxia Han, Yanghao Li, Zehao Huang, Naiyan Wang, Xiaotao Chen, Jingqiu Zhou, Zhen Wei, Zehui Chen, Zhaoxiang Zhang, Bo Ke

License and Citation

This project is released under the Apache 2.0 license for non-commercial usage. For commercial usage, please contact us for another license.

If you find our project helpful, please consider citing our tech report.

@article{JMLR:v20:19-205,
  author  = {Yuntao Chen and Chenxia Han and Yanghao Li and Zehao Huang and Yi Jiang and Naiyan Wang and Zhaoxiang Zhang},
  title   = {SimpleDet: A Simple and Versatile Distributed Framework for Object Detection and Instance Recognition},
  journal = {Journal of Machine Learning Research},
  year    = {2019},
  volume  = {20},
  number  = {156},
  pages   = {1-8},
  url     = {http://jmlr.org/papers/v20/19-205.html}
}