A Simple and Versatile Framework for Object Detection and Instance Recognition

Overview

SimpleDet - A Simple and Versatile Framework for Object Detection and Instance Recognition

Major Features

  • FP16 training for memory saving and up to 2.5X acceleration
  • Highly scalable distributed training available out of the box
  • Full coverage of state-of-the-art models including FasterRCNN, MaskRCNN, CascadeRCNN, RetinaNet, DCNv1/v2, TridentNet, NASFPN, EfficientNet, and Knowledge Distillation
  • Extensive feature set including large batch BN, loss synchronization, automatic BN fusion, soft NMS, multi-scale train/test
  • Modular design for coding-free exploration of new experiment settings
  • Extensive documentation, including annotated configs and a Finetuning Guide

Recent Updates

  • Add RPN test (2019.05.28)
  • Add NASFPN (2019.06.04)
  • Add new ResNetV1b baselines from GluonCV (2019.06.07)
  • Add Cascade R-CNN with FPN backbone (2019.06.11)
  • Speed up FPN up to 70% (2019.06.16)
  • Update NASFPN to include larger models (2019.07.01)
  • Automatic BN fusion for fixed BN training, saving up to 50% GPU memory (2019.07.04)
  • Speed up MaskRCNN by 80% (2019.07.23)
  • Update MaskRCNN baselines (2019.07.25)
  • Add EfficientNet and DCN (2019.08.06)
  • Add python wheel for easy local installation (2019.08.20)
  • Add FitNet based Knowledge Distill (2019.08.27)
  • Add SE and train from scratch (2019.08.30)
  • Add a lot of docs (2019.09.03)
  • Add support for INT8 training (contributed by Xiaotao Chen & Jingqiu Zhou) (2019.10.24)
  • Add support for FCOS (contributed by Zhen Wei) (2019.11)
  • Add support for Mask Scoring RCNN (contributed by Zehui Chen) (2019.12)
  • Add support for RepPoints (contributed by Bo Ke) (2020.02)
  • Add support for FreeAnchor (2020.03)
  • Add support for Feature Pyramid Grids & PAFPN (2020.06)
  • Add support for CrowdHuman Dataset (2020.06)
  • Add support for Double Pred (2020.06)
  • Add support for SEPC (contributed by Qiaofei Li) (2020.07)

Setup

All-in-one Script

We provide a setup script that installs SimpleDet and prepares the COCO dataset. If you use this script, you can skip to the Quick Start.

Install

We provide a conda installation here for Debian/Ubuntu systems. To use pre-built Docker or Singularity images, please refer to INSTALL.md for more information.

# install dependency
sudo apt update && sudo apt install -y git wget make python3-dev libglib2.0-0 libsm6 libxext6 libxrender-dev unzip

# create conda env
conda create -n simpledet python=3.7
conda activate simpledet

# fetch CUDA environment
conda install cudatoolkit=10.1

# install python dependency
pip install 'matplotlib<3.1' opencv-python pytz

# download and install pre-built wheel for CUDA 10.1
pip install https://1dv.aflat.top/mxnet_cu101-1.6.0b20191214-py2.py3-none-manylinux1_x86_64.whl

# install pycocotools
pip install 'git+https://github.com/RogerChern/cocoapi.git#subdirectory=PythonAPI'

# install mxnext, a wrapper around MXNet symbolic API
pip install 'git+https://github.com/RogerChern/mxnext#egg=mxnext'

# get simpledet
git clone https://github.com/tusimple/simpledet
cd simpledet
make

# test simpledet installation
mkdir -p experiments/faster_r50v1_fpn_1x
python detection_infer_speed.py --config config/faster_r50v1_fpn_1x.py --shape 800 1333

If the last command executes successfully, the average running speed of Faster R-CNN R-50 FPN will be reported, and you have successfully set up SimpleDet. You can now move on to the next section to prepare your dataset.

Preparing Data

We provide a step-by-step preparation guide for the COCO dataset below.

cd simpledet

# make data dir
mkdir -p data/coco/images data/src

# skip this if you have the zip files
wget -c http://images.cocodataset.org/zips/train2017.zip -O data/src/train2017.zip
wget -c http://images.cocodataset.org/zips/val2017.zip -O data/src/val2017.zip
wget -c http://images.cocodataset.org/zips/test2017.zip -O data/src/test2017.zip
wget -c http://images.cocodataset.org/annotations/annotations_trainval2017.zip -O data/src/annotations_trainval2017.zip
wget -c http://images.cocodataset.org/annotations/image_info_test2017.zip -O data/src/image_info_test2017.zip

unzip data/src/train2017.zip -d data/coco/images
unzip data/src/val2017.zip -d data/coco/images
unzip data/src/test2017.zip -d data/coco/images
unzip data/src/annotations_trainval2017.zip -d data/coco
unzip data/src/image_info_test2017.zip -d data/coco

python utils/create_coco_roidb.py --dataset coco --dataset-split train2017
python utils/create_coco_roidb.py --dataset coco --dataset-split val2017
python utils/create_coco_roidb.py --dataset coco --dataset-split test-dev2017
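
If one of the roidb commands fails, a missing download or unzip step is a likely cause. The following is a minimal sketch of a layout check, not part of SimpleDet: the helper name `missing_coco_paths` is hypothetical, and the file names are the ones the standard COCO 2017 archives are expected to unpack to under the directories used above.

```python
from pathlib import Path

# Paths the download and unzip steps above are expected to produce,
# relative to the simpledet checkout (assumed from the standard COCO 2017 zips).
EXPECTED_COCO_PATHS = [
    "data/coco/images/train2017",
    "data/coco/images/val2017",
    "data/coco/images/test2017",
    "data/coco/annotations/instances_train2017.json",
    "data/coco/annotations/instances_val2017.json",
    "data/coco/annotations/image_info_test-dev2017.json",
]


def missing_coco_paths(root="."):
    """Return the expected paths that do not exist under root (hypothetical helper)."""
    root = Path(root)
    return [p for p in EXPECTED_COCO_PATHS if not (root / p).exists()]


# Run from the simpledet directory; an empty list means the layout looks complete.
print(missing_coco_paths("."))
```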

For other datasets or your own data, please check DATASET.md for more details.

Quick Start

# train
python detection_train.py --config config/faster_r50v1_fpn_1x.py

# test
python detection_test.py --config config/faster_r50v1_fpn_1x.py

Finetune

Please check FINTUNE.md

Model Zoo

Please refer to MODEL_ZOO.md for available models

Distributed Training

Please refer to DISTRIBUTED.md

Project Organization

Code Structure

detection_train.py
detection_test.py
config/
    detection_config.py
core/
    detection_input.py
    detection_metric.py
    detection_module.py
models/
    FPN/
    tridentnet/
    maskrcnn/
    cascade_rcnn/
    retinanet/
mxnext/
symbol/
    builder.py

Config

Everything is configurable from the config file; all changes should be made out of source.

Experiments

Each experiment is a directory in the experiments folder with the same name as its config file.

For example, if r50_fixbn_1x.py is the name of a config file, the layout is:

config/
    r50_fixbn_1x.py
experiments/
    r50_fixbn_1x/
        checkpoint.params
        log.txt
        coco_minival2014_result.json
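
The naming convention can be expressed in a few lines. This is a minimal sketch of the mapping only; the function name `experiment_dir` is hypothetical and is not taken from the SimpleDet code base.

```python
from pathlib import Path


def experiment_dir(config_path, experiments_root="experiments"):
    """Map a config file to the experiment directory that shares its name."""
    # Path.stem drops the directory part and the ".py" suffix.
    return Path(experiments_root) / Path(config_path).stem


print(experiment_dir("config/r50_fixbn_1x.py").as_posix())  # experiments/r50_fixbn_1x
```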

Models

The models directory contains SOTA models implemented in SimpleDet.

How is Faster R-CNN built

Faster R-CNN

SimpleDet supports many popular detection methods. Here we take Faster R-CNN as a typical example to show how a detector is built.

  • Preprocessing. The preprocessing methods of the detector are implemented through DetectionAugmentation.
    • Image/bbox-related preprocessing, such as Norm2DImage and Resize2DImageBbox.
    • Anchor generator AnchorTarget2D, which generates anchors and corresponding anchor targets for training RPN.
  • Network Structure. The training and testing symbols of the Faster R-CNN detector are defined in FasterRcnn. The key components are listed as follows:
    • Backbone. Backbone provides interfaces to build backbone networks, e.g. ResNet and ResNeXt.
    • Neck. Neck provides interfaces to build complementary feature extraction layers for backbone networks, e.g. FPNNeck builds the top-down pathway for Feature Pyramid Network.
    • RPN head. RpnHead builds the classification and regression layers that generate proposal outputs for RPN. It also provides an interface to generate sampled proposals for the subsequent R-CNN.
    • RoI Extractor. RoiExtractor extracts features for each RoI (proposal) based on the R-CNN features generated by Backbone and Neck.
    • Bounding Box Head. BboxHead builds the R-CNN layers for proposal refinement.
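
The data flow between these components can be sketched with toy stand-ins. This is a simplified illustration only: the `Stage` class and `faster_rcnn_forward` function are hypothetical and do not reproduce SimpleDet's actual classes or signatures. It assumes a single feature path from Backbone through Neck, and it only shows the order in which the pieces are wired together.

```python
class Stage:
    """Toy stand-in for a detector component; it only records that it ran."""

    def __init__(self, name, trace):
        self.name = name
        self.trace = trace

    def __call__(self, *inputs):
        self.trace.append(self.name)
        return f"{self.name}_out"


def faster_rcnn_forward(image, backbone, neck, rpn_head, roi_extractor, bbox_head):
    """Wire the components in the order described in the list above."""
    feat = neck(backbone(image))               # backbone features refined by the neck
    proposals = rpn_head(feat)                 # RPN proposes candidate regions
    roi_feat = roi_extractor(feat, proposals)  # one feature per proposal
    return bbox_head(roi_feat)                 # R-CNN refinement of each proposal


trace = []
names = ["Backbone", "Neck", "RpnHead", "RoiExtractor", "BboxHead"]
stages = [Stage(n, trace) for n in names]
result = faster_rcnn_forward("image", *stages)
print(" -> ".join(trace))
```

Because each component is passed in as an argument, replacing one of them (for example a different neck) does not require touching the others, which is the property the modular design relies on.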

How to build a custom detector

The flexibility of the SimpleDet framework makes it easy to build different detectors. We take TridentNet as an example to demonstrate how to build a custom detector on top of the Faster R-CNN framework.

  • Preprocessing. Additional preprocessing methods can be provided by inheriting from DetectionAugmentation.
    • In TridentNet, a new TridentAnchorTarget2D is implemented to generate anchors for multiple branches and filter anchors for the scale-aware training scheme.
  • Network Structure. A new network structure can be constructed for a custom detector by modifying only the components that need to change.
    • For TridentNet, we build trident blocks in the Backbone according to the descriptions in the paper. We also provide a TridentRpnHead to generate filtered proposals in RPN to implement the scale-aware scheme. Other components are shared with the original Faster R-CNN.
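
The scale-aware filtering mentioned above can be illustrated independently of the framework. The following is a minimal sketch, assuming a box is kept for a branch when the square root of its area falls inside that branch's valid range. The function names and the range values are illustrative assumptions, not SimpleDet's TridentAnchorTarget2D or TridentRpnHead implementation.

```python
import math


def in_valid_range(box, valid_range):
    """box = (x1, y1, x2, y2); keep it if sqrt(width * height) lies in [lo, hi]."""
    x1, y1, x2, y2 = box
    scale = math.sqrt(max(x2 - x1, 0) * max(y2 - y1, 0))
    lo, hi = valid_range
    return lo <= scale <= hi


def filter_by_branch(boxes, valid_ranges):
    """Return one list per branch with the boxes valid for that branch."""
    return [[b for b in boxes if in_valid_range(b, r)] for r in valid_ranges]


# Illustrative ranges for three branches (small, medium, large objects).
valid_ranges = [(0, 90), (30, 160), (90, math.inf)]
boxes = [(0, 0, 20, 20), (0, 0, 100, 100), (0, 0, 400, 400)]

for branch, kept in enumerate(filter_by_branch(boxes, valid_ranges)):
    print(branch, kept)
```

The ranges overlap on purpose in this sketch, so a medium-sized box can be assigned to more than one branch.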

Contributors

Yuntao Chen, Chenxia Han, Yanghao Li, Zehao Huang, Naiyan Wang, Xiaotao Chen, Jingqiu Zhou, Zhen Wei, Zehui Chen, Zhaoxiang Zhang, Bo Ke

License and Citation

This project is released under the Apache 2.0 license for non-commercial usage. For commercial usage, please contact us for another license.

If you find our project helpful, please consider citing our tech report.

@article{JMLR:v20:19-205,
  author  = {Yuntao Chen and Chenxia Han and Yanghao Li and Zehao Huang and Yi Jiang and Naiyan Wang and Zhaoxiang Zhang},
  title   = {SimpleDet: A Simple and Versatile Distributed Framework for Object Detection and Instance Recognition},
  journal = {Journal of Machine Learning Research},
  year    = {2019},
  volume  = {20},
  number  = {156},
  pages   = {1-8},
  url     = {http://jmlr.org/papers/v20/19-205.html}
}