Convert openmmlab (not only mmdetection) series model to tensorrt

Related tags

Deep Learningmm2trt
Overview

MMDet to TensorRT

This project aims to convert the mmdetection model to TensorRT model end2end. Focus on object detection for now. Mask support is experiment.

support:

  • fp16
  • int8(experiment)
  • batched input
  • dynamic input shape
  • combination of different modules
  • deepstream support

Any advices, bug reports and stars are welcome.

License

This project is released under the Apache 2.0 license.

Requirement

  • install mmdetection:

    # mim is so cool!
    pip install openmim
    mim install mmdet==2.14.0
  • install torch2trt_dynamic:

    git clone https://github.com/grimoire/torch2trt_dynamic.git torch2trt_dynamic
    cd torch2trt_dynamic
    python setup.py develop
  • install amirstan_plugin:

    • Install tensorrt: TensorRT

    • clone repo and build plugin

      git clone --depth=1 https://github.com/grimoire/amirstan_plugin.git
      cd amirstan_plugin
      git submodule update --init --progress --depth=1
      mkdir build
      cd build
      cmake -DTENSORRT_DIR=${TENSORRT_DIR} ..
      make -j10
    • DON'T FORGET setting the envoirment variable(in ~/.bashrc):

      export AMIRSTAN_LIBRARY_PATH=${amirstan_plugin_root}/build/lib

Installation

Host

git clone https://github.com/grimoire/mmdetection-to-tensorrt.git
cd mmdetection-to-tensorrt
python setup.py develop

Docker

Build docker image

# cuda11.1 TensorRT7.2.2 pytorch1.8 cuda11.1
sudo docker build -t mmdet2trt_docker:v1.0 docker/

You can also specify CUDA, Pytorch and Torchvision versions with docker build args by:

# cuda11.1 tensorrt7.2.2 pytorch1.6 cuda10.2
sudo docker build -t mmdet2trt_docker:v1.0 --build-arg TORCH_VERSION=1.6.0 --build-arg TORCHVISION_VERSION=0.7.0 --build-arg CUDA=10.2 --docker/

Run (will show the help for the CLI entrypoint)

sudo docker run --gpus all -it --rm -v ${your_data_path}:${bind_path} mmdet2trt_docker:v1.0

Or if you want to open a terminal inside de container:

sudo docker run --gpus all -it --rm -v ${your_data_path}:${bind_path} --entrypoint bash mmdet2trt_docker:v1.0

Example conversion:

sudo docker run --gpus all -it --rm -v ${your_data_path}:${bind_path} mmdet2trt_docker:v1.0 ${bind_path}/config.py ${bind_path}/checkpoint.pth ${bind_path}/output.trt

Usage

how to create a TensorRT model from mmdet model (converting might take few minutes)(Might have some warning when converting.) detail can be found in getting_started.md

CLI

mmdet2trt ${CONFIG_PATH} ${CHECKPOINT_PATH} ${OUTPUT_PATH}

Run mmdet2trt -h for help on optional arguments.

Python

opt_shape_param=[
    [
        [1,3,320,320],      # min shape
        [1,3,800,1344],     # optimize shape
        [1,3,1344,1344],    # max shape
    ]
]
max_workspace_size=1<<30    # some module and tactic need large workspace.
trt_model = mmdet2trt(cfg_path, weight_path, opt_shape_param=opt_shape_param, fp16_mode=True, max_workspace_size=max_workspace_size)

# save converted model
torch.save(trt_model.state_dict(), save_model_path)

# save engine if you want to use it in c++ api
with open(save_engine_path, mode='wb') as f:
    f.write(trt_model.state_dict()['engine'])

Note:

  • The input of the engine is the tensor after preprocess.
  • The output of the engine is num_dets, bboxes, scores, class_ids. if you enable the enable_mask flag, there will be another output mask.
  • The bboxes output of the engine did not divided by scale factor.

how to use the converted model

from mmdet.apis import inference_detector
from mmdet2trt.apis import create_wrap_detector

# create wrap detector
trt_detector = create_wrap_detector(trt_model, cfg_path, device_id)

# result share same format as mmdetection
result = inference_detector(trt_detector, image_path)

# visualize
trt_detector.show_result(
    image_path,
    result,
    score_thr=score_thr,
    win_name='mmdet2trt',
    show=True)

Try demo in demo/inference.py, or demo/cpp if you want to do inference with c++ api.

Read getting_started.md for more details.

How does it works?

Most other project use pytorch=>ONNX=>tensorRT route, This repo convert pytorch=>tensorRT directly, avoid unnecessary ONNX IR. Read how-does-it-work for detail.

Support Model/Module

  • Faster R-CNN
  • Cascade R-CNN
  • Double-Head R-CNN
  • Group Normalization
  • Weight Standardization
  • DCN
  • SSD
  • RetinaNet
  • Libra R-CNN
  • FCOS
  • Fovea
  • CARAFE
  • FreeAnchor
  • RepPoints
  • NAS-FPN
  • ATSS
  • PAFPN
  • FSAF
  • GCNet
  • Guided Anchoring
  • Generalized Attention
  • Dynamic R-CNN
  • Hybrid Task Cascade
  • DetectoRS
  • Side-Aware Boundary Localization
  • YOLOv3
  • PAA
  • CornerNet(WIP)
  • Generalized Focal Loss
  • Grid RCNN
  • VFNet
  • GROIE
  • Mask R-CNN(experiment)
  • Cascade Mask R-CNN(experiment)
  • Cascade RPN
  • DETR
  • YOLOX

Tested on:

  • torch=1.8.1
  • tensorrt=8.0.1.6
  • mmdetection=2.18.0
  • cuda=11.1

If you find any error, please report it in the issue.

FAQ

read this page if you meet any problem.

Contact

This repo is maintained by @grimoire

Discuss group: QQ:1107959378

And send your resume to my e-mail if you want to join @OpenMMLab. Please read the JD for detail: link

Owner
JinTian
You know who I am.
JinTian
PyTorch implementation of Octave Convolution with pre-trained Oct-ResNet and Oct-MobileNet models

octconv.pytorch PyTorch implementation of Octave Convolution in Drop an Octave: Reducing Spatial Redundancy in Convolutional Neural Networks with Octa

Duo Li 273 Dec 18, 2022
Implementation of CVPR 2021 paper "Spatially-invariant Style-codes Controlled Makeup Transfer"

SCGAN Implementation of CVPR 2021 paper "Spatially-invariant Style-codes Controlled Makeup Transfer" Prepare The pre-trained model is avaiable at http

118 Dec 12, 2022
[NeurIPS'20] Multiscale Deep Equilibrium Models

Multiscale Deep Equilibrium Models 💥 💥 💥 💥 This repo is deprecated and we will soon stop actively maintaining it, as a more up-to-date (and simple

CMU Locus Lab 221 Dec 26, 2022
😮The official implementation of "CoNeRF: Controllable Neural Radiance Fields" 😮

CoNeRF: Controllable Neural Radiance Fields This is the official implementation for "CoNeRF: Controllable Neural Radiance Fields" Project Page Paper V

Kacper Kania 61 Dec 24, 2022
Pytorch implementation of RED-SDS (NeurIPS 2021).

Recurrent Explicit Duration Switching Dynamical Systems (RED-SDS) This repository contains a reference implementation of RED-SDS, a non-linear state s

Abdul Fatir 10 Dec 02, 2022
Non-Metric Space Library (NMSLIB): An efficient similarity search library and a toolkit for evaluation of k-NN methods for generic non-metric spaces.

Non-Metric Space Library (NMSLIB) Important Notes NMSLIB is generic but fast, see the results of ANN benchmarks. A standalone implementation of our fa

2.9k Jan 04, 2023
Sequential GCN for Active Learning

Sequential GCN for Active Learning Please cite if using the code: Link to paper. Requirements: python 3.6+ torch 1.0+ pip libraries: tqdm, sklearn, sc

45 Dec 26, 2022
Implementation of the 😇 Attention layer from the paper, Scaling Local Self-Attention For Parameter Efficient Visual Backbones

HaloNet - Pytorch Implementation of the Attention layer from the paper, Scaling Local Self-Attention For Parameter Efficient Visual Backbones. This re

Phil Wang 189 Nov 22, 2022
StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks

StackGAN Pytorch implementation Inception score evaluation StackGAN-v2-pytorch Tensorflow implementation for reproducing main results in the paper Sta

Han Zhang 1.8k Dec 21, 2022
Paddle implementation for "Highly Efficient Knowledge Graph Embedding Learning with Closed-Form Orthogonal Procrustes Analysis" (NAACL 2021)

ProcrustEs-KGE Paddle implementation for Highly Efficient Knowledge Graph Embedding Learning with Orthogonal Procrustes Analysis 🙈 A more detailed re

Lincedo Lab 4 Jun 09, 2021
CFNet: Cascade and Fused Cost Volume for Robust Stereo Matching(CVPR2021)

CFNet(CVPR 2021) This is the implementation of the paper CFNet: Cascade and Fused Cost Volume for Robust Stereo Matching, CVPR 2021, Zhelun Shen, Yuch

106 Dec 28, 2022
DeepLM: Large-scale Nonlinear Least Squares on Deep Learning Frameworks using Stochastic Domain Decomposition (CVPR 2021)

DeepLM DeepLM: Large-scale Nonlinear Least Squares on Deep Learning Frameworks using Stochastic Domain Decomposition (CVPR 2021) Run Please install th

Jingwei Huang 130 Dec 02, 2022
Customizable RecSys Simulator for OpenAI Gym

gym-recsys: Customizable RecSys Simulator for OpenAI Gym Installation | How to use | Examples | Citation This package describes an OpenAI Gym interfac

Xingdong Zuo 14 Dec 08, 2022
"NAS-Bench-301 and the Case for Surrogate Benchmarks for Neural Architecture Search".

NAS-Bench-301 This repository containts code for the paper: "NAS-Bench-301 and the Case for Surrogate Benchmarks for Neural Architecture Search". The

AutoML-Freiburg-Hannover 57 Nov 30, 2022
Neural Scene Flow Fields for Space-Time View Synthesis of Dynamic Scenes

Neural Scene Flow Fields PyTorch implementation of paper "Neural Scene Flow Fields for Space-Time View Synthesis of Dynamic Scenes", CVPR 2021 [Projec

Zhengqi Li 583 Dec 30, 2022
Anchor Retouching via Model Interaction for Robust Object Detection in Aerial Images

Anchor Retouching via Model Interaction for Robust Object Detection in Aerial Images In this paper, we present an effective Dynamic Enhancement Anchor

13 Dec 09, 2022
Experiments with differentiable stacks and queues in PyTorch

Please use stacknn-core instead! StackNN This project implements differentiable stacks and queues in PyTorch. The data structures are implemented in s

Will Merrill 141 Oct 06, 2022
FusionNet: A deep fully residual convolutional neural network for image segmentation in connectomics

FusionNet_Pytorch FusionNet: A deep fully residual convolutional neural network for image segmentation in connectomics Requirements Pytorch 0.1.11 Pyt

Choi Gunho 102 Dec 13, 2022
Predicting Semantic Map Representations from Images with Pyramid Occupancy Networks

This is the code associated with the paper Predicting Semantic Map Representations from Images with Pyramid Occupancy Networks, published at CVPR 2020.

Thomas Roddick 219 Dec 20, 2022
Qt-GUI implementation of the YOLOv5 algorithm (ver.6 and ver.5)

YOLOv5-GUI 🎉 YOLOv5算法(ver.6及ver.5)的Qt-GUI实现 🎉 Qt-GUI implementation of the YOLOv5 algorithm (ver.6 and ver.5). 基于YOLOv5的v5版本和v6版本及Javacr大佬的UI逻辑进行编写

EricFang 12 Dec 28, 2022