OpenMMLab Model Deployment Toolset

Overview

Introduction

MMDeploy is an open-source deep learning model deployment toolset. It is a part of the OpenMMLab project.

Major features

  • Full support for OpenMMLab models

    We provide a unified model deployment toolbox for the codebases in OpenMMLab. The supported codebases are listed below, and more will be added in the future:

    • MMClassification
    • MMDetection
    • MMSegmentation
    • MMEditing
    • MMOCR
  • Multiple inference backends are available

    Models can be exported and run on different backends. The following backends are supported, and more are under consideration:

    • ONNX Runtime
    • TensorRT
    • PPLNN
    • ncnn
    • OpenVINO
  • Efficient and highly scalable SDK framework written in C/C++

    All modules in the SDK are extensible, such as Transform for image processing, Net for neural network inference, and Module for postprocessing.

License

This project is released under the Apache 2.0 license.

Installation

Please refer to build.md for installation.

Getting Started

Please see getting_started.md for the basic usage of MMDeploy. Additional tutorials are also provided.

Please refer to FAQ for frequently asked questions.

Benchmark and model zoo

Results and the list of supported models are available in the benchmark and model list.

Contributing

We appreciate all contributions to improve MMDeploy. Please refer to CONTRIBUTING.md for the contributing guidelines.

Acknowledgement

We would like to thank the OpenVINO team for their remarkable efforts to export MMDetection models to OpenVINO and to integrate OpenVINO into the MMDeploy backends.

Citation

If you find this project useful in your research, please consider citing:

@misc{mmdeploy,
    title={OpenMMLab's Model Deployment Toolbox.},
    author={MMDeploy Contributors},
    howpublished = {\url{https://github.com/open-mmlab/mmdeploy}},
    year={2021}
}

Projects in OpenMMLab

  • MMCV: OpenMMLab foundational library for computer vision.
  • MIM: MIM Installs OpenMMLab Packages.
  • MMClassification: OpenMMLab image classification toolbox and benchmark.
  • MMDetection: OpenMMLab detection toolbox and benchmark.
  • MMDetection3D: OpenMMLab's next-generation platform for general 3D object detection.
  • MMSegmentation: OpenMMLab semantic segmentation toolbox and benchmark.
  • MMAction2: OpenMMLab's next-generation action understanding toolbox and benchmark.
  • MMTracking: OpenMMLab video perception toolbox and benchmark.
  • MMPose: OpenMMLab pose estimation toolbox and benchmark.
  • MMEditing: OpenMMLab image and video editing toolbox.
  • MMOCR: A Comprehensive Toolbox for Text Detection, Recognition and Understanding.
  • MMGeneration: OpenMMLab image and video generative models toolbox.
  • MMFlow: OpenMMLab optical flow toolbox and benchmark.
  • MMFewShot: OpenMMLab FewShot Learning Toolbox and Benchmark.
  • MMHuman3D: OpenMMLab Human Pose and Shape Estimation Toolbox and Benchmark.
  • MMSelfSup: OpenMMLab self-supervised learning Toolbox and Benchmark.
  • MMRazor: OpenMMLab Model Compression Toolbox and Benchmark.
Comments
  • Score/confidence of predictions drops a lot after converting to a TRT engine

    I am using DyHead to train a detection model: https://github.com/open-mmlab/mmdetection/blob/master/configs/dyhead/atss_swin-l-p4-w12_fpn_dyhead_mstrain_2x_coco.py

    Using a GPU Docker container, the conversion to TensorRT with tools/deploy.py succeeds:

    python3 tools/deploy.py \
        /workdir/detection_onnx_static_1024x1024.py \
        /workdir/atss_swin-l-p4-w12_fpn_dyhead_mstrain_2x.py \
        /workdir/latest.pth \
        /workdir/test-deploy-img-1024.jpg \
        --device cuda \
        --dump-info

    The conversion does, however, emit many warnings like the ones below:

    /root/workspace/mmdeploy/mmdeploy/core/optimizers/function_marker.py:158: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
      ys_shape = tuple(int(s) for s in ys.shape)
    /root/workspace/mmdeploy/mmdeploy/codebase/mmdet/models/detectors/base.py:24: TracerWarning: Iterating over a tensor might cause the trace to be incorrect. Passing a tensor of different shape won't change the number of iterations executed (and might lead to errors or silently give incorrect results).
      img_shape = [int(val) for val in img_shape]
    /root/workspace/mmdeploy/mmdeploy/codebase/mmdet/models/backbones.py:202: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
      slice_w = (W + self.window_size - 1) // self.window_size * self.window_size
    WARNING: The shape inference of mmdeploy::MMCVModulatedDeformConv2d type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
    WARNING: The shape inference of mmdeploy::TRTInstanceNormalization type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
    [09/01/2022-01:33:56] [TRT] [I] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +809, GPU +348, now: CPU 3460, GPU 785 (MiB)
    [09/01/2022-01:33:56] [TRT] [I] [MemUsageChange] Init cuDNN: CPU +126, GPU +60, now: CPU 3586, GPU 845 (MiB)
    [09/01/2022-01:33:56] [TRT] [W] TensorRT was linked against cuDNN 8.4.1 but loaded cuDNN 8.3.2
    [09/01/2022-01:33:56] [TRT] [I] Local timing cache in use. Profiling results in this builder pass will not be stored.
    [09/01/2022-01:35:06] [TRT] [I] Some tactics do not have sufficient workspace memory to run. Increasing workspace size will enable more tactics, please check verbose output for requested sizes.
    [09/01/2022-01:36:46] [TRT] [I] Detected 1 inputs and 2 output network tensors.
    [09/01/2022-01:36:48] [TRT] [I] Total Host Persistent Memory: 500480
    [09/01/2022-01:36:48] [TRT] [I] Total Device Persistent Memory: 393728
    [09/01/2022-01:36:48] [TRT] [I] Total Scratch Memory: 1918222336
    [09/01/2022-01:36:48] [TRT] [I] [MemUsageStats] Peak memory usage of TRT CPU/GPU memory allocators: CPU 0 MiB, GPU 0 MiB
    [09/01/2022-01:36:53] [TRT] [I] [BlockAssignment] Algorithm ShiftNTopDown took 5282.29ms to assign 35 blocks to 1075 nodes requiring 2039239680 bytes.
    [09/01/2022-01:36:53] [TRT] [I] Total Activation Memory: 2039239680
    [09/01/2022-01:36:53] [TRT] [I] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 5632, GPU 2311 (MiB)
    [09/01/2022-01:36:53] [TRT] [I] [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 5632, GPU 2319 (MiB)
    [09/01/2022-01:36:53] [TRT] [W] TensorRT was linked against cuDNN 8.4.1 but loaded cuDNN 8.3.2
    [09/01/2022-01:36:53] [TRT] [I] [MemUsageChange] TensorRT-managed allocation in building engine: CPU +0, GPU +0, now: CPU 0, GPU 0 (MiB)
    [09/01/2022-01:36:53] [TRT] [W] The getMaxBatchSize() function should not be used with an engine built from a network created with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag. This function will always return 1.
    [09/01/2022-01:36:53] [TRT] [W] The getMaxBatchSize() function should not be used with an engine built from a network created with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag. This function will always return 1.
    2022-09-01 01:36:54,590 - mmdeploy - INFO - Finish pipeline mmdeploy.backend.tensorrt.onnx2tensorrt.onnx2tensorrt
    2022-09-01 01:36:55,388 - mmdeploy - WARNING - "visualize_model" has been skipped may be because it's             running on a headless device.
    2022-09-01 01:36:55,388 - mmdeploy - INFO - All process success.
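
    The __floordiv__ warning in the log above is about how integer division rounds negative values. The following is a minimal sketch in plain Python (no PyTorch involved) of that difference. The helper names pad_to_window, trunc_div and floor_div are hypothetical, and the window size of 12 is an assumption read off the "w12" part of the config name. pad_to_window only mirrors the slice_w expression quoted in the warning.

```python
import math


def pad_to_window(size: int, window_size: int) -> int:
    """Round size up to the next multiple of window_size.

    Mirrors the expression quoted in the warning:
    (W + self.window_size - 1) // self.window_size * self.window_size
    """
    return (size + window_size - 1) // window_size * window_size


def trunc_div(a: int, b: int) -> int:
    """Integer division that rounds toward zero ('trunc' rounding)."""
    return math.trunc(a / b)


def floor_div(a: int, b: int) -> int:
    """Integer division that rounds toward negative infinity ('floor' rounding)."""
    return a // b


if __name__ == "__main__":
    # Window size 12 is an assumption, not a value confirmed by the report.
    print("padded width:", pad_to_window(1024, 12))
    # Non-negative operands: both rounding modes give the same quotient.
    print("non-negative:", trunc_div(1035, 12), floor_div(1035, 12))
    # Negative operands: the two rounding modes differ.
    print("negative:", trunc_div(-7, 2), floor_div(-7, 2))
```

    Both rounding modes agree for non-negative operands, and image and window sizes are non-negative, so this particular warning seems unlikely to account for a change in scores on its own.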
    

    I know the results are not exactly the same after conversion, so I am okay with some difference in the bounding box values (although they are off by quite a bit too), but the score drops far too much. Below is the TRT engine result (x1, y1, x2, y2, score):

    [8.8506012, 358.41714, 149.80162, 495.56137, 0.081301391]

    And below is the original prediction from mmdet, with the bbox values rounded down:

    [0, 328, 165, 526, 0.53286]
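
    To put numbers on the discrepancy, the two detections quoted above can be compared by box overlap (IoU) and by score difference. This is only a sketch for illustration: the helpers box_area and iou are hypothetical and not part of MMDeploy or MMDetection, and the two detections are copied from the report.

```python
from typing import Sequence


def box_area(box: Sequence[float]) -> float:
    """Area of an (x1, y1, x2, y2) box; degenerate boxes have zero area."""
    width = max(0.0, box[2] - box[0])
    height = max(0.0, box[3] - box[1])
    return width * height


def iou(a: Sequence[float], b: Sequence[float]) -> float:
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    inter_box = (
        max(a[0], b[0]),
        max(a[1], b[1]),
        min(a[2], b[2]),
        min(a[3], b[3]),
    )
    inter = box_area(inter_box)
    union = box_area(a) + box_area(b) - inter
    return inter / union if union > 0 else 0.0


# Detections copied from the report: (x1, y1, x2, y2, score).
trt_det = [8.8506012, 358.41714, 149.80162, 495.56137, 0.081301391]
mmdet_det = [0, 328, 165, 526, 0.53286]

overlap = iou(trt_det[:4], mmdet_det[:4])
score_drop = mmdet_det[4] - trt_det[4]

if __name__ == "__main__":
    print(f"IoU between the two boxes: {overlap:.3f}")
    print(f"Score drop: {score_drop:.3f}")
```

    Running the same comparison over every detection in an image, rather than a single pair, would show whether the drop is systematic or limited to a few boxes.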

    Here is my environment, as reported by python3 tools/check_env.py:

    2022-09-01 02:08:19,488 - mmdeploy - INFO - 
    
    2022-09-01 02:08:19,489 - mmdeploy - INFO - **********Environmental information**********
    2022-09-01 02:08:19,681 - mmdeploy - INFO - sys.platform: linux
    2022-09-01 02:08:19,681 - mmdeploy - INFO - Python: 3.8.13 (default, Mar 28 2022, 11:38:47) [GCC 7.5.0]
    2022-09-01 02:08:19,681 - mmdeploy - INFO - CUDA available: True
    2022-09-01 02:08:19,681 - mmdeploy - INFO - GPU 0: NVIDIA RTX A4000
    2022-09-01 02:08:19,681 - mmdeploy - INFO - CUDA_HOME: /usr/local/cuda
    2022-09-01 02:08:19,681 - mmdeploy - INFO - NVCC: Cuda compilation tools, release 11.6, V11.6.124
    2022-09-01 02:08:19,681 - mmdeploy - INFO - GCC: gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
    2022-09-01 02:08:19,681 - mmdeploy - INFO - PyTorch: 1.12.0
    2022-09-01 02:08:19,681 - mmdeploy - INFO - PyTorch compiling details: PyTorch built with:
      - GCC 9.3
      - C++ Version: 201402
      - Intel(R) oneAPI Math Kernel Library Version 2022.0-Product Build 20211112 for Intel(R) 64 architecture applications
      - Intel(R) MKL-DNN v2.6.0 (Git Hash 52b5f107dd9cf10910aaa19cb47f3abf9b349815)
      - OpenMP 201511 (a.k.a. OpenMP 4.5)
      - LAPACK is enabled (usually provided by MKL)
      - NNPACK is enabled
      - CPU capability usage: AVX2
      - CUDA Runtime 11.6
      - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=compute_37
      - CuDNN 8.3.2  (built against CUDA 11.5)
      - Magma 2.6.1
      - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.6, CUDNN_VERSION=8.3.2, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.12.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=OFF, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, 
    
    2022-09-01 02:08:19,681 - mmdeploy - INFO - TorchVision: 0.13.0
    2022-09-01 02:08:19,681 - mmdeploy - INFO - OpenCV: 4.6.0
    2022-09-01 02:08:19,681 - mmdeploy - INFO - MMCV: 1.6.1
    2022-09-01 02:08:19,681 - mmdeploy - INFO - MMCV Compiler: GCC 9.3
    2022-09-01 02:08:19,681 - mmdeploy - INFO - MMCV CUDA Compiler: 11.6
    2022-09-01 02:08:19,681 - mmdeploy - INFO - MMDeploy: 0.7.0+21775ce
    2022-09-01 02:08:19,681 - mmdeploy - INFO - 
    
    2022-09-01 02:08:19,681 - mmdeploy - INFO - **********Backend information**********
    2022-09-01 02:08:20,066 - mmdeploy - INFO - onnxruntime: 1.8.1	ops_is_avaliable : True
    2022-09-01 02:08:20,091 - mmdeploy - INFO - tensorrt: 8.4.3.1	ops_is_avaliable : True
    2022-09-01 02:08:20,105 - mmdeploy - INFO - ncnn: None	ops_is_avaliable : False
    2022-09-01 02:08:20,106 - mmdeploy - INFO - pplnn_is_avaliable: False
    2022-09-01 02:08:20,107 - mmdeploy - INFO - openvino_is_avaliable: False
    2022-09-01 02:08:20,121 - mmdeploy - INFO - snpe_is_available: False
    2022-09-01 02:08:20,121 - mmdeploy - INFO - 
    
    2022-09-01 02:08:20,121 - mmdeploy - INFO - **********Codebase information**********
    2022-09-01 02:08:20,122 - mmdeploy - INFO - mmdet:	2.25.1
    2022-09-01 02:08:20,122 - mmdeploy - INFO - mmseg:	None
    2022-09-01 02:08:20,122 - mmdeploy - INFO - mmcls:	None
    2022-09-01 02:08:20,122 - mmdeploy - INFO - mmocr:	None
    2022-09-01 02:08:20,122 - mmdeploy - INFO - mmedit:	None
    2022-09-01 02:08:20,122 - mmdeploy - INFO - mmdet3d:	None
    2022-09-01 02:08:20,122 - mmdeploy - INFO - mmpose:	None
    2022-09-01 02:08:20,122 - mmdeploy - INFO - mmrotate:	None
    

    And here is my deploy config (/workdir/detection_onnx_static_1024x1024.py):

    codebase_config = dict(
        type='mmdet',
        task='ObjectDetection',
        model_type='end2end',
        post_processing=dict(
            score_threshold=0.05,
            confidence_threshold=0.005,  # for YOLOv3
            iou_threshold=0.5,
            max_output_boxes_per_class=200,
            pre_top_k=5000,
            keep_top_k=100,
            background_label_id=-1,
        )
    )
    
    onnx_config = dict(
        type='onnx',
        export_params=True,
        keep_initializers_as_inputs=False,
        opset_version=11,
        save_file='dyhead_swin_1024.onnx',
        input_names=['input'],
        output_names=['dets', 'labels'],
        input_shape=[1024, 1024],
        optimize=True
    )
    
    backend_config = dict(
            type='tensorrt',
            common_config=dict(fp16_mode=False, max_workspace_size=8 << 30),
            model_inputs=[
                dict(
                    input_shapes=dict(
                        input=dict(min_shape=[1, 3, 1024, 1024],
                            opt_shape=[1, 3, 1024, 1024],
                            max_shape=[1, 3, 1024, 1024]
                        )
                    )
                )
            ]
    )
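
    As a quick sanity check on the backend settings above, the following sketch re-declares the same backend_config and reads two facts out of it: the workspace limit in GiB (8 << 30 bytes) and whether every input uses a static shape (min, opt and max identical). The helper names workspace_gib and is_static are hypothetical and not part of MMDeploy.

```python
# Same values as the backend_config in the deploy config above.
backend_config = dict(
    type='tensorrt',
    common_config=dict(fp16_mode=False, max_workspace_size=8 << 30),
    model_inputs=[
        dict(
            input_shapes=dict(
                input=dict(
                    min_shape=[1, 3, 1024, 1024],
                    opt_shape=[1, 3, 1024, 1024],
                    max_shape=[1, 3, 1024, 1024],
                )
            )
        )
    ],
)


def workspace_gib(cfg: dict) -> float:
    """Convert max_workspace_size from bytes to GiB."""
    return cfg['common_config']['max_workspace_size'] / (1 << 30)


def is_static(cfg: dict) -> bool:
    """True if min, opt and max shapes are identical for every input."""
    for model_input in cfg['model_inputs']:
        for shapes in model_input['input_shapes'].values():
            if not (shapes['min_shape'] == shapes['opt_shape'] == shapes['max_shape']):
                return False
    return True


if __name__ == "__main__":
    print("workspace (GiB):", workspace_gib(backend_config))
    print("fp16 enabled:", backend_config['common_config']['fp16_mode'])
    print("static input shapes:", is_static(backend_config))
```

    With fp16_mode=False and fully static shapes, reduced precision and dynamic-shape profiles can be set aside as likely sources of the difference in this setup.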
    

    I also tried the default TensorRT version 8.2.x, without success.

    Could someone please help? Thanks a lot!!

    opened by tak-ho 37
  • convert TensorRT model failed

    Describe the bug

    Converting a PyTorch model to a TensorRT model fails when the conversion is invoked from code.

    The workflow is as follows: FastAPI receives the model conversion request and dispatches it to a huey queue. The huey task code is essentially the same as the deploy.py code.

    When I gave up on TensorRT and switched to ONNX, the conversion succeeded, but after instantiating the Detector the interface layer blocked there and did not proceed any further.

    There is no error message and no other feedback apart from a few config INFO logs.

    Reproduction

    1. What command or script did you run? I did not run a command to perform the model conversion; instead, I invoked the deploy code programmatically. In practice this is no different from running the command, so you can try to reproduce the problem with the following command:
    python tools/deploy.py \
        configs/mmdeploy/mmdet/detection/detection_tensorrt-fp16_dynamic-320x320-1344x1344.py \
        /static/work_dirs/bcccd9e0-41a1-408d-9dfa-f4e634e9608c/yolox_l_8x8_300e_coco.py \
        /static/work_dirs/bcccd9e0-41a1-408d-9dfa-f4e634e9608c/best_bbox_mAP_epoch_149.pth \
        /dataset/car-damage-coco/images/val/1.jpg \
        --work-dir /static/work_dirs/bcccd9e0-41a1-408d-9dfa-f4e634e9608c \
        --device cuda:0 \
        --log-level DEBUG
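
    Since the report describes launching the conversion from code rather than from a shell, here is a minimal sketch of assembling the same command line as an argument list in Python. The function name build_deploy_command is hypothetical. The positional arguments and flags are the ones used in the command above, and nothing here is claimed to be how the reporter's huey task is written.

```python
import shlex
from typing import List


def build_deploy_command(
    deploy_cfg: str,
    model_cfg: str,
    checkpoint: str,
    image: str,
    work_dir: str,
    device: str = "cuda:0",
    log_level: str = "DEBUG",
) -> List[str]:
    """Build the argv list for tools/deploy.py as used in the report."""
    return [
        "python", "tools/deploy.py",
        deploy_cfg, model_cfg, checkpoint, image,
        "--work-dir", work_dir,
        "--device", device,
        "--log-level", log_level,
    ]


# Paths copied from the report.
WORK_DIR = "/static/work_dirs/bcccd9e0-41a1-408d-9dfa-f4e634e9608c"
cmd = build_deploy_command(
    "configs/mmdeploy/mmdet/detection/"
    "detection_tensorrt-fp16_dynamic-320x320-1344x1344.py",
    f"{WORK_DIR}/yolox_l_8x8_300e_coco.py",
    f"{WORK_DIR}/best_bbox_mAP_epoch_149.pth",
    "/dataset/car-damage-coco/images/val/1.jpg",
    WORK_DIR,
)

if __name__ == "__main__":
    # shlex.join (Python 3.8+) renders the list as a copy-pastable shell line.
    print(shlex.join(cmd))
```

    Passing such a list to subprocess.run(cmd, check=True) would run the conversion in a separate process, which is one way to compare behaviour inside and outside the queue worker.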
    
    2. Did you make any modifications on the code or config? Did you understand what you have modified?

    Environment

    2022-07-21 09:23:24,423 - mmdeploy - INFO - 
    
    2022-07-21 09:23:24,424 - mmdeploy - INFO - **********Environmental information**********
    2022-07-21 09:23:32,096 - mmdeploy - INFO - sys.platform: win32
    2022-07-21 09:23:32,096 - mmdeploy - INFO - Python: 3.9.0 (default, Nov 15 2020, 08:30:55) [MSC v.1916 64 bit (AMD64)]
    2022-07-21 09:23:32,096 - mmdeploy - INFO - CUDA available: True
    2022-07-21 09:23:32,096 - mmdeploy - INFO - GPU 0: NVIDIA GeForce GTX 1060 6GB
    2022-07-21 09:23:32,096 - mmdeploy - INFO - CUDA_HOME: D:\CUDAToolkit
    2022-07-21 09:23:32,096 - mmdeploy - INFO - NVCC: Cuda compilation tools, release 11.3, V11.3.109
    2022-07-21 09:23:32,096 - mmdeploy - INFO - MSVC: Microsoft (R) C/C++ Optimizing Compiler Version 19.29.30145 for x64
    2022-07-21 09:23:32,097 - mmdeploy - INFO - GCC: n/a
    2022-07-21 09:23:32,097 - mmdeploy - INFO - PyTorch: 1.11.0+cu113
      - CPU capability usage: AVX2
      - CUDA Runtime 11.3
      - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=compute_37
      - CuDNN 8.2
      - Magma 2.5.4
      - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.3, CUDNN_VERSION=8.2.0, CXX_COMPILER=C:/actions-runner/_work/pytorch/pytorch/builder/windows/tmp_bin/sccache-cl.exe, CXX_FLAGS=/DWIN32 /D_WINDOWS /GR /EHsc /w /bigobj -DUSE_PTHREADPOOL -openmp:experimental -IC:/actions-runner/_work/pytorch/pytorch/builder/windows/mkl/include -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DUSE_FBGEMM -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.11.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=OFF, USE_MPI=OFF, USE_NCCL=OFF, USE_NNPACK=OFF, USE_OPENMP=ON, USE_ROCM=OFF,
    
    2022-07-21 09:23:32,098 - mmdeploy - INFO - TorchVision: 0.12.0+cu113
    2022-07-21 09:23:32,098 - mmdeploy - INFO - OpenCV: 4.6.0
    2022-07-21 09:23:32,098 - mmdeploy - INFO - MMCV: 1.6.0
    2022-07-21 09:23:32,099 - mmdeploy - INFO - MMCV Compiler: MSVC 192930140
    2022-07-21 09:23:32,099 - mmdeploy - INFO - MMCV CUDA Compiler: 11.3
    2022-07-21 09:23:32,099 - mmdeploy - INFO - MMDeploy: 0.6.0+1d6437c
    2022-07-21 09:23:32,099 - mmdeploy - INFO -
    
    2022-07-21 09:23:32,099 - mmdeploy - INFO - **********Backend information**********
    2022-07-21 09:23:36,070 - mmdeploy - INFO - onnxruntime: 1.11.1 ops_is_avaliable : True
    2022-07-21 09:23:36,168 - mmdeploy - INFO - tensorrt: 8.4.1.5   ops_is_avaliable : True
    2022-07-21 09:23:36,208 - mmdeploy - INFO - ncnn: None  ops_is_avaliable : False
    2022-07-21 09:23:36,210 - mmdeploy - INFO - pplnn_is_avaliable: False
    2022-07-21 09:23:36,212 - mmdeploy - INFO - openvino_is_avaliable: False
    2022-07-21 09:23:36,212 - mmdeploy - INFO - 
    
    2022-07-21 09:23:36,212 - mmdeploy - INFO - **********Codebase information**********
    2022-07-21 09:23:36,232 - mmdeploy - INFO - mmdet:      2.25.0
    2022-07-21 09:23:36,232 - mmdeploy - INFO - mmseg:      None
    2022-07-21 09:23:36,232 - mmdeploy - INFO - mmcls:      None
    2022-07-21 09:23:36,232 - mmdeploy - INFO - mmocr:      None
    2022-07-21 09:23:36,233 - mmdeploy - INFO - mmedit:     None
    2022-07-21 09:23:36,233 - mmdeploy - INFO - mmdet3d:    None
    2022-07-21 09:23:36,233 - mmdeploy - INFO - mmpose:     None
    2022-07-21 09:23:36,233 - mmdeploy - INFO - mmrotate:   None
    

    Error traceback

    This is the log from converting the PyTorch model to a TensorRT model:

    2022-07-21 09:57:35,423 - mmdeploy - INFO - Current task ID: bcccd9e0-41a1-408d-9dfa-f4e634e9608c
    Registry:{'input_size': (640, 640), 'random_size_range': (15, 25), 'random_size_interval': 10, 'backbone': {'type': 'CSPDarknet', 'deepen_factor': 1.0, 'widen_factor': 1.0}, 'neck': {'type': 'YOLOXPAFPN', 'in_channels': [256, 512, 1024], 'out_channels': 256, 'num_csp_blocks': 3}, 'bbox_head': {'type': 'YOLOXHead', 'num_classes': 5, 'in_channels': 256, 'feat_channels': 256}, 'train_cfg': None, 'test_cfg': {'score_thr':0.01, 'nms': {'type': 'nms', 'iou_threshold': 0.65}}}
    Registry:{'deepen_factor': 1.0, 'widen_factor': 1.0}
    Registry:{}
    [the line above appears 67 times in a row]
    Registry:{'in_channels': [256, 512, 1024], 'out_channels': 256, 'num_csp_blocks': 3}
    Registry:{}
    [the line above appears 43 times in a row]
    Registry:{'num_classes': 5, 'in_channels': 256, 'feat_channels': 256, 'train_cfg': None, 'test_cfg': {'score_thr': 0.01, 'nms': {'type': 'nms', 'iou_threshold': 0.65}}}
    Registry:{'use_sigmoid': True, 'reduction': 'sum', 'loss_weight': 1.0}
    Registry:{'mode': 'square', 'eps': 1e-16, 'reduction': 'sum', 'loss_weight': 5.0}
    Registry:{'use_sigmoid': True, 'reduction': 'sum', 'loss_weight': 1.0}
    Registry:{'reduction': 'sum', 'loss_weight': 1.0}
    Registry:{}
    [the line above appears 12 times in a row]
    load checkpoint from local path: D:\static\work_dirs\bcccd9e0-41a1-408d-9dfa-f4e634e9608c\best_bbox_mAP_epoch_149.pth
    The model and loaded state dict do not match exactly
    
    unexpected key in source state_dict: ema_backbone_stem_conv_conv_weight, ema_backbone_stem_conv_bn_weight, ema_backbone_stem_conv_bn_bias, ema_backbone_stem_conv_bn_running_mean, ema_backbone_stem_conv_bn_running_var, ema_backbone_stem_conv_bn_num_batches_tracked, ema_backbone_stage1_0_conv_weight, ema_backbone_stage1_0_bn_weight, ema_backbone_stage1_0_bn_bias, ema_backbone_stage1_0_bn_running_mean, ema_backbone_stage1
    _0_bn_running_var, ema_backbone_stage1_0_bn_num_batches_tracked, ema_backbone_stage1_1_main_conv_conv_weight, ema_backbone_stage1_1_main_conv_bn_weight, ema_backbone_stage1_1_main_conv_bn_bias, ema_backbone_stage1_1_main_conv_bn_running_mean, ema_backbone_stage1_1_main_conv_bn_running_var, ema_backbone_stage1_1_main_conv_bn_num_batches_tracked, ema_backbone_stage1_1_short_conv_conv_weight, ema_backbone_stage1_1_short_c
    onv_bn_weight, ema_backbone_stage1_1_short_conv_bn_bias, ema_backbone_stage1_1_short_conv_bn_running_mean, ema_backbone_stage1_1_short_conv_bn_running_var, ema_backbone_stage1_1_short_conv_bn_num_batches_tracked, ema_backbone_stage1_1_final_conv_conv_weight, ema_backbone_stage1_1_final_conv_bn_weight, ema_backbone_stage1_1_final_conv_bn_bias, ema_backbone_stage1_1_final_conv_bn_running_mean, ema_backbone_stage1_1_final
    _conv_bn_running_var, ema_backbone_stage1_1_final_conv_bn_num_batches_tracked, ema_backbone_stage1_1_blocks_0_conv1_conv_weight, ema_backbone_stage1_1_blocks_0_conv1_bn_weight, ema_backbone_stage1_1_blocks_0_conv1_bn_bias, ema_backbone_stage1_1_blocks_0_conv1_bn_running_mean, ema_backbone_stage1_1_blocks_0_conv1_bn_running_var, ema_backbone_stage1_1_blocks_0_conv1_bn_num_batches_tracked, ema_backbone_stage1_1_blocks_0_
    conv2_conv_weight, ema_backbone_stage1_1_blocks_0_conv2_bn_weight, ema_backbone_stage1_1_blocks_0_conv2_bn_bias, ema_backbone_stage1_1_blocks_0_conv2_bn_running_mean, ema_backbone_stage1_1_blocks_0_conv2_bn_running_var, ema_backbone_stage1_1_blocks_0_conv2_bn_num_batches_tracked, ema_backbone_stage1_1_blocks_1_conv1_conv_weight, ema_backbone_stage1_1_blocks_1_conv1_bn_weight, ema_backbone_stage1_1_blocks_1_conv1_bn_bia
    s, ema_backbone_stage1_1_blocks_1_conv1_bn_running_mean, ema_backbone_stage1_1_blocks_1_conv1_bn_running_var, ema_backbone_stage1_1_blocks_1_conv1_bn_num_batches_tracked, ema_backbone_stage1_1_blocks_1_conv2_conv_weight, ema_backbone_stage1_1_blocks_1_conv2_bn_weight, ema_backbone_stage1_1_blocks_1_conv2_bn_bias, ema_backbone_stage1_1_blocks_1_conv2_bn_running_mean, ema_backbone_stage1_1_blocks_1_conv2_bn_running_var, 
    ema_backbone_stage1_1_blocks_1_conv2_bn_num_batches_tracked, ema_backbone_stage1_1_blocks_2_conv1_conv_weight, ema_backbone_stage1_1_blocks_2_conv1_bn_weight, ema_backbone_stage1_1_blocks_2_conv1_bn_bias, ema_backbone_stage1_1_blocks_2_conv1_bn_running_mean, ema_backbone_stage1_1_blocks_2_conv1_bn_running_var, ema_backbone_stage1_1_blocks_2_conv1_bn_num_batches_tracked, ema_backbone_stage1_1_blocks_2_conv2_conv_weight,
     ema_backbone_stage1_1_blocks_2_conv2_bn_weight, ema_backbone_stage1_1_blocks_2_conv2_bn_bias, ema_backbone_stage1_1_blocks_2_conv2_bn_running_mean, ema_backbone_stage1_1_blocks_2_conv2_bn_running_var, ema_backbone_stage1_1_blocks_2_conv2_bn_num_batches_tracked, ema_backbone_stage2_0_conv_weight, ema_backbone_stage2_0_bn_weight, ema_backbone_stage2_0_bn_bias, ema_backbone_stage2_0_bn_running_mean, ema_backbone_stage2_0
    _bn_running_var, ema_backbone_stage2_0_bn_num_batches_tracked, ema_backbone_stage2_1_main_conv_conv_weight, ema_backbone_stage2_1_main_conv_bn_weight, ema_backbone_stage2_1_main_conv_bn_bias, ema_backbone_stage2_1_main_conv_bn_running_mean, ema_backbone_stage2_1_main_conv_bn_running_var, ema_backbone_stage2_1_main_conv_bn_num_batches_tracked, ema_backbone_stage2_1_short_conv_conv_weight, ema_backbone_stage2_1_short_con
    v_bn_weight, ema_backbone_stage2_1_short_conv_bn_bias, ema_backbone_stage2_1_short_conv_bn_running_mean, ema_backbone_stage2_1_short_conv_bn_running_var, ema_backbone_stage2_1_short_conv_bn_num_batches_tracked, ema_backbone_stage2_1_final_conv_conv_weight, ema_backbone_stage2_1_final_conv_bn_weight, ema_backbone_stage2_1_final_conv_bn_bias, ema_backbone_stage2_1_final_conv_bn_running_mean, ema_backbone_stage2_1_final_c
    onv_bn_running_var, ema_backbone_stage2_1_final_conv_bn_num_batches_tracked, ema_backbone_stage2_1_blocks_0_conv1_conv_weight, ema_backbone_stage2_1_blocks_0_conv1_bn_weight, ema_backbone_stage2_1_blocks_0_conv1_bn_bias, ema_backbone_stage2_1_blocks_0_conv1_bn_running_mean, ema_backbone_stage2_1_blocks_0_conv1_bn_running_var, ema_backbone_stage2_1_blocks_0_conv1_bn_num_batches_tracked, ema_backbone_stage2_1_blocks_0_co
    nv2_conv_weight, ema_backbone_stage2_1_blocks_0_conv2_bn_weight, ema_backbone_stage2_1_blocks_0_conv2_bn_bias, ema_backbone_stage2_1_blocks_0_conv2_bn_running_mean, ema_backbone_stage2_1_blocks_0_conv2_bn_running_var, ema_backbone_stage2_1_blocks_0_conv2_bn_num_batches_tracked, ema_backbone_stage2_1_blocks_1_conv1_conv_weight, ema_backbone_stage2_1_blocks_1_conv1_bn_weight, ema_backbone_stage2_1_blocks_1_conv1_bn_bias,
     ema_backbone_stage2_1_blocks_1_conv1_bn_running_mean, ema_backbone_stage2_1_blocks_1_conv1_bn_running_var, ema_backbone_stage2_1_blocks_1_conv1_bn_num_batches_tracked, ema_backbone_stage2_1_blocks_1_conv2_conv_weight, ema_backbone_stage2_1_blocks_1_conv2_bn_weight, ema_backbone_stage2_1_blocks_1_conv2_bn_bias, ema_backbone_stage2_1_blocks_1_conv2_bn_running_mean, ema_backbone_stage2_1_blocks_1_conv2_bn_running_var, em
    a_backbone_stage2_1_blocks_1_conv2_bn_num_batches_tracked, ema_backbone_stage2_1_blocks_2_conv1_conv_weight, ema_backbone_stage2_1_blocks_2_conv1_bn_weight, ema_backbone_stage2_1_blocks_2_conv1_bn_bias, ema_backbone_stage2_1_blocks_2_conv1_bn_running_mean, ema_backbone_stage2_1_blocks_2_conv1_bn_running_var, ema_backbone_stage2_1_blocks_2_conv1_bn_num_batches_tracked, ema_backbone_stage2_1_blocks_2_conv2_conv_weight, e
    ma_backbone_stage2_1_blocks_2_conv2_bn_weight, ema_backbone_stage2_1_blocks_2_conv2_bn_bias, ema_backbone_stage2_1_blocks_2_conv2_bn_running_mean, ema_backbone_stage2_1_blocks_2_conv2_bn_running_var, ema_backbone_stage2_1_blocks_2_conv2_bn_num_batches_tracked, ema_backbone_stage2_1_blocks_3_conv1_conv_weight, ema_backbone_stage2_1_blocks_3_conv1_bn_weight, ema_backbone_stage2_1_blocks_3_conv1_bn_bias, ema_backbone_stag
    e2_1_blocks_3_conv1_bn_running_mean, ema_backbone_stage2_1_blocks_3_conv1_bn_running_var, ema_backbone_stage2_1_blocks_3_conv1_bn_num_batches_tracked, ema_backbone_stage2_1_blocks_3_conv2_conv_weight, ema_backbone_stage2_1_blocks_3_conv2_bn_weight, ema_backbone_stage2_1_blocks_3_conv2_bn_bias, ema_backbone_stage2_1_blocks_3_conv2_bn_running_mean, ema_backbone_stage2_1_blocks_3_conv2_bn_running_var, ema_backbone_stage2_
    1_blocks_3_conv2_bn_num_batches_tracked, ema_backbone_stage2_1_blocks_4_conv1_conv_weight, ema_backbone_stage2_1_blocks_4_conv1_bn_weight, ema_backbone_stage2_1_blocks_4_conv1_bn_bias, ema_backbone_stage2_1_blocks_4_conv1_bn_running_mean, ema_backbone_stage2_1_blocks_4_conv1_bn_running_var, ema_backbone_stage2_1_blocks_4_conv1_bn_num_batches_tracked, ema_backbone_stage2_1_blocks_4_conv2_conv_weight, ema_backbone_stage2
    _1_blocks_4_conv2_bn_weight, ema_backbone_stage2_1_blocks_4_conv2_bn_bias, ema_backbone_stage2_1_blocks_4_conv2_bn_running_mean, ema_backbone_stage2_1_blocks_4_conv2_bn_running_var, ema_backbone_stage2_1_blocks_4_conv2_bn_num_batches_tracked, ema_backbone_stage2_1_blocks_5_conv1_conv_weight, ema_backbone_stage2_1_blocks_5_conv1_bn_weight, ema_backbone_stage2_1_blocks_5_conv1_bn_bias, ema_backbone_stage2_1_blocks_5_conv
    1_bn_running_mean, ema_backbone_stage2_1_blocks_5_conv1_bn_running_var, ema_backbone_stage2_1_blocks_5_conv1_bn_num_batches_tracked, ema_backbone_stage2_1_blocks_5_conv2_conv_weight, ema_backbone_stage2_1_blocks_5_conv2_bn_weight, ema_backbone_stage2_1_blocks_5_conv2_bn_bias, ema_backbone_stage2_1_blocks_5_conv2_bn_running_mean, ema_backbone_stage2_1_blocks_5_conv2_bn_running_var, ema_backbone_stage2_1_blocks_5_conv2_b
    n_num_batches_tracked, ema_backbone_stage2_1_blocks_6_conv1_conv_weight, ema_backbone_stage2_1_blocks_6_conv1_bn_weight, ema_backbone_stage2_1_blocks_6_conv1_bn_bias, ema_backbone_stage2_1_blocks_6_conv1_bn_running_mean, ema_backbone_stage2_1_blocks_6_conv1_bn_running_var, ema_backbone_stage2_1_blocks_6_conv1_bn_num_batches_tracked, ema_backbone_stage2_1_blocks_6_conv2_conv_weight, ema_backbone_stage2_1_blocks_6_conv2_
    bn_weight, ema_backbone_stage2_1_blocks_6_conv2_bn_bias, ema_backbone_stage2_1_blocks_6_conv2_bn_running_mean, ema_backbone_stage2_1_blocks_6_conv2_bn_running_var, ema_backbone_stage2_1_blocks_6_conv2_bn_num_batches_tracked, ema_backbone_stage2_1_blocks_7_conv1_conv_weight, ema_backbone_stage2_1_blocks_7_conv1_bn_weight, ema_backbone_stage2_1_blocks_7_conv1_bn_bias, ema_backbone_stage2_1_blocks_7_conv1_bn_running_mean,
     ema_backbone_stage2_1_blocks_7_conv1_bn_running_var, ema_backbone_stage2_1_blocks_7_conv1_bn_num_batches_tracked, ema_backbone_stage2_1_blocks_7_conv2_conv_weight, ema_backbone_stage2_1_blocks_7_conv2_bn_weight, ema_backbone_stage2_1_blocks_7_conv2_bn_bias, ema_backbone_stage2_1_blocks_7_conv2_bn_running_mean, ema_backbone_stage2_1_blocks_7_conv2_bn_running_var, ema_backbone_stage2_1_blocks_7_conv2_bn_num_batches_trac
    ked, ema_backbone_stage2_1_blocks_8_conv1_conv_weight, ema_backbone_stage2_1_blocks_8_conv1_bn_weight, ema_backbone_stage2_1_blocks_8_conv1_bn_bias, ema_backbone_stage2_1_blocks_8_conv1_bn_running_mean, ema_backbone_stage2_1_blocks_8_conv1_bn_running_var, ema_backbone_stage2_1_blocks_8_conv1_bn_num_batches_tracked, ema_backbone_stage2_1_blocks_8_conv2_conv_weight, ema_backbone_stage2_1_blocks_8_conv2_bn_weight, ema_bac
    kbone_stage2_1_blocks_8_conv2_bn_bias, ema_backbone_stage2_1_blocks_8_conv2_bn_running_mean, ema_backbone_stage2_1_blocks_8_conv2_bn_running_var, ema_backbone_stage2_1_blocks_8_conv2_bn_num_batches_tracked, ema_backbone_stage3_0_conv_weight, ema_backbone_stage3_0_bn_weight, ema_backbone_stage3_0_bn_bias, ema_backbone_stage3_0_bn_running_mean, ema_backbone_stage3_0_bn_running_var, ema_backbone_stage3_0_bn_num_batches_tr
    acked, ema_backbone_stage3_1_main_conv_conv_weight, ema_backbone_stage3_1_main_conv_bn_weight, ema_backbone_stage3_1_main_conv_bn_bias, ema_backbone_stage3_1_main_conv_bn_running_mean, ema_backbone_stage3_1_main_conv_bn_running_var, ema_backbone_stage3_1_main_conv_bn_num_batches_tracked, ema_backbone_stage3_1_short_conv_conv_weight, ema_backbone_stage3_1_short_conv_bn_weight, ema_backbone_stage3_1_short_conv_bn_bias, e
    ma_backbone_stage3_1_short_conv_bn_running_mean, ema_backbone_stage3_1_short_conv_bn_running_var, ema_backbone_stage3_1_short_conv_bn_num_batches_tracked, ema_backbone_stage3_1_final_conv_conv_weight, ema_backbone_stage3_1_final_conv_bn_weight, ema_backbone_stage3_1_final_conv_bn_bias, ema_backbone_stage3_1_final_conv_bn_running_mean, ema_backbone_stage3_1_final_conv_bn_running_var, ema_backbone_stage3_1_final_conv_bn_
    num_batches_tracked, ema_backbone_stage3_1_blocks_0_conv1_conv_weight, ema_backbone_stage3_1_blocks_0_conv1_bn_weight, ema_backbone_stage3_1_blocks_0_conv1_bn_bias, ema_backbone_stage3_1_blocks_0_conv1_bn_running_mean, ema_backbone_stage3_1_blocks_0_conv1_bn_running_var, ema_backbone_stage3_1_blocks_0_conv1_bn_num_batches_tracked, ema_backbone_stage3_1_blocks_0_conv2_conv_weight, ema_backbone_stage3_1_blocks_0_conv2_bn
    _weight, ema_backbone_stage3_1_blocks_0_conv2_bn_bias, ema_backbone_stage3_1_blocks_0_conv2_bn_running_mean, ema_backbone_stage3_1_blocks_0_conv2_bn_running_var, ema_backbone_stage3_1_blocks_0_conv2_bn_num_batches_tracked, ema_backbone_stage3_1_blocks_1_conv1_conv_weight, ema_backbone_stage3_1_blocks_1_conv1_bn_weight, ema_backbone_stage3_1_blocks_1_conv1_bn_bias, ema_backbone_stage3_1_blocks_1_conv1_bn_running_mean, e
    ma_backbone_stage3_1_blocks_1_conv1_bn_running_var, ema_backbone_stage3_1_blocks_1_conv1_bn_num_batches_tracked, ema_backbone_stage3_1_blocks_1_conv2_conv_weight, ema_backbone_stage3_1_blocks_1_conv2_bn_weight, ema_backbone_stage3_1_blocks_1_conv2_bn_bias, ema_backbone_stage3_1_blocks_1_conv2_bn_running_mean, ema_backbone_stage3_1_blocks_1_conv2_bn_running_var, ema_backbone_stage3_1_blocks_1_conv2_bn_num_batches_tracke
    d, ema_backbone_stage3_1_blocks_2_conv1_conv_weight, ema_backbone_stage3_1_blocks_2_conv1_bn_weight, ema_backbone_stage3_1_blocks_2_conv1_bn_bias, ema_backbone_stage3_1_blocks_2_conv1_bn_running_mean, ema_backbone_stage3_1_blocks_2_conv1_bn_running_var, ema_backbone_stage3_1_blocks_2_conv1_bn_num_batches_tracked, ema_backbone_stage3_1_blocks_2_conv2_conv_weight, ema_backbone_stage3_1_blocks_2_conv2_bn_weight, ema_backb
    one_stage3_1_blocks_2_conv2_bn_bias, ema_backbone_stage3_1_blocks_2_conv2_bn_running_mean, ema_backbone_stage3_1_blocks_2_conv2_bn_running_var, ema_backbone_stage3_1_blocks_2_conv2_bn_num_batches_tracked, ema_backbone_stage3_1_blocks_3_conv1_conv_weight, ema_backbone_stage3_1_blocks_3_conv1_bn_weight, ema_backbone_stage3_1_blocks_3_conv1_bn_bias, ema_backbone_stage3_1_blocks_3_conv1_bn_running_mean, ema_backbone_stage3
    _1_blocks_3_conv1_bn_running_var, ema_backbone_stage3_1_blocks_3_conv1_bn_num_batches_tracked, ema_backbone_stage3_1_blocks_3_conv2_conv_weight, ema_backbone_stage3_1_blocks_3_conv2_bn_weight, ema_backbone_stage3_1_blocks_3_conv2_bn_bias, ema_backbone_stage3_1_blocks_3_conv2_bn_running_mean, ema_backbone_stage3_1_blocks_3_conv2_bn_running_var, ema_backbone_stage3_1_blocks_3_conv2_bn_num_batches_tracked, ema_backbone_st
    age3_1_blocks_4_conv1_conv_weight, ema_backbone_stage3_1_blocks_4_conv1_bn_weight, ema_backbone_stage3_1_blocks_4_conv1_bn_bias, ema_backbone_stage3_1_blocks_4_conv1_bn_running_mean, ema_backbone_stage3_1_blocks_4_conv1_bn_running_var, ema_backbone_stage3_1_blocks_4_conv1_bn_num_batches_tracked, ema_backbone_stage3_1_blocks_4_conv2_conv_weight, ema_backbone_stage3_1_blocks_4_conv2_bn_weight, ema_backbone_stage3_1_block
    s_4_conv2_bn_bias, ema_backbone_stage3_1_blocks_4_conv2_bn_running_mean, ema_backbone_stage3_1_blocks_4_conv2_bn_running_var, ema_backbone_stage3_1_blocks_4_conv2_bn_num_batches_tracked, ema_backbone_stage3_1_blocks_5_conv1_conv_weight, ema_backbone_stage3_1_blocks_5_conv1_bn_weight, ema_backbone_stage3_1_blocks_5_conv1_bn_bias, ema_backbone_stage3_1_blocks_5_conv1_bn_running_mean, ema_backbone_stage3_1_blocks_5_conv1_
    bn_running_var, ema_backbone_stage3_1_blocks_5_conv1_bn_num_batches_tracked, ema_backbone_stage3_1_blocks_5_conv2_conv_weight, ema_backbone_stage3_1_blocks_5_conv2_bn_weight, ema_backbone_stage3_1_blocks_5_conv2_bn_bias, ema_backbone_stage3_1_blocks_5_conv2_bn_running_mean, ema_backbone_stage3_1_blocks_5_conv2_bn_running_var, ema_backbone_stage3_1_blocks_5_conv2_bn_num_batches_tracked, ema_backbone_stage3_1_blocks_6_co
    nv1_conv_weight, ema_backbone_stage3_1_blocks_6_conv1_bn_weight, ema_backbone_stage3_1_blocks_6_conv1_bn_bias, ema_backbone_stage3_1_blocks_6_conv1_bn_running_mean, ema_backbone_stage3_1_blocks_6_conv1_bn_running_var, ema_backbone_stage3_1_blocks_6_conv1_bn_num_batches_tracked, ema_backbone_stage3_1_blocks_6_conv2_conv_weight, ema_backbone_stage3_1_blocks_6_conv2_bn_weight, ema_backbone_stage3_1_blocks_6_conv2_bn_bias,
     ema_backbone_stage3_1_blocks_6_conv2_bn_running_mean, ema_backbone_stage3_1_blocks_6_conv2_bn_running_var, ema_backbone_stage3_1_blocks_6_conv2_bn_num_batches_tracked, ema_backbone_stage3_1_blocks_7_conv1_conv_weight, ema_backbone_stage3_1_blocks_7_conv1_bn_weight, ema_backbone_stage3_1_blocks_7_conv1_bn_bias, ema_backbone_stage3_1_blocks_7_conv1_bn_running_mean, ema_backbone_stage3_1_blocks_7_conv1_bn_running_var, em
    a_backbone_stage3_1_blocks_7_conv1_bn_num_batches_tracked, ema_backbone_stage3_1_blocks_7_conv2_conv_weight, ema_backbone_stage3_1_blocks_7_conv2_bn_weight, ema_backbone_stage3_1_blocks_7_conv2_bn_bias, ema_backbone_stage3_1_blocks_7_conv2_bn_running_mean, ema_backbone_stage3_1_blocks_7_conv2_bn_running_var, ema_backbone_stage3_1_blocks_7_conv2_bn_num_batches_tracked, ema_backbone_stage3_1_blocks_8_conv1_conv_weight, e
    ma_backbone_stage3_1_blocks_8_conv1_bn_weight, ema_backbone_stage3_1_blocks_8_conv1_bn_bias, ema_backbone_stage3_1_blocks_8_conv1_bn_running_mean, ema_backbone_stage3_1_blocks_8_conv1_bn_running_var, ema_backbone_stage3_1_blocks_8_conv1_bn_num_batches_tracked, ema_backbone_stage3_1_blocks_8_conv2_conv_weight, ema_backbone_stage3_1_blocks_8_conv2_bn_weight, ema_backbone_stage3_1_blocks_8_conv2_bn_bias, ema_backbone_stag
    e3_1_blocks_8_conv2_bn_running_mean, ema_backbone_stage3_1_blocks_8_conv2_bn_running_var, ema_backbone_stage3_1_blocks_8_conv2_bn_num_batches_tracked, ema_backbone_stage4_0_conv_weight, ema_backbone_stage4_0_bn_weight, ema_backbone_stage4_0_bn_bias, ema_backbone_stage4_0_bn_running_mean, ema_backbone_stage4_0_bn_running_var, ema_backbone_stage4_0_bn_num_batches_tracked, ema_backbone_stage4_1_conv1_conv_weight, ema_back
    bone_stage4_1_conv1_bn_weight, ema_backbone_stage4_1_conv1_bn_bias, ema_backbone_stage4_1_conv1_bn_running_mean, ema_backbone_stage4_1_conv1_bn_running_var, ema_backbone_stage4_1_conv1_bn_num_batches_tracked, ema_backbone_stage4_1_conv2_conv_weight, ema_backbone_stage4_1_conv2_bn_weight, ema_backbone_stage4_1_conv2_bn_bias, ema_backbone_stage4_1_conv2_bn_running_mean, ema_backbone_stage4_1_conv2_bn_running_var, ema_bac
    kbone_stage4_1_conv2_bn_num_batches_tracked, ema_backbone_stage4_2_main_conv_conv_weight, ema_backbone_stage4_2_main_conv_bn_weight, ema_backbone_stage4_2_main_conv_bn_bias, ema_backbone_stage4_2_main_conv_bn_running_mean, ema_backbone_stage4_2_main_conv_bn_running_var, ema_backbone_stage4_2_main_conv_bn_num_batches_tracked, ema_backbone_stage4_2_short_conv_conv_weight, ema_backbone_stage4_2_short_conv_bn_weight, ema_b
    ackbone_stage4_2_short_conv_bn_bias, ema_backbone_stage4_2_short_conv_bn_running_mean, ema_backbone_stage4_2_short_conv_bn_running_var, ema_backbone_stage4_2_short_conv_bn_num_batches_tracked, ema_backbone_stage4_2_final_conv_conv_weight, ema_backbone_stage4_2_final_conv_bn_weight, ema_backbone_stage4_2_final_conv_bn_bias, ema_backbone_stage4_2_final_conv_bn_running_mean, ema_backbone_stage4_2_final_conv_bn_running_var
    , ema_backbone_stage4_2_final_conv_bn_num_batches_tracked, ema_backbone_stage4_2_blocks_0_conv1_conv_weight, ema_backbone_stage4_2_blocks_0_conv1_bn_weight, ema_backbone_stage4_2_blocks_0_conv1_bn_bias, ema_backbone_stage4_2_blocks_0_conv1_bn_running_mean, ema_backbone_stage4_2_blocks_0_conv1_bn_running_var, ema_backbone_stage4_2_blocks_0_conv1_bn_num_batches_tracked, ema_backbone_stage4_2_blocks_0_conv2_conv_weight, e
    ma_backbone_stage4_2_blocks_0_conv2_bn_weight, ema_backbone_stage4_2_blocks_0_conv2_bn_bias, ema_backbone_stage4_2_blocks_0_conv2_bn_running_mean, ema_backbone_stage4_2_blocks_0_conv2_bn_running_var, ema_backbone_stage4_2_blocks_0_conv2_bn_num_batches_tracked, ema_backbone_stage4_2_blocks_1_conv1_conv_weight, ema_backbone_stage4_2_blocks_1_conv1_bn_weight, ema_backbone_stage4_2_blocks_1_conv1_bn_bias, ema_backbone_stag
    e4_2_blocks_1_conv1_bn_running_mean, ema_backbone_stage4_2_blocks_1_conv1_bn_running_var, ema_backbone_stage4_2_blocks_1_conv1_bn_num_batches_tracked, ema_backbone_stage4_2_blocks_1_conv2_conv_weight, ema_backbone_stage4_2_blocks_1_conv2_bn_weight, ema_backbone_stage4_2_blocks_1_conv2_bn_bias, ema_backbone_stage4_2_blocks_1_conv2_bn_running_mean, ema_backbone_stage4_2_blocks_1_conv2_bn_running_var, ema_backbone_stage4_
    2_blocks_1_conv2_bn_num_batches_tracked, ema_backbone_stage4_2_blocks_2_conv1_conv_weight, ema_backbone_stage4_2_blocks_2_conv1_bn_weight, ema_backbone_stage4_2_blocks_2_conv1_bn_bias, ema_backbone_stage4_2_blocks_2_conv1_bn_running_mean, ema_backbone_stage4_2_blocks_2_conv1_bn_running_var, ema_backbone_stage4_2_blocks_2_conv1_bn_num_batches_tracked, ema_backbone_stage4_2_blocks_2_conv2_conv_weight, ema_backbone_stage4
    _2_blocks_2_conv2_bn_weight, ema_backbone_stage4_2_blocks_2_conv2_bn_bias, ema_backbone_stage4_2_blocks_2_conv2_bn_running_mean, ema_backbone_stage4_2_blocks_2_conv2_bn_running_var, ema_backbone_stage4_2_blocks_2_conv2_bn_num_batches_tracked, ema_neck_reduce_layers_0_conv_weight, ema_neck_reduce_layers_0_bn_weight, ema_neck_reduce_layers_0_bn_bias, ema_neck_reduce_layers_0_bn_running_mean, ema_neck_reduce_layers_0_bn_r
    unning_var, ema_neck_reduce_layers_0_bn_num_batches_tracked, ema_neck_reduce_layers_1_conv_weight, ema_neck_reduce_layers_1_bn_weight, ema_neck_reduce_layers_1_bn_bias, ema_neck_reduce_layers_1_bn_running_mean, ema_neck_reduce_layers_1_bn_running_var, ema_neck_reduce_layers_1_bn_num_batches_tracked, ema_neck_top_down_blocks_0_main_conv_conv_weight, ema_neck_top_down_blocks_0_main_conv_bn_weight, ema_neck_top_down_block
    s_0_main_conv_bn_bias, ema_neck_top_down_blocks_0_main_conv_bn_running_mean, ema_neck_top_down_blocks_0_main_conv_bn_running_var, ema_neck_top_down_blocks_0_main_conv_bn_num_batches_tracked, ema_neck_top_down_blocks_0_short_conv_conv_weight, ema_neck_top_down_blocks_0_short_conv_bn_weight, ema_neck_top_down_blocks_0_short_conv_bn_bias, ema_neck_top_down_blocks_0_short_conv_bn_running_mean, ema_neck_top_down_blocks_0_sh
    ort_conv_bn_running_var, ema_neck_top_down_blocks_0_short_conv_bn_num_batches_tracked, ema_neck_top_down_blocks_0_final_conv_conv_weight, ema_neck_top_down_blocks_0_final_conv_bn_weight, ema_neck_top_down_blocks_0_final_conv_bn_bias, ema_neck_top_down_blocks_0_final_conv_bn_running_mean, ema_neck_top_down_blocks_0_final_conv_bn_running_var, ema_neck_top_down_blocks_0_final_conv_bn_num_batches_tracked, ema_neck_top_down
    _blocks_0_blocks_0_conv1_conv_weight, ema_neck_top_down_blocks_0_blocks_0_conv1_bn_weight, ema_neck_top_down_blocks_0_blocks_0_conv1_bn_bias, ema_neck_top_down_blocks_0_blocks_0_conv1_bn_running_mean, ema_neck_top_down_blocks_0_blocks_0_conv1_bn_running_var, ema_neck_top_down_blocks_0_blocks_0_conv1_bn_num_batches_tracked, ema_neck_top_down_blocks_0_blocks_0_conv2_conv_weight, ema_neck_top_down_blocks_0_blocks_0_conv2_
    bn_weight, ema_neck_top_down_blocks_0_blocks_0_conv2_bn_bias, ema_neck_top_down_blocks_0_blocks_0_conv2_bn_running_mean, ema_neck_top_down_blocks_0_blocks_0_conv2_bn_running_var, ema_neck_top_down_blocks_0_blocks_0_conv2_bn_num_batches_tracked, ema_neck_top_down_blocks_0_blocks_1_conv1_conv_weight, ema_neck_top_down_blocks_0_blocks_1_conv1_bn_weight, ema_neck_top_down_blocks_0_blocks_1_conv1_bn_bias, ema_neck_top_down_
    blocks_0_blocks_1_conv1_bn_running_mean, ema_neck_top_down_blocks_0_blocks_1_conv1_bn_running_var, ema_neck_top_down_blocks_0_blocks_1_conv1_bn_num_batches_tracked, ema_neck_top_down_blocks_0_blocks_1_conv2_conv_weight, ema_neck_top_down_blocks_0_blocks_1_conv2_bn_weight, ema_neck_top_down_blocks_0_blocks_1_conv2_bn_bias, ema_neck_top_down_blocks_0_blocks_1_conv2_bn_running_mean, ema_neck_top_down_blocks_0_blocks_1_con
    v2_bn_running_var, ema_neck_top_down_blocks_0_blocks_1_conv2_bn_num_batches_tracked, ema_neck_top_down_blocks_0_blocks_2_conv1_conv_weight, ema_neck_top_down_blocks_0_blocks_2_conv1_bn_weight, ema_neck_top_down_blocks_0_blocks_2_conv1_bn_bias, ema_neck_top_down_blocks_0_blocks_2_conv1_bn_running_mean, ema_neck_top_down_blocks_0_blocks_2_conv1_bn_running_var, ema_neck_top_down_blocks_0_blocks_2_conv1_bn_num_batches_trac
    ked, ema_neck_top_down_blocks_0_blocks_2_conv2_conv_weight, ema_neck_top_down_blocks_0_blocks_2_conv2_bn_weight, ema_neck_top_down_blocks_0_blocks_2_conv2_bn_bias, ema_neck_top_down_blocks_0_blocks_2_conv2_bn_running_mean, ema_neck_top_down_blocks_0_blocks_2_conv2_bn_running_var, ema_neck_top_down_blocks_0_blocks_2_conv2_bn_num_batches_tracked, ema_neck_top_down_blocks_1_main_conv_conv_weight, ema_neck_top_down_blocks_
    1_main_conv_bn_weight, ema_neck_top_down_blocks_1_main_conv_bn_bias, ema_neck_top_down_blocks_1_main_conv_bn_running_mean, ema_neck_top_down_blocks_1_main_conv_bn_running_var, ema_neck_top_down_blocks_1_main_conv_bn_num_batches_tracked, ema_neck_top_down_blocks_1_short_conv_conv_weight, ema_neck_top_down_blocks_1_short_conv_bn_weight, ema_neck_top_down_blocks_1_short_conv_bn_bias, ema_neck_top_down_blocks_1_short_conv_
    bn_running_mean, ema_neck_top_down_blocks_1_short_conv_bn_running_var, ema_neck_top_down_blocks_1_short_conv_bn_num_batches_tracked, ema_neck_top_down_blocks_1_final_conv_conv_weight, ema_neck_top_down_blocks_1_final_conv_bn_weight, ema_neck_top_down_blocks_1_final_conv_bn_bias, ema_neck_top_down_blocks_1_final_conv_bn_running_mean, ema_neck_top_down_blocks_1_final_conv_bn_running_var, ema_neck_top_down_blocks_1_final_
    conv_bn_num_batches_tracked, ema_neck_top_down_blocks_1_blocks_0_conv1_conv_weight, ema_neck_top_down_blocks_1_blocks_0_conv1_bn_weight, ema_neck_top_down_blocks_1_blocks_0_conv1_bn_bias, ema_neck_top_down_blocks_1_blocks_0_conv1_bn_running_mean, ema_neck_top_down_blocks_1_blocks_0_conv1_bn_running_var, ema_neck_top_down_blocks_1_blocks_0_conv1_bn_num_batches_tracked, ema_neck_top_down_blocks_1_blocks_0_conv2_conv_weig
    ht, ema_neck_top_down_blocks_1_blocks_0_conv2_bn_weight, ema_neck_top_down_blocks_1_blocks_0_conv2_bn_bias, ema_neck_top_down_blocks_1_blocks_0_conv2_bn_running_mean, ema_neck_top_down_blocks_1_blocks_0_conv2_bn_running_var, ema_neck_top_down_blocks_1_blocks_0_conv2_bn_num_batches_tracked, ema_neck_top_down_blocks_1_blocks_1_conv1_conv_weight, ema_neck_top_down_blocks_1_blocks_1_conv1_bn_weight, ema_neck_top_down_block
    s_1_blocks_1_conv1_bn_bias, ema_neck_top_down_blocks_1_blocks_1_conv1_bn_running_mean, ema_neck_top_down_blocks_1_blocks_1_conv1_bn_running_var, ema_neck_top_down_blocks_1_blocks_1_conv1_bn_num_batches_tracked, ema_neck_top_down_blocks_1_blocks_1_conv2_conv_weight, ema_neck_top_down_blocks_1_blocks_1_conv2_bn_weight, ema_neck_top_down_blocks_1_blocks_1_conv2_bn_bias, ema_neck_top_down_blocks_1_blocks_1_conv2_bn_running
    _mean, ema_neck_top_down_blocks_1_blocks_1_conv2_bn_running_var, ema_neck_top_down_blocks_1_blocks_1_conv2_bn_num_batches_tracked, ema_neck_top_down_blocks_1_blocks_2_conv1_conv_weight, ema_neck_top_down_blocks_1_blocks_2_conv1_bn_weight, ema_neck_top_down_blocks_1_blocks_2_conv1_bn_bias, ema_neck_top_down_blocks_1_blocks_2_conv1_bn_running_mean, ema_neck_top_down_blocks_1_blocks_2_conv1_bn_running_var, ema_neck_top_do
    wn_blocks_1_blocks_2_conv1_bn_num_batches_tracked, ema_neck_top_down_blocks_1_blocks_2_conv2_conv_weight, ema_neck_top_down_blocks_1_blocks_2_conv2_bn_weight, ema_neck_top_down_blocks_1_blocks_2_conv2_bn_bias, ema_neck_top_down_blocks_1_blocks_2_conv2_bn_running_mean, ema_neck_top_down_blocks_1_blocks_2_conv2_bn_running_var, ema_neck_top_down_blocks_1_blocks_2_conv2_bn_num_batches_tracked, ema_neck_downsamples_0_conv_w
    eight, ema_neck_downsamples_0_bn_weight, ema_neck_downsamples_0_bn_bias, ema_neck_downsamples_0_bn_running_mean, ema_neck_downsamples_0_bn_running_var, ema_neck_downsamples_0_bn_num_batches_tracked, ema_neck_downsamples_1_conv_weight, ema_neck_downsamples_1_bn_weight, ema_neck_downsamples_1_bn_bias, ema_neck_downsamples_1_bn_running_mean, ema_neck_downsamples_1_bn_running_var, ema_neck_downsamples_1_bn_num_batches_trac
    ked, ema_neck_bottom_up_blocks_0_main_conv_conv_weight, ema_neck_bottom_up_blocks_0_main_conv_bn_weight, ema_neck_bottom_up_blocks_0_main_conv_bn_bias, ema_neck_bottom_up_blocks_0_main_conv_bn_running_mean, ema_neck_bottom_up_blocks_0_main_conv_bn_running_var, ema_neck_bottom_up_blocks_0_main_conv_bn_num_batches_tracked, ema_neck_bottom_up_blocks_0_short_conv_conv_weight, ema_neck_bottom_up_blocks_0_short_conv_bn_weigh
    t, ema_neck_bottom_up_blocks_0_short_conv_bn_bias, ema_neck_bottom_up_blocks_0_short_conv_bn_running_mean, ema_neck_bottom_up_blocks_0_short_conv_bn_running_var, ema_neck_bottom_up_blocks_0_short_conv_bn_num_batches_tracked, ema_neck_bottom_up_blocks_0_final_conv_conv_weight, ema_neck_bottom_up_blocks_0_final_conv_bn_weight, ema_neck_bottom_up_blocks_0_final_conv_bn_bias, ema_neck_bottom_up_blocks_0_final_conv_bn_runni
    ng_mean, ema_neck_bottom_up_blocks_0_final_conv_bn_running_var, ema_neck_bottom_up_blocks_0_final_conv_bn_num_batches_tracked, ema_neck_bottom_up_blocks_0_blocks_0_conv1_conv_weight, ema_neck_bottom_up_blocks_0_blocks_0_conv1_bn_weight, ema_neck_bottom_up_blocks_0_blocks_0_conv1_bn_bias, ema_neck_bottom_up_blocks_0_blocks_0_conv1_bn_running_mean, ema_neck_bottom_up_blocks_0_blocks_0_conv1_bn_running_var, ema_neck_botto
    m_up_blocks_0_blocks_0_conv1_bn_num_batches_tracked, ema_neck_bottom_up_blocks_0_blocks_0_conv2_conv_weight, ema_neck_bottom_up_blocks_0_blocks_0_conv2_bn_weight, ema_neck_bottom_up_blocks_0_blocks_0_conv2_bn_bias, ema_neck_bottom_up_blocks_0_blocks_0_conv2_bn_running_mean, ema_neck_bottom_up_blocks_0_blocks_0_conv2_bn_running_var, ema_neck_bottom_up_blocks_0_blocks_0_conv2_bn_num_batches_tracked, ema_neck_bottom_up_bl
    ocks_0_blocks_1_conv1_conv_weight, ema_neck_bottom_up_blocks_0_blocks_1_conv1_bn_weight, ema_neck_bottom_up_blocks_0_blocks_1_conv1_bn_bias, ema_neck_bottom_up_blocks_0_blocks_1_conv1_bn_running_mean, ema_neck_bottom_up_blocks_0_blocks_1_conv1_bn_running_var, ema_neck_bottom_up_blocks_0_blocks_1_conv1_bn_num_batches_tracked, ema_neck_bottom_up_blocks_0_blocks_1_conv2_conv_weight, ema_neck_bottom_up_blocks_0_blocks_1_co
    nv2_bn_weight, ema_neck_bottom_up_blocks_0_blocks_1_conv2_bn_bias, ema_neck_bottom_up_blocks_0_blocks_1_conv2_bn_running_mean, ema_neck_bottom_up_blocks_0_blocks_1_conv2_bn_running_var, ema_neck_bottom_up_blocks_0_blocks_1_conv2_bn_num_batches_tracked, ema_neck_bottom_up_blocks_0_blocks_2_conv1_conv_weight, ema_neck_bottom_up_blocks_0_blocks_2_conv1_bn_weight, ema_neck_bottom_up_blocks_0_blocks_2_conv1_bn_bias, ema_nec
    k_bottom_up_blocks_0_blocks_2_conv1_bn_running_mean, ema_neck_bottom_up_blocks_0_blocks_2_conv1_bn_running_var, ema_neck_bottom_up_blocks_0_blocks_2_conv1_bn_num_batches_tracked, ema_neck_bottom_up_blocks_0_blocks_2_conv2_conv_weight, ema_neck_bottom_up_blocks_0_blocks_2_conv2_bn_weight, ema_neck_bottom_up_blocks_0_blocks_2_conv2_bn_bias, ema_neck_bottom_up_blocks_0_blocks_2_conv2_bn_running_mean, ema_neck_bottom_up_bl
    ocks_0_blocks_2_conv2_bn_running_var, ema_neck_bottom_up_blocks_0_blocks_2_conv2_bn_num_batches_tracked, ema_neck_bottom_up_blocks_1_main_conv_conv_weight, ema_neck_bottom_up_blocks_1_main_conv_bn_weight, ema_neck_bottom_up_blocks_1_main_conv_bn_bias, ema_neck_bottom_up_blocks_1_main_conv_bn_running_mean, ema_neck_bottom_up_blocks_1_main_conv_bn_running_var, ema_neck_bottom_up_blocks_1_main_conv_bn_num_batches_tracked,
     ema_neck_bottom_up_blocks_1_short_conv_conv_weight, ema_neck_bottom_up_blocks_1_short_conv_bn_weight, ema_neck_bottom_up_blocks_1_short_conv_bn_bias, ema_neck_bottom_up_blocks_1_short_conv_bn_running_mean, ema_neck_bottom_up_blocks_1_short_conv_bn_running_var, ema_neck_bottom_up_blocks_1_short_conv_bn_num_batches_tracked, ema_neck_bottom_up_blocks_1_final_conv_conv_weight, ema_neck_bottom_up_blocks_1_final_conv_bn_wei
    ght, ema_neck_bottom_up_blocks_1_final_conv_bn_bias, ema_neck_bottom_up_blocks_1_final_conv_bn_running_mean, ema_neck_bottom_up_blocks_1_final_conv_bn_running_var, ema_neck_bottom_up_blocks_1_final_conv_bn_num_batches_tracked, ema_neck_bottom_up_blocks_1_blocks_0_conv1_conv_weight, ema_neck_bottom_up_blocks_1_blocks_0_conv1_bn_weight, ema_neck_bottom_up_blocks_1_blocks_0_conv1_bn_bias, ema_neck_bottom_up_blocks_1_block
    s_0_conv1_bn_running_mean, ema_neck_bottom_up_blocks_1_blocks_0_conv1_bn_running_var, ema_neck_bottom_up_blocks_1_blocks_0_conv1_bn_num_batches_tracked, ema_neck_bottom_up_blocks_1_blocks_0_conv2_conv_weight, ema_neck_bottom_up_blocks_1_blocks_0_conv2_bn_weight, ema_neck_bottom_up_blocks_1_blocks_0_conv2_bn_bias, ema_neck_bottom_up_blocks_1_blocks_0_conv2_bn_running_mean, ema_neck_bottom_up_blocks_1_blocks_0_conv2_bn_r
    unning_var, ema_neck_bottom_up_blocks_1_blocks_0_conv2_bn_num_batches_tracked, ema_neck_bottom_up_blocks_1_blocks_1_conv1_conv_weight, ema_neck_bottom_up_blocks_1_blocks_1_conv1_bn_weight, ema_neck_bottom_up_blocks_1_blocks_1_conv1_bn_bias, ema_neck_bottom_up_blocks_1_blocks_1_conv1_bn_running_mean, ema_neck_bottom_up_blocks_1_blocks_1_conv1_bn_running_var, ema_neck_bottom_up_blocks_1_blocks_1_conv1_bn_num_batches_trac
    ked, ema_neck_bottom_up_blocks_1_blocks_1_conv2_conv_weight, ema_neck_bottom_up_blocks_1_blocks_1_conv2_bn_weight, ema_neck_bottom_up_blocks_1_blocks_1_conv2_bn_bias, ema_neck_bottom_up_blocks_1_blocks_1_conv2_bn_running_mean, ema_neck_bottom_up_blocks_1_blocks_1_conv2_bn_running_var, ema_neck_bottom_up_blocks_1_blocks_1_conv2_bn_num_batches_tracked, ema_neck_bottom_up_blocks_1_blocks_2_conv1_conv_weight, ema_neck_bott
    om_up_blocks_1_blocks_2_conv1_bn_weight, ema_neck_bottom_up_blocks_1_blocks_2_conv1_bn_bias, ema_neck_bottom_up_blocks_1_blocks_2_conv1_bn_running_mean, ema_neck_bottom_up_blocks_1_blocks_2_conv1_bn_running_var, ema_neck_bottom_up_blocks_1_blocks_2_conv1_bn_num_batches_tracked, ema_neck_bottom_up_blocks_1_blocks_2_conv2_conv_weight, ema_neck_bottom_up_blocks_1_blocks_2_conv2_bn_weight, ema_neck_bottom_up_blocks_1_block
    s_2_conv2_bn_bias, ema_neck_bottom_up_blocks_1_blocks_2_conv2_bn_running_mean, ema_neck_bottom_up_blocks_1_blocks_2_conv2_bn_running_var, ema_neck_bottom_up_blocks_1_blocks_2_conv2_bn_num_batches_tracked, ema_neck_out_convs_0_conv_weight, ema_neck_out_convs_0_bn_weight, ema_neck_out_convs_0_bn_bias, ema_neck_out_convs_0_bn_running_mean, ema_neck_out_convs_0_bn_running_var, ema_neck_out_convs_0_bn_num_batches_tracked, e
    ma_neck_out_convs_1_conv_weight, ema_neck_out_convs_1_bn_weight, ema_neck_out_convs_1_bn_bias, ema_neck_out_convs_1_bn_running_mean, ema_neck_out_convs_1_bn_running_var, ema_neck_out_convs_1_bn_num_batches_tracked, ema_neck_out_convs_2_conv_weight, ema_neck_out_convs_2_bn_weight, ema_neck_out_convs_2_bn_bias, ema_neck_out_convs_2_bn_running_mean, ema_neck_out_convs_2_bn_running_var, ema_neck_out_convs_2_bn_num_batches_
    tracked, ema_bbox_head_multi_level_cls_convs_0_0_conv_weight, ema_bbox_head_multi_level_cls_convs_0_0_bn_weight, ema_bbox_head_multi_level_cls_convs_0_0_bn_bias, ema_bbox_head_multi_level_cls_convs_0_0_bn_running_mean, ema_bbox_head_multi_level_cls_convs_0_0_bn_running_var, ema_bbox_head_multi_level_cls_convs_0_0_bn_num_batches_tracked, ema_bbox_head_multi_level_cls_convs_0_1_conv_weight, ema_bbox_head_multi_level_cls_
    convs_0_1_bn_weight, ema_bbox_head_multi_level_cls_convs_0_1_bn_bias, ema_bbox_head_multi_level_cls_convs_0_1_bn_running_mean, ema_bbox_head_multi_level_cls_convs_0_1_bn_running_var, ema_bbox_head_multi_level_cls_convs_0_1_bn_num_batches_tracked, ema_bbox_head_multi_level_cls_convs_1_0_conv_weight, ema_bbox_head_multi_level_cls_convs_1_0_bn_weight, ema_bbox_head_multi_level_cls_convs_1_0_bn_bias, ema_bbox_head_multi_le
    vel_cls_convs_1_0_bn_running_mean, ema_bbox_head_multi_level_cls_convs_1_0_bn_running_var, ema_bbox_head_multi_level_cls_convs_1_0_bn_num_batches_tracked, ema_bbox_head_multi_level_cls_convs_1_1_conv_weight, ema_bbox_head_multi_level_cls_convs_1_1_bn_weight, ema_bbox_head_multi_level_cls_convs_1_1_bn_bias, ema_bbox_head_multi_level_cls_convs_1_1_bn_running_mean, ema_bbox_head_multi_level_cls_convs_1_1_bn_running_var, e
    ma_bbox_head_multi_level_cls_convs_1_1_bn_num_batches_tracked, ema_bbox_head_multi_level_cls_convs_2_0_conv_weight, ema_bbox_head_multi_level_cls_convs_2_0_bn_weight, ema_bbox_head_multi_level_cls_convs_2_0_bn_bias, ema_bbox_head_multi_level_cls_convs_2_0_bn_running_mean, ema_bbox_head_multi_level_cls_convs_2_0_bn_running_var, ema_bbox_head_multi_level_cls_convs_2_0_bn_num_batches_tracked, ema_bbox_head_multi_level_cls
    _convs_2_1_conv_weight, ema_bbox_head_multi_level_cls_convs_2_1_bn_weight, ema_bbox_head_multi_level_cls_convs_2_1_bn_bias, ema_bbox_head_multi_level_cls_convs_2_1_bn_running_mean, ema_bbox_head_multi_level_cls_convs_2_1_bn_running_var, ema_bbox_head_multi_level_cls_convs_2_1_bn_num_batches_tracked, ema_bbox_head_multi_level_reg_convs_0_0_conv_weight, ema_bbox_head_multi_level_reg_convs_0_0_bn_weight, ema_bbox_head_mul
    ti_level_reg_convs_0_0_bn_bias, ema_bbox_head_multi_level_reg_convs_0_0_bn_running_mean, ema_bbox_head_multi_level_reg_convs_0_0_bn_running_var, ema_bbox_head_multi_level_reg_convs_0_0_bn_num_batches_tracked, ema_bbox_head_multi_level_reg_convs_0_1_conv_weight, ema_bbox_head_multi_level_reg_convs_0_1_bn_weight, ema_bbox_head_multi_level_reg_convs_0_1_bn_bias, ema_bbox_head_multi_level_reg_convs_0_1_bn_running_mean, ema
    _bbox_head_multi_level_reg_convs_0_1_bn_running_var, ema_bbox_head_multi_level_reg_convs_0_1_bn_num_batches_tracked, ema_bbox_head_multi_level_reg_convs_1_0_conv_weight, ema_bbox_head_multi_level_reg_convs_1_0_bn_weight, ema_bbox_head_multi_level_reg_convs_1_0_bn_bias, ema_bbox_head_multi_level_reg_convs_1_0_bn_running_mean, ema_bbox_head_multi_level_reg_convs_1_0_bn_running_var, ema_bbox_head_multi_level_reg_convs_1_0
    _bn_num_batches_tracked, ema_bbox_head_multi_level_reg_convs_1_1_conv_weight, ema_bbox_head_multi_level_reg_convs_1_1_bn_weight, ema_bbox_head_multi_level_reg_convs_1_1_bn_bias, ema_bbox_head_multi_level_reg_convs_1_1_bn_running_mean, ema_bbox_head_multi_level_reg_convs_1_1_bn_running_var, ema_bbox_head_multi_level_reg_convs_1_1_bn_num_batches_tracked, ema_bbox_head_multi_level_reg_convs_2_0_conv_weight, ema_bbox_head_
    multi_level_reg_convs_2_0_bn_weight, ema_bbox_head_multi_level_reg_convs_2_0_bn_bias, ema_bbox_head_multi_level_reg_convs_2_0_bn_running_mean, ema_bbox_head_multi_level_reg_convs_2_0_bn_running_var, ema_bbox_head_multi_level_reg_convs_2_0_bn_num_batches_tracked, ema_bbox_head_multi_level_reg_convs_2_1_conv_weight, ema_bbox_head_multi_level_reg_convs_2_1_bn_weight, ema_bbox_head_multi_level_reg_convs_2_1_bn_bias, ema_bb
    ox_head_multi_level_reg_convs_2_1_bn_running_mean, ema_bbox_head_multi_level_reg_convs_2_1_bn_running_var, ema_bbox_head_multi_level_reg_convs_2_1_bn_num_batches_tracked, ema_bbox_head_multi_level_conv_cls_0_weight, ema_bbox_head_multi_level_conv_cls_0_bias, ema_bbox_head_multi_level_conv_cls_1_weight, ema_bbox_head_multi_level_conv_cls_1_bias, ema_bbox_head_multi_level_conv_cls_2_weight, ema_bbox_head_multi_level_conv
    _cls_2_bias, ema_bbox_head_multi_level_conv_reg_0_weight, ema_bbox_head_multi_level_conv_reg_0_bias, ema_bbox_head_multi_level_conv_reg_1_weight, ema_bbox_head_multi_level_conv_reg_1_bias, ema_bbox_head_multi_level_conv_reg_2_weight, ema_bbox_head_multi_level_conv_reg_2_bias, ema_bbox_head_multi_level_conv_obj_0_weight, ema_bbox_head_multi_level_conv_obj_0_bias, ema_bbox_head_multi_level_conv_obj_1_weight, ema_bbox_head_multi_level_conv_obj_1_bias, ema_bbox_head_multi_level_conv_obj_2_weight, ema_bbox_head_multi_level_conv_obj_2_bias
    
    Registry:{}
    Registry:{'img_scale': (640, 640), 'flip': False, 'transforms': [{'type': 'Resize', 'keep_ratio': True}, {'type': 'RandomFlip'}, {'type': 'Pad', 'pad_to_square': True, 'pad_val': {'img': (114.0, 114.0, 114.0)}}, {'type': 'DefaultFormatBundle'}, {'type': 'Collect', 'keys': ['img']}]}
    Registry:{'keep_ratio': True}
    Registry:{}
    Registry:{'pad_to_square': True, 'pad_val': {'img': (114.0, 114.0, 114.0)}}
    Registry:{}
    Registry:{'keys': ['img']}
    2022-07-21 09:57:54,053 - mmdeploy - WARNING - DeprecationWarning: get_onnx_config will be deprecated in the future. 
    2022-07-21 09:57:54,053 - mmdeploy - INFO - Export PyTorch model to ONNX: /static/work_dirs/bcccd9e0-41a1-408d-9dfa-f4e634e9608c\end2end.onnx.
    D:\Anaconda3\envs\aoc\lib\site-packages\mmdeploy\core\optimizers\function_marker.py:158: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
      ys_shape = tuple(int(s) for s in ys.shape)
    D:\Anaconda3\envs\aoc\lib\site-packages\torch\functional.py:568: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at  C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\native\TensorShape.cpp:2228.)
      return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
    D:\Anaconda3\envs\aoc\lib\site-packages\mmdeploy\codebase\mmdet\core\post_processing\bbox_nms.py:259: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
      dets, labels = TRTBatchedNMSop.apply(boxes, scores, int(scores.shape[-1]),
    D:\Anaconda3\envs\aoc\lib\site-packages\mmdeploy\mmcv\ops\nms.py:178: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
      out_boxes = min(num_boxes, after_topk)
    WARNING: The shape inference of mmdeploy::TRTBatchedNMS type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
    WARNING: The shape inference of mmdeploy::TRTBatchedNMS type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
    WARNING: The shape inference of mmdeploy::TRTBatchedNMS type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
    WARNING: The shape inference of mmdeploy::TRTBatchedNMS type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
    WARNING: The shape inference of mmdeploy::TRTBatchedNMS type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
    WARNING: The shape inference of mmdeploy::TRTBatchedNMS type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
    2022-07-21 09:58:17,235 - mmdeploy - INFO - Execute onnx optimize passes.
    2022-07-21 09:58:17,235 - mmdeploy - WARNING - Can not optimize model, please build torchscipt extension.
    More details: https://github.com/open-mmlab/mmdeploy/blob/master/docs/en/experimental/onnx_optimizer.md
    2022-07-21 09:58:21,554 - mmdeploy - INFO - Start pipeline mmdeploy.backend.tensorrt.onnx2tensorrt.onnx2tensorrt in subprocess
    2022-07-21 09:58:22,324 - mmdeploy - INFO - Successfully loaded tensorrt plugins from D:\Anaconda3\envs\aoc\lib\site-packages\mmdeploy\lib\mmdeploy_tensorrt_ops.dll
    [07/21/2022-09:58:25] [TRT] [I] [MemUsageChange] Init CUDA: CPU +198, GPU +0, now: CPU 10828, GPU 990 (MiB)
    [07/21/2022-09:58:27] [TRT] [I] [MemUsageChange] Init builder kernel library: CPU +6, GPU +2, now: CPU 11015, GPU 992 (MiB)
    [07/21/2022-09:58:27] [TRT] [W] onnx2trt_utils.cpp:369: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
    [07/21/2022-09:58:28] [TRT] [W] onnx2trt_utils.cpp:395: One or more weights outside the range of INT32 was clamped
    [07/21/2022-09:58:28] [TRT] [W] onnx2trt_utils.cpp:395: One or more weights outside the range of INT32 was clamped
    [07/21/2022-09:58:28] [TRT] [W] onnx2trt_utils.cpp:395: One or more weights outside the range of INT32 was clamped
    [07/21/2022-09:58:28] [TRT] [I] No importer registered for op: TRTBatchedNMS. Attempting to import as plugin.
    [07/21/2022-09:58:28] [TRT] [I] Searching for plugin: TRTBatchedNMS, plugin_version: 1, plugin_namespace:
    [07/21/2022-09:58:28] [TRT] [I] Successfully created plugin: TRTBatchedNMS
    [07/21/2022-09:58:28] [TRT] [W] FP16 support requested on hardware without native FP16 support, performance will be negatively affected.
    [07/21/2022-09:58:30] [TRT] [I] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +211, GPU +74, now: CPU 11542, GPU 1066 (MiB)
    [07/21/2022-09:58:31] [TRT] [I] [MemUsageChange] Init cuDNN: CPU +171, GPU +80, now: CPU 11713, GPU 1146 (MiB)
    [07/21/2022-09:58:31] [TRT] [W] TensorRT was linked against cuDNN 8.4.1 but loaded cuDNN 8.2.0
    [07/21/2022-09:58:31] [TRT] [I] Local timing cache in use. Profiling results in this builder pass will not be stored.
    [07/21/2022-09:58:31] [TRT] [E] 4: [shapeCompiler.cpp::nvinfer1::builder::DynamicSlotBuilder::evaluateShapeChecks::911] Error Code 4: Internal Error (kOPT values for profile 0 violate shape constraints: condition '==' violated. 6400 != 16800. Concat_505: dimensions not compatible for concatenation)
    Process Process-3:
    Traceback (most recent call last):
      File "D:\Anaconda3\envs\aoc\lib\multiprocessing\process.py", line 315, in _bootstrap
        self.run()
      File "D:\Anaconda3\envs\aoc\lib\multiprocessing\process.py", line 108, in run
        self._target(*self._args, **self._kwargs)
      File "D:\Anaconda3\envs\aoc\lib\site-packages\mmdeploy\apis\core\pipeline_manager.py", line 107, in __call__
        ret = func(*args, **kwargs)
      File "D:\Anaconda3\envs\aoc\lib\site-packages\mmdeploy\backend\tensorrt\onnx2tensorrt.py", line 79, in onnx2tensorrt
        from_onnx(
      File "D:\Anaconda3\envs\aoc\lib\site-packages\mmdeploy\backend\tensorrt\utils.py", line 153, in from_onnx
        assert engine is not None, 'Failed to create TensorRT engine'
    AssertionError: Failed to create TensorRT engine
    2022-07-21 09:58:31,918 - mmdeploy - ERROR - `mmdeploy.backend.tensorrt.onnx2tensorrt.onnx2tensorrt` with Call id: 1 failed. exit.
    [2022-07-21 09:58:31,919] ERROR:huey.consumer:Worker-1:Process Worker-1 died!
    Traceback (most recent call last):
      File "D:\Anaconda3\envs\aoc\lib\site-packages\huey\consumer.py", line 356, in _run
        process.loop()
      File "D:\Anaconda3\envs\aoc\lib\site-packages\huey\consumer.py", line 117, in loop
        self.huey.execute(task, now)
      File "D:\Anaconda3\envs\aoc\lib\site-packages\huey\api.py", line 362, in execute
        return self._execute(task, timestamp)
      File "D:\Anaconda3\envs\aoc\lib\site-packages\huey\api.py", line 379, in _execute
        task_value = task.execute()
      File "D:\Anaconda3\envs\aoc\lib\site-packages\huey\api.py", line 772, in execute
        return func(*args, **kwargs)
      File "D:\workspace\python\ai-online-core\publish.py", line 153, in deploy
        onnx2tensorrt(
      File "D:\Anaconda3\envs\aoc\lib\site-packages\mmdeploy\apis\core\pipeline_manager.py", line 356, in _wrap
        return self.call_function(func_name_, *args, **kwargs)
      File "D:\Anaconda3\envs\aoc\lib\site-packages\mmdeploy\apis\core\pipeline_manager.py", line 324, in call_function
        return self.get_result_sync(call_id)
      File "D:\Anaconda3\envs\aoc\lib\site-packages\mmdeploy\apis\core\pipeline_manager.py", line 305, in get_result_sync
        ret = self.get_caller(func_name).pop_mp_output(call_id)
      File "D:\Anaconda3\envs\aoc\lib\site-packages\mmdeploy\apis\core\pipeline_manager.py", line 82, in pop_mp_output
        exit()
      File "D:\Anaconda3\envs\aoc\lib\_sitebuiltins.py", line 26, in __call__
        raise SystemExit(code)
    SystemExit: None
    [2022-07-21 09:58:34,199] WARNING:huey.consumer:MainThread:Worker 1 died, restarting.
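
One possible reading of the `6400 != 16800` check in the TensorRT error above, offered as an assumption rather than a diagnosis: 6400 matches an 80 x 80 feature map, which a 640 x 640 input (the `img_scale` printed in the log) produces at stride 8, while 16800 matches a 100 x 168 map, which an 800 x 1344 input would produce at the same stride. The deploy config is not shown in the issue, so both the 800 x 1344 `opt_shape` and the stride of 8 are guesses. The helper name below is hypothetical.

```python
# Hypothetical helper, not part of MMDeploy: number of grid cells a
# detection head sees for a height x width input at a given stride.
def grid_cells(height: int, width: int, stride: int) -> int:
    return (height // stride) * (width // stride)


# Shape used while tracing the model (img_scale in the log above).
traced = grid_cells(640, 640, stride=8)

# Assumed opt_shape of the TensorRT profile; NOT shown in the issue.
assumed_opt = grid_cells(800, 1344, stride=8)

print(traced, assumed_opt)
```

If that reading holds, the exported graph would contain a size baked in at trace time (the TracerWarning lines point in that direction), and a profile whose `opt_shape` differs from the traced shape would fail the Concat check.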
    
    opened by gzxy-0102 36
  • FPS of TensorRT inference is higher than .pth, but the total inference time is longer

    FPS of TensorRT inference is higher than .pth, but the total inference time is longer

    Hello, I ran the mmdeploy/tools/profiler.py script to measure the inference speed of both the TensorRT engine and the .pth model. The TensorRT engine reports a higher FPS than the .pth model, but its total inference time is longer. Is this normal, or is something wrong? Here is how I get the total inference time: [image] Here is the speed of the TensorRT engine: [image] Here is the speed of the .pth model: [image]

    Here is my environment: [image] Looking forward to your reply!

    opened by wulouzhu 35
  • Slow inference speed Swin-Transformer: PyTorch to ONNX and TensorRT

    Slow inference speed Swin-Transformer: PyTorch to ONNX and TensorRT

    I converted Swin-Transformer from PyTorch to ONNX and TensorRT, but inference is still slow: PyTorch: 0.6 s, TensorRT: 0.59 s, ONNX: 0.92 s. Here is my config:

    _base_ = [
        '../_base_/base_instance-seg_dynamic.py',
        '../../_base_/backends/tensorrt-int8.py'
    ]
    
    backend_config = dict(
        common_config=dict(max_workspace_size=1 << 60),
        model_inputs=[
            dict(
                input_shapes=dict(
                    input=dict(
                        min_shape=[1, 3, 320, 320],
                        opt_shape=[1, 3, 800, 1344],
                        max_shape=[3, 3, 1344, 1344])))
        ])
    
    python3 ./tools/deploy.py ./configs/mmdet/instance-seg/instance-seg_tensorrt_dynamic-320x320-1344x1344.py ../mmdetection/configs/insurance/cascade_mask_rcnn_swinB.py ../mmdetection/pretrained/cascade_mask_rcnn_swinB.pth ./demo/demo.jpg --device cuda:0
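
A small sanity check on the numbers in the config above may help when reading it. TensorRT optimization profiles require min <= opt <= max in every dimension, and `1 << 60` bytes is far larger than any GPU memory. The sketch below only makes both facts visible; the function name is hypothetical, and nothing here is claimed to be the cause of the slow speed.

```python
# Values copied from the backend_config above.
min_shape = [1, 3, 320, 320]
opt_shape = [1, 3, 800, 1344]
max_shape = [3, 3, 1344, 1344]


def profile_is_ordered(lo, opt, hi):
    """Return True when lo <= opt <= hi holds for every dimension."""
    return all(a <= b <= c for a, b, c in zip(lo, opt, hi))


# max_workspace_size from the config, expressed in GiB.
workspace_bytes = 1 << 60
workspace_gib = workspace_bytes // (1 << 30)

print(profile_is_ordered(min_shape, opt_shape, max_shape), workspace_gib)
```

The profile is ordered correctly, but the engine is tuned for `opt_shape` (batch 1, 800 x 1344), so timings at other shapes or batch sizes may differ from what the profile was optimized for.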
    
    opened by manhtd98 21
  • A problem occurred when calling mmdeploy_detector_create_by_path

    A problem occurred when calling mmdeploy_detector_create_by_path

    When I call mmdeploy_detector_create_by_path with model_path set to the path of the ONNX model, a problem occurs: no ModelImpl can read sdk_model. Is model_path not supposed to be the ONNX model path? What should model_path be?
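
A likely reading of this error, stated as an assumption: the SDK expects model_path to be a model directory produced by the converter with the `--dump-info` flag, not a bare `.onnx` file. The sketch below checks a directory for the meta files such a dump is assumed to contain. The file names (`deploy.json`, `pipeline.json`, `detail.json`) and the function name are assumptions to verify against your MMDeploy version.

```python
import os

# File names assumed to be written by `tools/deploy.py --dump-info`;
# check them against your MMDeploy version.
SDK_META_FILES = ("deploy.json", "pipeline.json", "detail.json")


def missing_sdk_files(model_path: str) -> list:
    """Return the assumed SDK meta files that are absent from model_path."""
    if not os.path.isdir(model_path):
        # A single file such as end2end.onnx is not an SDK model directory.
        return list(SDK_META_FILES)
    return [
        name
        for name in SDK_META_FILES
        if not os.path.isfile(os.path.join(model_path, name))
    ]
```

An empty result suggests the directory at least has the expected layout; a non-empty one lists what to look for after re-running the conversion with `--dump-info`.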

    SDK 
    opened by TTgogogo 21
  • Upgrade Dockerfile to use TensorRT==8.2.4.2

    Upgrade Dockerfile to use TensorRT==8.2.4.2

    Motivation

    Currently, many documents seem to assume that the user has installed TensorRT >= 8.*. However, the Dockerfile does not follow this, so related issues occur when using the provided Dockerfile.

    Modification

    • Upgrade the base image to use TensorRT 8.2.4.2. See release note.
      • Fix wrong environment variable for LD_LIBRARY_PATH (the image does not include /usr/local/cuda-11.3/).
    • Upgrade torch & torchvision
    • Upgrade MMCV

    BC-breaking (Optional)

    Does the modification introduce changes that break the backward-compatibility of the downstream repositories? If so, please describe how it breaks the compatibility and how the downstream projects should modify their code to keep compatibility with this PR.

    Use cases (Optional)

    If this PR introduces a new feature, it is better to list some use cases here, and update the documentation.

    Checklist

    1. Pre-commit or other linting tools are used to fix the potential lint issues.
    2. The modification is covered by complete unit tests. If not, please add more unit tests to ensure the correctness.
    3. If the modification has a dependency on downstream projects of a newer version, this PR should be tested with all supported versions of downstream projects.
    4. The documentation has been modified accordingly, like docstring or example tutorials.
    opened by nijkah 21
  • Tensor size mismatch

    Tensor size mismatch

    Hello, I am trying to convert a CascadeNet .pth model to ONNX and I get a tensor size mismatch error:

    [email protected]:~/workspace/mmdeploy# python ./tools/deploy.py configs/mmdet/detection/detection_onnxruntime_dynamic.py cascade_mask_rcnn_hrnetv2p_w32_20e.py General.Model.table.detection.v2.pth demo.png
    [2022-05-19 16:21:23.549] [mmdeploy] [info] Register 'DirectoryModel'
    2022-05-19 16:21:23,618 - mmdeploy - INFO - torch2onnx start.
    [2022-05-19 16:21:24.866] [mmdeploy] [info] Register 'DirectoryModel'
    /opt/conda/lib/python3.7/site-packages/mmdet/models/builder.py:53: UserWarning: train_cfg and test_cfg is deprecated, please specify them in model
      'please specify them in model', UserWarning)
    load checkpoint from local path: General.Model.table.detection.v2.pth
    /opt/conda/lib/python3.7/site-packages/mmdet/datasets/utils.py:69: UserWarning: "ImageToTensor" pipeline is replaced by "DefaultFormatBundle" for batch inference. It is recommended to manually replace it in the test data pipeline in your config file.
      'data pipeline in your config file.', UserWarning)
    2022-05-19 16:21:32,430 - mmdeploy - WARNING - DeprecationWarning: get_onnx_config will be deprecated in the future.
    2022-05-19:16:21:32,mmdeploy WARNING [utils.py:92] DeprecationWarning: get_onnx_config will be deprecated in the future.
    /root/workspace/mmdeploy/mmdeploy/core/optimizers/function_marker.py:158: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
      ys_shape = tuple(int(s) for s in ys.shape)
    /opt/conda/lib/python3.7/site-packages/torch/nn/functional.py:3455: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
      "See the documentation of nn.Upsample for details.".format(mode)
    /opt/conda/lib/python3.7/site-packages/mmdet/models/dense_heads/anchor_head.py:123: UserWarning: DeprecationWarning: anchor_generator is deprecated, please use "prior_generator" instead
      warnings.warn('DeprecationWarning: anchor_generator is deprecated, '
    /opt/conda/lib/python3.7/site-packages/mmdet/core/anchor/anchor_generator.py:333: UserWarning: grid_anchors would be deprecated soon. Please use grid_priors
      warnings.warn('grid_anchors would be deprecated soon. '
    /opt/conda/lib/python3.7/site-packages/mmdet/core/anchor/anchor_generator.py:370: UserWarning: single_level_grid_anchors would be deprecated soon. Please use single_level_grid_priors
      'single_level_grid_anchors would be deprecated soon. '
    /root/workspace/mmdeploy/mmdeploy/codebase/mmdet/models/dense_heads/rpn_head.py:78: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
      assert cls_score.size()[-2:] == bbox_pred.size()[-2:]
    /root/workspace/mmdeploy/mmdeploy/pytorch/functions/topk.py:28: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
      k = torch.tensor(k, device=input.device, dtype=torch.long)
    /root/workspace/mmdeploy/mmdeploy/pytorch/functions/topk.py:33: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
      return ctx.origin_func(input, k, dim=dim, largest=largest, sorted=sorted)
    /opt/conda/lib/python3.7/site-packages/mmdet/core/bbox/coder/legacy_delta_xywh_bbox_coder.py:77: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
      assert pred_bboxes.size(0) == bboxes.size(0)
    2022-05-19:16:21:35,root ERROR [utils.py:43] The size of tensor a (4) must match the size of tensor b (4432) at non-singleton dimension 2
    Traceback (most recent call last):
      File "/root/workspace/mmdeploy/mmdeploy/utils/utils.py", line 38, in target_wrapper
        result = target(*args, **kwargs)
      File "/root/workspace/mmdeploy/mmdeploy/apis/pytorch2onnx.py", line 113, in torch2onnx
        output_file=output_file)
      File "/root/workspace/mmdeploy/mmdeploy/apis/pytorch2onnx.py", line 55, in torch2onnx_impl
        verbose=verbose)
      File "/opt/conda/lib/python3.7/site-packages/torch/onnx/__init__.py", line 276, in export
        custom_opsets, enable_onnx_checker, use_external_data_format)
      File "/opt/conda/lib/python3.7/site-packages/torch/onnx/utils.py", line 94, in export
        use_external_data_format=use_external_data_format)
      File "/opt/conda/lib/python3.7/site-packages/torch/onnx/utils.py", line 698, in _export
        dynamic_axes=dynamic_axes)
      File "/opt/conda/lib/python3.7/site-packages/torch/onnx/utils.py", line 456, in _model_to_graph
        use_new_jit_passes)
      File "/opt/conda/lib/python3.7/site-packages/torch/onnx/utils.py", line 417, in _create_jit_graph
        graph, torch_out = _trace_and_get_graph_from_model(model, args)
      File "/opt/conda/lib/python3.7/site-packages/torch/onnx/utils.py", line 377, in _trace_and_get_graph_from_model
        torch.jit._get_trace_graph(model, args, strict=False, _force_outplace=False, _return_inputs_states=True)
      File "/opt/conda/lib/python3.7/site-packages/torch/jit/_trace.py", line 1139, in _get_trace_graph
        outs = ONNXTracedModule(f, strict, _force_outplace, return_inputs, _return_inputs_states)(*args, **kwargs)
      File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/opt/conda/lib/python3.7/site-packages/torch/jit/_trace.py", line 130, in forward
        self._force_outplace,
      File "/opt/conda/lib/python3.7/site-packages/torch/jit/_trace.py", line 116, in wrapper
        outs.append(self.inner(*trace_inputs))
      File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 887, in _call_impl
        result = self._slow_forward(*input, **kwargs)
      File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 860, in _slow_forward
        result = self.forward(*input, **kwargs)
      File "/root/workspace/mmdeploy/mmdeploy/core/rewriters/rewriter_utils.py", line 371, in wrapper
        return self.func(self, *args, **kwargs)
      File "/root/workspace/mmdeploy/mmdeploy/codebase/mmdet/models/detectors/base.py", line 69, in base_detector__forward
        return __forward_impl(ctx, self, img, img_metas=img_metas, **kwargs)
      File "/root/workspace/mmdeploy/mmdeploy/core/optimizers/function_marker.py", line 261, in g
        rets = f(*args, **kwargs)
      File "/root/workspace/mmdeploy/mmdeploy/codebase/mmdet/models/detectors/base.py", line 28, in __forward_impl
        return self.simple_test(img, img_metas, **kwargs)
      File "/root/workspace/mmdeploy/mmdeploy/core/rewriters/rewriter_utils.py", line 371, in wrapper
        return self.func(self, *args, **kwargs)
      File "/root/workspace/mmdeploy/mmdeploy/codebase/mmdet/models/detectors/two_stage.py", line 58, in two_stage_detector__simple_test
        proposals, _ = self.rpn_head.simple_test_rpn(x, img_metas)
      File "/opt/conda/lib/python3.7/site-packages/mmdet/models/dense_heads/dense_test_mixins.py", line 130, in simple_test_rpn
        proposal_list = self.get_bboxes(*rpn_outs, img_metas=img_metas)
      File "/root/workspace/mmdeploy/mmdeploy/core/rewriters/rewriter_utils.py", line 371, in wrapper
        return self.func(self, *args, **kwargs)
      File "/root/workspace/mmdeploy/mmdeploy/codebase/mmdet/models/dense_heads/rpn_head.py", line 122, in rpn_head__get_bboxes
        max_shape=img_metas[0]['img_shape'])
      File "/opt/conda/lib/python3.7/site-packages/mmdet/core/bbox/coder/legacy_delta_xywh_bbox_coder.py", line 79, in decode
        self.stds, max_shape, wh_ratio_clip)
      File "/opt/conda/lib/python3.7/site-packages/mmcv/utils/parrots_jit.py", line 22, in wrapper_inner
        return func(*args, **kargs)
      File "/opt/conda/lib/python3.7/site-packages/mmdet/core/bbox/coder/legacy_delta_xywh_bbox_coder.py", line 181, in legacy_delta2bbox
        denorm_deltas = deltas * stds + means
    RuntimeError: The size of tensor a (4) must match the size of tensor b (4432) at non-singleton dimension 2
    2022-05-19 16:21:36,282 - mmdeploy - ERROR - torch2onnx failed.

    Does anyone know what the problem might be? Thanks in advance!
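
The numbers in the RuntimeError are consistent with a broadcasting failure between a batched `deltas` tensor and a `stds` tensor sized from the wrong dimension. The sketch below reproduces that arithmetic in plain Python, with no PyTorch needed. It assumes that `legacy_delta2bbox` builds `stds` with `repeat(1, deltas.size(1) // 4)`, as recalled from mmdet 2.x, and that the rewritten RPN head passes deltas shaped `(batch, num_boxes, 4)`. Both are assumptions, and the box count of 4432 is chosen only to match the message.

```python
def broadcast_shape(a, b):
    """Broadcast two shapes the way NumPy and PyTorch do, or raise ValueError."""
    ndim = max(len(a), len(b))
    result = []
    # Walk from the trailing dimension, padding the shorter shape with 1s.
    for i in range(1, ndim + 1):
        x = a[-i] if i <= len(a) else 1
        y = b[-i] if i <= len(b) else 1
        if x != y and 1 not in (x, y):
            raise ValueError(f"size {x} must match size {y} at dimension {ndim - i}")
        result.append(max(x, y))
    return tuple(reversed(result))


def stds_shape(deltas_shape):
    # Assumed to mirror `stds.repeat(1, deltas.size(1) // 4)` on a 4-element tensor.
    return (1, 4 * (deltas_shape[1] // 4))


unbatched = (4432, 4)    # layout the legacy coder appears to expect
batched = (1, 4432, 4)   # assumed layout during export

print(broadcast_shape(unbatched, stds_shape(unbatched)))
try:
    broadcast_shape(batched, stds_shape(batched))
except ValueError as err:
    print("mismatch:", err)
```

If this reading is right, the legacy coder does not handle batched inputs, and a model config that uses a different bbox coder (or a rewriter that supports this one) would be the direction to investigate.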

    opened by saffie91 21
  • MMDeploy build & install script + prerequisites

    MMDeploy build & install script + prerequisites

    Hello,

    Over the last couple of months, I have been working on a bash script that builds and installs MMDeploy and its prerequisites, to ease the install procedure when we use the MMDetection and MMDeploy tools. Maybe this script can serve as a starting point for an easier install procedure when building and installing MMDeploy from source.

    Naturally, this script is a work in progress. There are probably parts that can be cleaned up, and some dependencies may no longer need to be installed.

    Let me know if there are any suggestions or ideas on how to integrate this script with MMDeploy.

    Note:

    1. Only tested with "standard" python virtual environment, as I do not use Conda
    2. Only tested on linux
    3. Only supports TensorRT build at this point.
    4. The script should work on both x86_64 and aarch64 (jetson) platforms.
    opened by tehkillerbee 21
  • Inference error

    Inference error

    Hi, I have successfully converted a Faster R-CNN model downloaded from the MMDetection model zoo to a TensorRT engine, but when I run inference_model the following error occurs:

    [2022-04-22 07:27:52.715] [mmdeploy] [info] [model.cpp:95] Register 'DirectoryModel'
    2022-04-22 07:27:57,889 - mmdeploy - INFO - Successfully loaded tensorrt plugins from /root/workspace/mmdeploy/build/lib/libmmdeploy_tensorrt_ops.so
    2022-04-22 07:27:57,889 - mmdeploy - INFO - Successfully loaded tensorrt plugins from /root/workspace/mmdeploy/build/lib/libmmdeploy_tensorrt_ops.so
    /opt/conda/lib/python3.8/site-packages/mmdet-2.22.0-py3.8.egg/mmdet/datasets/utils.py:66: UserWarning: "ImageToTensor" pipeline is replaced by "DefaultFormatBundle" for batch inference. It is recommended to manually replace it in the test data pipeline in your config file.
      warnings.warn(
    #assertion/root/workspace/mmdeploy/csrc/backend_ops/tensorrt/batched_nms/trt_batched_nms.cpp,98
    Aborted (core dumped)

    Could you please tell me why it happened and how to deal with it? Thank you.

    TensorRT 
    opened by wulouzhu 21
  • On the Jetson platform, the problem of "AssertionError: Failed to create TensorRT engine"

    On the Jetson platform, the problem of "AssertionError: Failed to create TensorRT engine"

    When I run the demo on the Jetson:

    python ./tools/deploy.py configs/mmdet/detection/detection_tensorrt_dynamic-320x320-1344x1344.py /home/jetson/mmdetection/configs/retinanet/retinanet_r18_fpn_1x_coco.py /home/jetson/mmdetection/checkpoints/retinanet_r18_fpn_1x_coco_20220407_171055-614fd399.pth /home/jetson/mmdetection/demo/demo.jpg --work-dir work_dir --show --device cuda:0 --dump-info

    the following problem occurs:

    [2022-09-17 14:18:01.468] [mmdeploy] [info] [model.cpp:98] Register 'DirectoryModel'
    [2022-09-17 14:18:07.267] [mmdeploy] [info] [model.cpp:98] Register 'DirectoryModel'
    /home/jetson/mmdetection/mmdet/datasets/utils.py:70: UserWarning: "ImageToTensor" pipeline is replaced by "DefaultFormatBundle" for batch inference. It is recommended to manually replace it in the test data pipeline in your config file.
      'data pipeline in your config file.', UserWarning)
    [2022-09-17 14:18:30.458] [mmdeploy] [info] [model.cpp:98] Register 'DirectoryModel'
    2022-09-17 14:18:30,474 - mmdeploy - INFO - Start pipeline mmdeploy.apis.pytorch2onnx.torch2onnx in subprocess
    load checkpoint from local path: /home/jetson/mmdetection/checkpoints/retinanet_r18_fpn_1x_coco_20220407_171055-614fd399.pth
    /home/jetson/mmdetection/mmdet/datasets/utils.py:70: UserWarning: "ImageToTensor" pipeline is replaced by "DefaultFormatBundle" for batch inference. It is recommended to manually replace it in the test data pipeline in your config file.
      'data pipeline in your config file.', UserWarning)
    2022-09-17 14:18:53,964 - mmdeploy - WARNING - DeprecationWarning: get_onnx_config will be deprecated in the future.
    2022-09-17 14:18:53,967 - mmdeploy - INFO - Export PyTorch model to ONNX: work_dir/end2end.onnx.
    2022-09-17 14:18:54,050 - mmdeploy - WARNING - Can not find torch._C._jit_pass_onnx_deduplicate_initializers, function rewrite will not be applied
    /home/jetson/mmdeploy/mmdeploy/core/optimizers/function_marker.py:158: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
      ys_shape = tuple(int(s) for s in ys.shape)
    /home/jetson/mmdeploy/mmdeploy/codebase/mmdet/models/dense_heads/base_dense_head.py:96: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
      assert cls_score.size()[-2:] == bbox_pred.size()[-2:]
    /home/jetson/mmdeploy/mmdeploy/pytorch/functions/topk.py:57: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
      if k > size:
    /home/jetson/mmdeploy/mmdeploy/codebase/mmdet/core/bbox/delta_xywh_bbox_coder.py:39: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
      assert pred_bboxes.size(0) == bboxes.size(0)
    /home/jetson/mmdeploy/mmdeploy/codebase/mmdet/core/bbox/delta_xywh_bbox_coder.py:41: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
      assert pred_bboxes.size(1) == bboxes.size(1)
    /home/jetson/mmdeploy/mmdeploy/codebase/mmdet/deploy/utils.py:93: TracerWarning: Using len to get tensor shape might cause the trace to be incorrect. Recommended usage would be tensor.shape[0]. Passing a tensor of different shape might lead to errors or silently give incorrect results.
      assert len(max_shape) == 2, '`max_shape` should be [h, w]'
    /home/jetson/mmdeploy/mmdeploy/codebase/mmdet/core/post_processing/bbox_nms.py:260: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
      dets, labels = TRTBatchedNMSop.apply(boxes, scores, int(scores.shape[-1]),
    /home/jetson/mmdeploy/mmdeploy/mmcv/ops/nms.py:178: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
      out_boxes = min(num_boxes, after_topk)
    /home/jetson/mmdeploy/mmdeploy/mmcv/ops/nms.py:181: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
      (batch_size, out_boxes)).to(scores.device))
    WARNING: The shape inference of mmdeploy::GridPriorsTRT type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
    WARNING: The shape inference of mmdeploy::GridPriorsTRT type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
    WARNING: The shape inference of mmdeploy::GridPriorsTRT type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
    WARNING: The shape inference of mmdeploy::GridPriorsTRT type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
    WARNING: The shape inference of mmdeploy::GridPriorsTRT type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
    /home/jetson/archiconda3/envs/mmdeploy/lib/python3.6/site-packages/torch/onnx/symbolic_opset9.py:2819: UserWarning: Exporting aten::index operator of advanced indexing in opset 11 is achieved by combination of multiple ONNX operators, including Reshape, Transpose, Concat, and Gather. If indices include negative values, the exported graph will produce incorrect results.
      "If indices include negative values, the exported graph will produce incorrect results.")
    WARNING: The shape inference of mmdeploy::TRTBatchedNMS type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
    WARNING: The shape inference of mmdeploy::TRTBatchedNMS type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
    WARNING: The shape inference of mmdeploy::GridPriorsTRT type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
    WARNING: The shape inference of mmdeploy::GridPriorsTRT type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
    WARNING: The shape inference of mmdeploy::GridPriorsTRT type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
    WARNING: The shape inference of mmdeploy::GridPriorsTRT type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
    WARNING: The shape inference of mmdeploy::GridPriorsTRT type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
    WARNING: The shape inference of mmdeploy::TRTBatchedNMS type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
    WARNING: The shape inference of mmdeploy::TRTBatchedNMS type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
    WARNING: The shape inference of mmdeploy::GridPriorsTRT type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
    WARNING: The shape inference of mmdeploy::GridPriorsTRT type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
    WARNING: The shape inference of mmdeploy::GridPriorsTRT type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
    WARNING: The shape inference of mmdeploy::GridPriorsTRT type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
    WARNING: The shape inference of mmdeploy::GridPriorsTRT type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
    WARNING: The shape inference of mmdeploy::TRTBatchedNMS type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
    WARNING: The shape inference of mmdeploy::TRTBatchedNMS type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
    2022-09-17 14:19:09,799 - mmdeploy - INFO - Execute onnx optimize passes.
    2022-09-17 14:19:10,942 - mmdeploy - INFO - Finish pipeline mmdeploy.apis.pytorch2onnx.torch2onnx
    [2022-09-17 14:19:18.405] [mmdeploy] [info] [model.cpp:98] Register 'DirectoryModel'
    2022-09-17 14:19:18,421 - mmdeploy - INFO - Start pipeline mmdeploy.backend.tensorrt.onnx2tensorrt.onnx2tensorrt in subprocess
    2022-09-17 14:19:18,731 - mmdeploy - INFO - Successfully loaded tensorrt plugins from /home/jetson/mmdeploy/mmdeploy/lib/libmmdeploy_tensorrt_ops.so
    [09/17/2022-14:19:20] [TRT] [I] [MemUsageChange] Init CUDA: CPU +355, GPU +0, now: CPU 441, GPU 7993 (MiB)
    [09/17/2022-14:19:20] [TRT] [I] [MemUsageSnapshot] Begin constructing builder kernel library: CPU 441 MiB, GPU 8022 MiB
    [09/17/2022-14:19:21] [TRT] [I] [MemUsageSnapshot] End constructing builder kernel library: CPU 546 MiB, GPU 8128 MiB
    [09/17/2022-14:19:21] [TRT] [W] onnx2trt_utils.cpp:366: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
    [09/17/2022-14:19:21] [TRT] [W] onnx2trt_utils.cpp:392: One or more weights outside the range of INT32 was clamped
    [09/17/2022-14:19:21] [TRT] [I] No importer registered for op: GridPriorsTRT. Attempting to import as plugin.
    [09/17/2022-14:19:21] [TRT] [I] Searching for plugin: GridPriorsTRT, plugin_version: 1, plugin_namespace:
    [09/17/2022-14:19:21] [TRT] [I] Successfully created plugin: GridPriorsTRT
    [09/17/2022-14:19:21] [TRT] [I] No importer registered for op: GridPriorsTRT. Attempting to import as plugin.
    [09/17/2022-14:19:21] [TRT] [I] Searching for plugin: GridPriorsTRT, plugin_version: 1, plugin_namespace:
    [09/17/2022-14:19:21] [TRT] [I] Successfully created plugin: GridPriorsTRT
    [09/17/2022-14:19:21] [TRT] [I] No importer registered for op: GridPriorsTRT. Attempting to import as plugin.
    [09/17/2022-14:19:21] [TRT] [I] Searching for plugin: GridPriorsTRT, plugin_version: 1, plugin_namespace:
    [09/17/2022-14:19:21] [TRT] [I] Successfully created plugin: GridPriorsTRT
    [09/17/2022-14:19:21] [TRT] [I] No importer registered for op: GridPriorsTRT. Attempting to import as plugin.
    [09/17/2022-14:19:21] [TRT] [I] Searching for plugin: GridPriorsTRT, plugin_version: 1, plugin_namespace:
    [09/17/2022-14:19:21] [TRT] [I] Successfully created plugin: GridPriorsTRT
    [09/17/2022-14:19:21] [TRT] [I] No importer registered for op: GridPriorsTRT. Attempting to import as plugin.
    [09/17/2022-14:19:21] [TRT] [I] Searching for plugin: GridPriorsTRT, plugin_version: 1, plugin_namespace:
    [09/17/2022-14:19:21] [TRT] [I] Successfully created plugin: GridPriorsTRT
    [09/17/2022-14:19:24] [TRT] [I] No importer registered for op: TRTBatchedNMS. Attempting to import as plugin.
    [09/17/2022-14:19:24] [TRT] [I] Searching for plugin: TRTBatchedNMS, plugin_version: 1, plugin_namespace:
    [09/17/2022-14:19:24] [TRT] [I] Successfully created plugin: TRTBatchedNMS
    [09/17/2022-14:19:24] [TRT] [W] DLA requests all profiles have same min, max, and opt value. All dla layers are falling back to GPU
    [09/17/2022-14:19:24] [TRT] [I] ---------- Layers Running on DLA ----------
    [09/17/2022-14:19:24] [TRT] [I] ---------- Layers Running on GPU ----------
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Range_255
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_8 + Relu_9
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] [HostToDeviceCopy]
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] MaxPool_10
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_11 + Relu_12
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_13 + Add_14 + Relu_15
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_16 + Relu_17
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_18 + Add_19 + Relu_20
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_21 + Relu_22
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_23
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_24 + Add_25 + Relu_26
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_27 + Relu_28
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_29 + Add_30 + Relu_31
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_32 + Relu_33
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_34
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_35 + Add_36 + Relu_37
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_38 + Relu_39
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_40 + Add_41 + Relu_42
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_43 + Relu_44
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_45
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_46 + Add_47 + Relu_48
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_49 + Relu_50
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_51 + Add_52 + Relu_53
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_77
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_141 + Relu_142 || Conv_133 + Relu_134
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_78
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_159 + Relu_160 || Conv_151 + Relu_152
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_56
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Resize_64
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_76
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_123 + Relu_124 || Conv_115 + Relu_116
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_55 + Add_65
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_117 + Relu_118
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_125 + Relu_126
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Resize_72
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_75
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_105 + Relu_106 || Conv_97 + Relu_98
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_153 + Relu_154
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_161 + Relu_162
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_135 + Relu_136
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_143 + Relu_144
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_54 + Add_73
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_145 + Relu_146
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_137 + Relu_138
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_163 + Relu_164
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_155 + Relu_156
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_99 + Relu_100
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_107 + Relu_108
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_74
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_87 + Relu_88 || Conv_79 + Relu_80
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_127 + Relu_128
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_119 + Relu_120
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_121 + Relu_122
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_129 + Relu_130
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_81 + Relu_82
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_89 + Relu_90
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_109 + Relu_110
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_101 + Relu_102
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_157 + Relu_158
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_165 + Relu_166
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_139 + Relu_140
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_147 + Relu_148
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_150
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Transpose_401 + Reshape_402
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_149
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Transpose_398 + Reshape_399
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_168
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Transpose_456 + Reshape_457
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_167
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Transpose_453 + Reshape_454
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_103 + Relu_104
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_111 + Relu_112
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_91 + Relu_92
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_83 + Relu_84
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_132
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Transpose_346 + Reshape_347
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_131
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Transpose_343 + Reshape_344
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] (Unnamed Layer* 520) [Constant] + (Unnamed Layer* 521) [Shuffle]
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] ConstantOfShape_355
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] PWN(Sigmoid_345)
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] (Unnamed Layer* 534) [Constant] + (Unnamed Layer* 535) [Shuffle]
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] ConstantOfShape_363
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] PWN(479_41 + (Unnamed Layer* 539) [Shuffle], Add_364)
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] (Unnamed Layer* 621) [Constant] + (Unnamed Layer* 622) [Shuffle]
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] ConstantOfShape_410
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] PWN(Sigmoid_400)
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] (Unnamed Layer* 635) [Constant] + (Unnamed Layer* 636) [Shuffle]
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] ConstantOfShape_418
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] PWN(479_56 + (Unnamed Layer* 640) [Shuffle], Add_419)
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] (Unnamed Layer* 722) [Constant] + (Unnamed Layer* 723) [Shuffle]
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] ConstantOfShape_465
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] PWN(Sigmoid_455)
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] (Unnamed Layer* 736) [Constant] + (Unnamed Layer* 737) [Shuffle]
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] ConstantOfShape_473
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] PWN(479_71 + (Unnamed Layer* 741) [Shuffle], Add_474)
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_85 + Relu_86
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_93 + Relu_94
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_114
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Transpose_291 + Reshape_292
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_113
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Transpose_288 + Reshape_289
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] 859 copy
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] 896 copy
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] 867 copy
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] 881 copy
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] 755 copy
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] 792 copy
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] 763 copy
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] 777 copy
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] 651 copy
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] 688 copy
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] 659 copy
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] 673 copy
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] (Unnamed Layer* 419) [Constant] + (Unnamed Layer* 420) [Shuffle]
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] ConstantOfShape_300
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] PWN(Sigmoid_290)
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] (Unnamed Layer* 433) [Constant] + (Unnamed Layer* 434) [Shuffle]
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] ConstantOfShape_308
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] PWN(479_26 + (Unnamed Layer* 438) [Shuffle], Add_309)
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] ReduceMax_476
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] ReduceMax_421
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] ReduceMax_366
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] TopK_367
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] TopK_422
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] TopK_477
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_96
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Transpose_228 + Reshape_230
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Conv_95
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Transpose_223 + Reshape_226
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] 547 copy
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] 584 copy
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] 555 copy
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] 569 copy
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] (Unnamed Layer* 304) [Constant] + (Unnamed Layer* 305) [Shuffle]
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] ConstantOfShape_239
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] PWN(Sigmoid_227)
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] (Unnamed Layer* 318) [Constant] + (Unnamed Layer* 319) [Shuffle]
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] ConstantOfShape_247
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] PWN(479 + (Unnamed Layer* 323) [Shuffle], Add_249)
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] ReduceMax_311
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] TopK_312
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] 443 copy
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] 480 copy
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] 451 copy
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] 465 copy
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] ReduceMax_251
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] TopK_252
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] {ForeignNode[Unsqueeze_256...Slice_524]}
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] 397
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] (Unnamed Layer* 196) [Shuffle]_output[Constant]
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] ConstantOfShape_189
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] ConstantOfShape_190
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] 403
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] ConstantOfShape_195
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] ConstantOfShape_196
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] 421
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] ConstantOfShape_213
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] ConstantOfShape_214
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] 409
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] ConstantOfShape_201
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] ConstantOfShape_202
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] 415
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] ConstantOfShape_207
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] ConstantOfShape_208
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] GridPriorsTRT_209
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] GridPriorsTRT_203
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] GridPriorsTRT_215
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] GridPriorsTRT_197
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] GridPriorsTRT_191
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] 452
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Unsqueeze_216
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] 452_19
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Unsqueeze_217
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] 452_34
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Unsqueeze_218
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] 452_49
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Unsqueeze_219
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] 452_64
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] Unsqueeze_220
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] 431 copy
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] (Unnamed Layer* 711) [Constant]_output copy
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] 430 copy
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] (Unnamed Layer* 610) [Constant]_output copy
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] 429 copy
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] (Unnamed Layer* 509) [Constant]_output copy
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] 428 copy
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] (Unnamed Layer* 408) [Constant]_output copy
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] 427 copy
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] (Unnamed Layer* 293) [Constant]_output copy
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] PWN(Clip_533)
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] {ForeignNode[Flatten_280...Concat_558]}
    [09/17/2022-14:19:24] [TRT] [I] [GpuLayer] TRTBatchedNMS_563
    [09/17/2022-14:19:25] [TRT] [I] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +226, GPU +285, now: CPU 962, GPU 8608 (MiB)
    [09/17/2022-14:19:25] [TRT] [I] Local timing cache in use. Profiling results in this builder pass will not be stored.
    [09/17/2022-14:19:25] [TRT] [E] 2: [utils.cpp::checkMemLimit::380] Error Code 2: Internal Error (Assertion upperBound != 0 failed. Unknown embedded device detected. Please update the table with the entry: {{1794, 6, 16}, 12653},)
    Process Process-3:
    Traceback (most recent call last):
      File "/home/jetson/archiconda3/envs/mmdeploy/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
        self.run()
      File "/home/jetson/archiconda3/envs/mmdeploy/lib/python3.6/multiprocessing/process.py", line 93, in run
        self._target(*self._args, **self._kwargs)
      File "/home/jetson/mmdeploy/mmdeploy/apis/core/pipeline_manager.py", line 107, in __call__
        ret = func(*args, **kwargs)
      File "/home/jetson/mmdeploy/mmdeploy/backend/tensorrt/onnx2tensorrt.py", line 88, in onnx2tensorrt
        device_id=device_id)
      File "/home/jetson/mmdeploy/mmdeploy/backend/tensorrt/utils.py", line 215, in from_onnx
        assert engine is not None, 'Failed to create TensorRT engine'
    AssertionError: Failed to create TensorRT engine
    2022-09-17 14:19:26,458 - mmdeploy - ERROR - `mmdeploy.backend.tensorrt.onnx2tensorrt.onnx2tensorrt` with Call id: 1 failed. exit.
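
The log above is long, but only a few lines explain the failure. As a minimal sketch (the `summarize_log` helper and its regular expression are illustrative assumptions, not part of MMDeploy or TensorRT), the following pulls out the TensorRT `[E]` lines and the mmdeploy `ERROR` lines and counts the `[W]` warnings:

```python
import re
from collections import Counter

# Matches TensorRT lines such as "[09/17/2022-14:19:25] [TRT] [E] 2: ...".
TRT_LINE = re.compile(r"\[TRT\] \[(?P<level>[IWE])\] (?P<msg>.*)")


def summarize_log(text):
    """Return (errors, warning_counts) for a conversion log.

    errors: TensorRT [E] messages and mmdeploy ERROR messages, in log order.
    warning_counts: Counter of TensorRT [W] messages.
    """
    errors = []
    warnings = Counter()
    for line in text.splitlines():
        match = TRT_LINE.search(line)
        if match:
            if match["level"] == "E":
                errors.append(match["msg"])
            elif match["level"] == "W":
                warnings[match["msg"]] += 1
        elif " - mmdeploy - ERROR - " in line:
            errors.append(line.split(" - ERROR - ", 1)[1])
    return errors, warnings


# A few lines copied from the log above.
sample = """\
[09/17/2022-14:19:21] [TRT] [W] onnx2trt_utils.cpp:392: One or more weights outside the range of INT32 was clamped
[09/17/2022-14:19:24] [TRT] [I] [GpuLayer] TRTBatchedNMS_563
[09/17/2022-14:19:25] [TRT] [E] 2: [utils.cpp::checkMemLimit::380] Error Code 2: Internal Error (Assertion upperBound != 0 failed. Unknown embedded device detected. Please update the table with the entry: {{1794, 6, 16}, 12653},)
2022-09-17 14:19:26,458 - mmdeploy - ERROR - `mmdeploy.backend.tensorrt.onnx2tensorrt.onnx2tensorrt` with Call id: 1 failed. exit.
"""

errors, warnings = summarize_log(sample)
for message in errors:
    print("ERROR:", message)
```

Applied to the full log, the first error reported this way is the `checkMemLimit` assertion from the TensorRT builder. The Python traceback that follows it is only a consequence of that line.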
    

    I also ran python tools/check_env.py:

    2022-09-17 14:40:59,038 - mmdeploy - INFO -
    
    2022-09-17 14:40:59,039 - mmdeploy - INFO - **********Environmental information**********
    fatal: not a git repository (or any of the parent directories): .git
    2022-09-17 14:41:04,289 - mmdeploy - INFO - sys.platform: linux
    2022-09-17 14:41:04,290 - mmdeploy - INFO - Python: 3.6.15 | packaged by conda-forge | (default, Dec  3 2021, 19:12:04) [GCC 9.4.0]
    2022-09-17 14:41:04,291 - mmdeploy - INFO - CUDA available: True
    2022-09-17 14:41:04,291 - mmdeploy - INFO - GPU 0: Xavier
    2022-09-17 14:41:04,291 - mmdeploy - INFO - CUDA_HOME: /usr/local/cuda-10.2
    2022-09-17 14:41:04,291 - mmdeploy - INFO - NVCC: Build cuda_10.2_r440.TC440_70.29663091_0
    2022-09-17 14:41:04,291 - mmdeploy - INFO - GCC: gcc (Ubuntu/Linaro 7.5.0-3ubuntu1~18.04) 7.5.0
    2022-09-17 14:41:04,291 - mmdeploy - INFO - PyTorch: 1.10.0
    2022-09-17 14:41:04,291 - mmdeploy - INFO - PyTorch compiling details: PyTorch built with:
      - GCC 7.5
      - C++ Version: 201402
      - OpenMP 201511 (a.k.a. OpenMP 4.5)
      - LAPACK is enabled (usually provided by MKL)
      - NNPACK is enabled
      - CPU capability usage: NO AVX
      - CUDA Runtime 10.2
      - NVCC architecture flags: -gencode;arch=compute_53,code=sm_53;-gencode;arch=compute_62,code=sm_62;-gencode;arch=compute_72,code=sm_72
      - CuDNN 8.2.1
        - Built with CuDNN 8.0
      - Build settings: BLAS_INFO=open, BUILD_TYPE=Release, CUDA_VERSION=10.2, CUDNN_VERSION=8.0.0, CXX_COMPILER=/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -DMISSING_ARM_VST1 -DMISSING_ARM_VLD1 -Wno-stringop-overflow, FORCE_FALLBACK_CUDA_MPI=1, LAPACK_INFO=open, TORCH_VERSION=1.10.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EIGEN_FOR_BLAS=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=OFF, USE_MKLDNN=OFF, USE_MPI=ON, USE_NCCL=0, USE_NNPACK=ON, USE_OPENMP=ON,
    
    2022-09-17 14:41:04,292 - mmdeploy - INFO - TorchVision: 0.11.1
    2022-09-17 14:41:04,292 - mmdeploy - INFO - OpenCV: 4.6.0
    2022-09-17 14:41:04,292 - mmdeploy - INFO - MMCV: 1.4.0
    2022-09-17 14:41:04,292 - mmdeploy - INFO - MMCV Compiler: GCC 7.5
    2022-09-17 14:41:04,292 - mmdeploy - INFO - MMCV CUDA Compiler: 10.2
    2022-09-17 14:41:04,292 - mmdeploy - INFO - MMDeploy: 0.8.0+
    2022-09-17 14:41:04,292 - mmdeploy - INFO -
    
    2022-09-17 14:41:04,293 - mmdeploy - INFO - **********Backend information**********
    2022-09-17 14:41:06,396 - mmdeploy - INFO - onnxruntime: None   ops_is_avaliable : False
    2022-09-17 14:41:06,491 - mmdeploy - INFO - tensorrt: 8.2.1.8   ops_is_avaliable : True
    2022-09-17 14:41:06,547 - mmdeploy - INFO - ncnn: None  ops_is_avaliable : False
    2022-09-17 14:41:06,550 - mmdeploy - INFO - pplnn_is_avaliable: False
    2022-09-17 14:41:06,554 - mmdeploy - INFO - openvino_is_avaliable: False
    2022-09-17 14:41:06,607 - mmdeploy - INFO - snpe_is_available: False
    2022-09-17 14:41:06,613 - mmdeploy - INFO - ascend_is_available: False
    2022-09-17 14:41:06,616 - mmdeploy - INFO - coreml_is_available: False
    2022-09-17 14:41:06,616 - mmdeploy - INFO -
    
    2022-09-17 14:41:06,616 - mmdeploy - INFO - **********Codebase information**********
    2022-09-17 14:41:06,621 - mmdeploy - INFO - mmdet:      2.25.2
    2022-09-17 14:41:06,621 - mmdeploy - INFO - mmseg:      None
    2022-09-17 14:41:06,621 - mmdeploy - INFO - mmcls:      None
    2022-09-17 14:41:06,621 - mmdeploy - INFO - mmocr:      None
    2022-09-17 14:41:06,622 - mmdeploy - INFO - mmedit:     None
    2022-09-17 14:41:06,622 - mmdeploy - INFO - mmdet3d:    None
    2022-09-17 14:41:06,622 - mmdeploy - INFO - mmpose:     None
    2022-09-17 14:41:06,622 - mmdeploy - INFO - mmrotate:   None
    

    Thank you!!!!!

    opened by lijoe123 20
  • Following get_started.md, I get the error: cpu is invalid for the backend tensorrt

    Hello, this is my first time using MMDeploy. I followed get_started.md to convert a model on Linux-x86_64 with CUDA 11.x and TensorRT 8.2.5.1, just as the guide shows: [image]

    However, the conversion fails with: ValueError: cpu is invalid for the backend tensorrt.

    By the way, my environment is as follows: [image] [image]
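
The message suggests that the conversion was started with a CPU device while targeting TensorRT, which runs on NVIDIA GPUs. The sketch below is a hypothetical stand-in for that kind of check, not MMDeploy's actual code; the function name `check_device` and the `GPU_ONLY_BACKENDS` set are assumptions made for illustration:

```python
# Hypothetical stand-in for the kind of check that raises the error above.
# This is not MMDeploy's implementation; names are illustrative only.
GPU_ONLY_BACKENDS = {"tensorrt"}


def check_device(backend, device):
    """Raise ValueError when a GPU-only backend is paired with a non-CUDA device."""
    if backend in GPU_ONLY_BACKENDS and not device.startswith("cuda"):
        raise ValueError(f"{device} is invalid for the backend {backend}")
    return device


check_device("tensorrt", "cuda:0")  # accepted
try:
    check_device("tensorrt", "cpu")
except ValueError as err:
    print(err)
```

If this reading is right, the thing to check is the device passed to the conversion command: a CUDA device such as cuda:0 rather than cpu.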

    opened by lijoe123 20
Releases(v1.0.0rc1)
  • v1.0.0rc1(Dec 30, 2022)

    Features

    • Add profiler for SDK (#1446)
    • Support MMRotate 1.x (#1401)
    • Add YOLOv5 support for RV1126 device. (#1321)
    • Support Torch JIT Modulated Deformable Conv (#1536)
    • Support SOLO deployment with OpenVINO (#1454)
    • Support TVM (#1531)
    • Support Rotated RTMDet deployment (#1553)

    Improvements

    • Refactor SDK registry (#1368)
    • Avoid copying dense arrays in Python API (#1349)
    • Update dockerfile pip source (#1484)
    • Add rknn device check (#1363)
    • cherry-pick: Decouple preprocess operation and transformation (#1353)
    • Sync #1493 to support TorchAllocator as TensorRT GPU allocator and fix DCNv2 TensorRT plugin error (#1519)
    • Add md link check github action (#1320)
    • Remove cudnn dependency for transform 'mmaction2::format_shape' (#1509)
    • Refactor rewriter context for MMRazor (#1483)
    • Add is_batched argument to pipeline.json (#1528)
    • Refactor Backend Manager (#1515)

    Bug fixes

    • Support ONNXRuntime-1.13 (#1407)
    • Fix det->pose demo (#1419)
    • Fix MMOCR import typing error (#1497)
    • Fix requirements of MMEditing (#1496)
    • Recover mmdet layers (#1526)
    • Fix squeeze export and unsqueeze pass for opset 13 (#1538)
    • Fix 'cannot seek vector iterator' in debug windows build (#1543)
    • Fix API build error in readthedocs (#1567)
    • Fix unittest and suppress warning (#1552)
    • Rename 'forward_test' to 'predict' (#1561)
    • Fix onnx2ncnn.cpp bugs (#1518)

    Document

    • Add mmaction2 & coreml index for readthedocs (#1542)

    Contributors

    @lzhangzz @lvhan028 @PeterH0323 @AllentDan @irexyc @RunningLeon @grimoire @hanrui1sensetime @pppppM @antoszy @DDGRCF @kota-iizuka @liuyanyi

    Source code(tar.gz)
    Source code(zip)
    mmdeploy-1.0.0rc1-linux-x86_64-cuda11.1-tensorrt8.2.3.0.tar.gz(180.89 MB)
    mmdeploy-1.0.0rc1-linux-x86_64-onnxruntime1.8.1.tar.gz(74.10 MB)
    mmdeploy-1.0.0rc1-windows-amd64-cuda10.2-tensorrt8.2.3.0.zip(307.38 MB)
    mmdeploy-1.0.0rc1-windows-amd64-cuda11.1-tensorrt8.2.3.0.zip(349.80 MB)
    mmdeploy-1.0.0rc1-windows-amd64-onnxruntime1.8.1.zip(201.39 MB)
  • v0.12.0(Dec 30, 2022)

    Features

    • Support Torch JIT Modulated Deformable Conv (#1508)
    • Support TorchAllocator as TensorRT GPU memory allocator (#1493)
    • Support TVM backend (#1216)
    • Support probability output for segmentation (#1379)

    Improvements

    • Add pip source in dockerfile (#1492)
    • Reformat multi-line logs and strings (#1489)
    • Refactor backend manager (#1475, #1522, #1540)
    • Add stale workflow to check issue and PR (#1504, #1510)
    • Update ppl.nn v0.9.1 and ppl.cv v0.7.1 (#1356)
    • Add is_batched argument to pipeline.json (#1560)
    • Build monolithic SDK by default (#1577)

    Bug fixes

    • Fix conversion and inference support for torch 1.13 (#1488)
    • Remove cudnn dependency for transform 'mmaction2::format_shape' (#1509)
    • Add build-arch option to build script (#1530)
    • Fix 'mmaction2::transpose.cu' build failed on cuda-10.2 (#1539)
    • Fix 'cannot seek vector iterator' in debug windows build (#1555)
    • Fix ops unittest seg-fault error (#1556)

    Document

    • Add mmaction2 sphinx-doc link (#1541)
    • Update FAQ about copying onnxruntime dll to 'mmdeploy/lib' (#1554)
    • Update support_new_backend.md (#1574)

    Contributors

    @PeterH0323 @grimoire @RunningLeon @irexyc @ouonline @tpoisonooo @antoszy @BuxianChen @AllentDan @lzhangzz @hanrui1sensetime

    Source code(tar.gz)
    Source code(zip)
    mmdeploy-0.12.0-linux-x86_64-cuda10.2-tensorrt8.2.3.0.tar.gz(156.61 MB)
    mmdeploy-0.12.0-linux-x86_64-cuda11.1-tensorrt8.2.3.0.tar.gz(181.97 MB)
    mmdeploy-0.12.0-linux-x86_64-onnxruntime1.8.1.tar.gz(73.72 MB)
    mmdeploy-0.12.0-windows-amd64-cuda10.2-tensorrt8.2.3.0.zip(308.00 MB)
    mmdeploy-0.12.0-windows-amd64-cuda11.1-tensorrt8.2.3.0.zip(350.54 MB)
    mmdeploy-0.12.0-windows-amd64-onnxruntime1.8.1.zip(201.15 MB)
  • v1.0.0rc0(Dec 1, 2022)

    We are excited to announce the release of MMDeploy 1.0.0rc0. MMDeploy 1.0.0rc0 is the first version of MMDeploy 1.x, a part of the OpenMMLab 2.0 projects. Up to the release, MMDeploy 1.x supports OpenMMLab 2.0 based projects: MMCls 1.x, MMDet 3.x, MMDet3d 1.x, MMSeg 1.x, MMEdit 1.x, MMOCR 1.x, MMPose 1.x, MMAction2 1.x.

    Features

    • Support mmaction2 (#1012)
    • Support SimCC from mmpose (#1187)
    • Support RTMDet from MMDet (#1104)
    • Support CenterNet from MMDet (#1219)
    • Support MobileOne from MMCls (#1268)
    • Support external usage of MMYOLO (#1088)

    Improvements

    • Update dockerfiles (#1296)

    Bug fixes

    • Fix test ops (#1352)
    • Fix: checkpoint load on cpu (#1324)

    Document

    • Add MMYOLO desc in README (#1235)
    • Modify the links & fix some typos (#1150)

    Contributors

    @xin-li-67 @liu-mengyang @doufengqi @PeterH0323 @triple-Mu @MambaWong @isLinXu @francis0407 @sanbuphy @vansin @SsTtOoNnEe @RangiLyu @lvhan028 @grimoire @AllentDan @RunningLeon @lzhangzz @tpoisonooo @hanrui1sensetime

    Source code(tar.gz)
    Source code(zip)
    mmdeploy-1.0.0rc0-linux-x86_64-cuda10.2-tensorrt8.2.3.0.tar.gz(108.32 MB)
    mmdeploy-1.0.0rc0-linux-x86_64-cuda11.1-tensorrt8.2.3.0.tar.gz(179.82 MB)
    mmdeploy-1.0.0rc0-linux-x86_64-cuda11.3-tensorrt8.2.3.0.tar.gz(139.48 MB)
    mmdeploy-1.0.0rc0-linux-x86_64-onnxruntime1.8.1.tar.gz(24.17 MB)
    mmdeploy-1.0.0rc0-windows-amd64-cuda10.2-tensorrt8.2.3.0.zip(306.69 MB)
    mmdeploy-1.0.0rc0-windows-amd64-cuda11.1-tensorrt8.2.3.0.zip(349.05 MB)
    mmdeploy-1.0.0rc0-windows-amd64-onnxruntime1.8.1.zip(200.38 MB)
  • v0.11.0(Dec 1, 2022)

    Features

    • Support MMaction2 TSN and SlowFast deployment with ONNXRuntime and TensorRT (#1183,#1410,#1455)
    • Support Rockchip device RV1126
      • Rewrite BaseDenseHead.get_bboxes to support SSD, FSAF and RetinaNet (#1203)
      • Add BaseDenseHead postprocessing in SDK (#1238)
      • Support YOLOv3 and YOLOv5 postprocessing in SDK (#1280,#1424)
    • Add SDK profiler (#1274)
    • Support end2end deployment for pointpillars & centerpoint (pillar) from MMDet3d (#1178)

    Improvements

    • Support loading TensorRT libnvinfer plugins (#1275)
    • Avoid copying dense arrays in SDK C API and Python API (#1261, #1349)
    • Add Core ML common configuration (#1308)
    • Refactor SDK registry (#1368)
    • Update regression test to serialize eval result into json (#1310)
    • Support onnxruntime-1.13 API (#1407)
    • Decouple preprocess operation and transformation (#1353)

    Bug fixes

    • Set stream argument when using async memcpy (#1314)
    • Use OpenCV with videoio enabled for aarch64 platform (#1343)
    • Fix(tools/scripts): find env file failed (#1385)
    • Fix ncnn-int8 config path (#1380)
    • Fix out-of-boundary issue in SDK when topk is larger than class_num (#1420)
    • Fix yolohead trt8.2 (#1433)
    • Fix pad_to_square (#1436)
    • Fix det_pose demo (#1419)
    • Relax module adapter template constraints (#1366)
    • Fix ncnn torch 1.12 master (#1430)
    • Avoid gpu topk const-fold (#1439)
    • Support .NET Framework 4.8 and fix batch inference error (#1370)
    • Upgrade ncnn to 20221128 to resolve build error (#1459)

    Document

    • Add more images for demos and user guides (#1339)
    • Improve mmdet3d doc (#1394)
    • Display CI results in README (#1452)
    • Fix dead links in write_config.md (#1396)

    Contributors

    @xin-li-67 @sunjiahao1999 @francis0407 @Typiqally @triple-Mu @lvhan028 @grimoire @AllentDan @RunningLeon @lzhangzz @tpoisonooo @hanrui1sensetime

    Source code(tar.gz)
    Source code(zip)
    mmdeploy-0.11.0-linux-x86_64-cuda10.2-tensorrt8.2.3.0.tar.gz(108.45 MB)
    mmdeploy-0.11.0-linux-x86_64-cuda11.1-tensorrt8.2.3.0.tar.gz(179.99 MB)
    mmdeploy-0.11.0-linux-x86_64-cuda11.3-tensorrt8.2.3.0.tar.gz(139.68 MB)
    mmdeploy-0.11.0-linux-x86_64-onnxruntime1.8.1.tar.gz(24.59 MB)
    mmdeploy-0.11.0-windows-amd64-cuda10.2-tensorrt8.2.3.0.zip(306.55 MB)
    mmdeploy-0.11.0-windows-amd64-cuda11.1-tensorrt8.2.3.0.zip(348.93 MB)
    mmdeploy-0.11.0-windows-amd64-onnxruntime1.8.1.zip(200.53 MB)
  • v0.10.0(Oct 31, 2022)

    Features

    • Support Monocular 3D Detection and FCOS3D Deployment (#1047)
    • Support MMEdit EDSR deployment with ncnn-int8 (#1111)
    • Rewrite Conv2dAdaptiveOps to support EfficientNet deployment (#1045)
    • Add installation scripts for Jetson Orin (#1105)
    • Support aarch64 cross compiler (#1126)

    Improvements

    • Support Fast-SCNN deployment with ncnn backend (#1094)
    • Ease rewriter import (#1166)
    • Support TensorRT 8.4 (#1144)
    • Remove extra domains after model extraction (#1207)
    • Add batch inference demos (#986)
    • Update symbolic rewriter for latest PyTorch API (#1122)
    • Detect filesystem library in CMake (#1190)
    • Compute per-sample statistics when profiling in batch mode (#1158)
    • Add a device field for mmdeploy_mat_t (#1176)

    Before v0.10.0:

    typedef struct mmdeploy_mat_t {
      uint8_t* data;
      int height;
      int width;
      int channel;
      mmdeploy_pixel_format_t format;
      mmdeploy_data_type_t type;
    } mmdeploy_mat_t;
    

    In v0.10.0

    typedef struct mmdeploy_mat_t {
      uint8_t* data;
      int height;
      int width;
      int channel;
      mmdeploy_pixel_format_t format;
      mmdeploy_data_type_t type;
      mmdeploy_device_t device;
    } mmdeploy_mat_t;
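
Adding a field changes the struct layout, so code built against the earlier header would generally need to be rebuilt. The sketch below mirrors the two layouts with Python's ctypes to make the difference visible. It assumes the two enum types are int-sized and that mmdeploy_device_t is an opaque pointer handle; neither is stated in these notes, and the class names are hypothetical.

```python
import ctypes


class MatBeforeV0100(ctypes.Structure):
    """Mirror of mmdeploy_mat_t as declared before v0.10.0."""
    _fields_ = [
        ("data", ctypes.POINTER(ctypes.c_uint8)),
        ("height", ctypes.c_int),
        ("width", ctypes.c_int),
        ("channel", ctypes.c_int),
        ("format", ctypes.c_int),  # assumption: enum stored as int
        ("type", ctypes.c_int),    # assumption: enum stored as int
    ]


class MatV0100(ctypes.Structure):
    """Mirror of mmdeploy_mat_t in v0.10.0, with the trailing device field."""
    _fields_ = MatBeforeV0100._fields_ + [
        ("device", ctypes.c_void_p),  # assumption: opaque device handle
    ]


# Fields that are not passed to the constructor are zero-initialized.
mat = MatV0100(height=480, width=640, channel=3)

# The two layouts differ in size, which is why old binaries cannot share them.
print(ctypes.sizeof(MatBeforeV0100), ctypes.sizeof(MatV0100))
print(mat.device)  # the handle stays unset until the caller assigns one
```

Zero-initializing the whole struct, as the constructor does here, is one way to avoid passing an uninitialized device handle when older calling code is updated.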
    

    Bug fixes

    • Fix test_windows_onnxruntime workflow error in circleci (#1254)
    • Fix build error when the target device is 'cuda' and the inference backend is 'onnxruntime-gpu' (#1253)
    • Fix layer_norm symbol error when exporting it with torch>=1.12 (#1168)
    • Fix regression test script errors (#1217, #1146)

    Document

    • Update supported backend logos in the cover of README (#1252)
    • Add a link to MMYOLO in README (#1235)

    Contributors

    @doufengqi @Qingrenn @liu-mengyang @SsTtOoNnEe @OldDreamInWind @sunjiahao1999 @LiuYi-Up @isLinXu @lansfair @lvhan028 @grimoire @AllentDan @RunningLeon @lzhangzz @tpoisonooo @hanrui1sensetime

    Source code(tar.gz)
    Source code(zip)
    mmdeploy-0.10.0-linux-x86_64-cuda10.2-tensorrt8.2.3.0.tar.gz(148.91 MB)
    mmdeploy-0.10.0-linux-x86_64-cuda11.1-tensorrt8.2.3.0.tar.gz(174.07 MB)
    mmdeploy-0.10.0-linux-x86_64-onnxruntime1.8.1.tar.gz(67.32 MB)
    mmdeploy-0.10.0-windows-amd64-cuda10.2-tensorrt8.2.3.0.zip(290.71 MB)
    mmdeploy-0.10.0-windows-amd64-cuda11.1-tensorrt8.2.3.0.zip(333.13 MB)
    mmdeploy-0.10.0-windows-amd64-onnxruntime1.8.1.zip(184.75 MB)
  • v0.9.0(Sep 29, 2022)

    Features

    • Add Rust API for mmdeploy SDK. Project: https://github.com/liu-mengyang/rust-mmdeploy
    • Support MMOCR TextSnake and MMPose Hourglass model deployment with ncnn-int8 (#1074, #1064, #1066)
    • Rewrite torch.Tensor.__mod__ to support TensorRT (#1024)

    Improvements

    • Separate C++ API demos from C API demos (#1099)
    • Refactor SDK pipeline (#938)
    • Check upstream libopencv-dev version before adding apt repository (#1068)
    • Make inference still available on headless device (#1041)
    • Validate installation in building scripts (#1036)

    Bug fixes

    • Set size_divisor of Pad transform to 1 for static shape model. (#1049)
    • Fix LayerNorm shape issue when exporting to onnx with torch <= 1.12 (#1015)
    • Fix calibration error when converting model to TensorRT-int8 (#1050)
    • Synchronize cuda stream after inference with onnxruntime-gpu (#1053)
    • Add GatherTopk TensorRT plugin as a workaround to fix dynamic shape issue (#1033)
    • Fix RoiAlignFunction error for CoreML (#1029)
    • Resolve two-stage detector deployment error with CoreML (#1044)
    • Fix two-stage detector TensorRT deployment error with dynamic shape (#1046)
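
One way to read the Pad fix above: padding to a size divisor rounds each dimension up to the next multiple, so a divisor greater than 1 can alter the input shape, while a divisor of 1 leaves it unchanged. The sketch below shows only that arithmetic; `padded_size` is a hypothetical helper and not MMDeploy's Pad transform.

```python
def padded_size(height, width, size_divisor):
    """Round each dimension up to the nearest multiple of size_divisor."""
    if size_divisor < 1:
        raise ValueError("size_divisor must be a positive integer")
    # -(-a // b) is integer ceiling division.
    pad_h = -(-height // size_divisor) * size_divisor
    pad_w = -(-width // size_divisor) * size_divisor
    return pad_h, pad_w


print(padded_size(300, 500, 32))  # (320, 512): the shape grows
print(padded_size(300, 500, 1))   # (300, 500): the shape is unchanged
```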

    Document

    • Update supported backends table in README (#1109)
    • Correct examples in tutorial - how to develop TensorRT plugin (#1021)
    • Fix broken links and typos (#1078, #1025, #1061)

    Contributors

    @liu-mengyang @BrokenArrow1404 @jinwonkim93 @Qingrenn @JingweiZhang12 @ichitaka @Typiqally @lvhan028 @irexyc @tpoisonooo @lzhangzz @grimoire @AllentDan @hanrui1sensetime

    Source code(tar.gz)
    Source code(zip)
    mmdeploy-0.9.0-linux-x86_64-cuda10.2-tensorrt8.2.3.0.tar.gz(144.88 MB)
    mmdeploy-0.9.0-linux-x86_64-cuda11.1-tensorrt8.2.3.0.tar.gz(169.72 MB)
    mmdeploy-0.9.0-linux-x86_64-onnxruntime1.8.1.tar.gz(66.49 MB)
    mmdeploy-0.9.0-windows-amd64-cuda10.2-tensorrt8.2.3.0.zip(266.16 MB)
    mmdeploy-0.9.0-windows-amd64-cuda11.1-tensorrt8.2.3.0.zip(307.37 MB)
    mmdeploy-0.9.0-windows-amd64-onnxruntime1.8.1.zip(164.21 MB)
  • v0.8.0(Sep 7, 2022)

    Highlight

    • Support more platforms and devices: RISC-V, Apple M1, Huawei Ascend310 and Rockchip RK3588

    Features

    • Support more models on ONNX Runtime and TensorRT
      • mmdetection DETR (#924)
      • mmclassification Swin Transformer (#911)
      • mmdetection3d pointpillars (nus version) (#319)
    • Support more platforms and devices:
      • RISC-V via ncnn (#910)
      • Apple M1 (#760)
      • Huawei Ascend310 (#747)
      • Rockchip RK3588 (#865)
    • Add TorchScript SDK inference backend (#890)
    • Experimental support for fusing transformations in preprocess pipeline by CVFusion (#741)

    Improvements

    • Support multi-label classification in SDK (#950)

    • Add the following scripts to simplify mmdeploy installation for some scenarios: (#919)

      | script | OS version |
      | -- | -- |
      | build_ubuntu_x64_ncnn.py | 18.04/20.04 |
      | build_ubuntu_x64_ort.py | 18.04/20.04 |
      | build_ubuntu_x64_pplnn.py | 18.04/20.04 |
      | build_ubuntu_x64_torchscript.py | 18.04/20.04 |

    • Add scaled dot-product attention operator for TensorRT (#949)

    • Support model batch inference profiling (#868)

    # profile the latency of resnet18-tensorrt model with batch size 4
    python tools/profiler.py \
        configs/mmcls/classification_tensorrt_dynamic-224x224-224x224.py \
        ../mmclassification/configs/resnet/resnet18_8xb32_in1k.py \
        {/the/path/of/an/image/directory} \
        --model {work-dirs}/mmcls/resnet/trt/end2end.engine \
        --device cuda \
        --shape 224x224 \
        --num-iter 100 \
        --warmup 10 \
        --batch-size 4
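
When profiling with a batch size above 1, a per-batch timing can be turned into per-sample latency and throughput. The sketch below shows that arithmetic under the assumption that timings are collected per batch in milliseconds; `summarize_batch_latency` is a hypothetical helper and does not reproduce the profiler's actual implementation or output format.

```python
from statistics import mean


def summarize_batch_latency(batch_ms, batch_size, warmup=0):
    """Summarize per-batch timings (milliseconds), skipping warmup runs."""
    timed = batch_ms[warmup:]
    if not timed or batch_size <= 0:
        raise ValueError("need at least one timed run and batch_size > 0")
    per_batch = mean(timed)
    per_sample = per_batch / batch_size
    return {
        "batch_latency_ms": per_batch,
        "sample_latency_ms": per_sample,
        "samples_per_second": 1000.0 / per_sample,
    }


# Made-up timings: the first run is treated as warmup and discarded.
stats = summarize_batch_latency([30.0, 22.0, 20.0, 18.0], batch_size=4, warmup=1)
print(stats)
```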
    

    Bug fixes

    • Fix CI errors (#985, #983, #977, #987, #966, #945)
    • Fix missing sqrt in PAAHead (#984)
    • Fix nms_rotated logic when no bbox is detected (#976)
    • Fix rewrite for torch.Tensor.__setitem__ in some corner cases (#964, #941)
    • Disable ONNX optimizer when converting model to ncnn (#961)
    • Fix regression test (#958)
    • Disable cublaslt for CUDA 10.2 (#947)
    • Stop sorting dataset by default & set test_mode for mmdet pipelines (#920)
    • Resolve the issue (#909) - ValueError: cpu is invalid for the backend tensorrt. when exporting SDK meta info (#912)
    • Validate the device id when the inference backend is TensorRT or OpenVINO (#886)
    • Fix mmdeploy_pplnn_net build error when target device is CPU (#896)
    • Replace adaptive_avg_pool2d with avg_pool2d to support exporting ONNX with dynamic shape (#857)

    Known issues

    • DETR deployment failed both via ONNX Runtime and TensorRT (#1011, pytorch 84563)

    Contributors

    @OldDreamInWind @liu-mengyang @gy-7 @Groexhy @munhou @miraclezqc @VVsssssk @hanrui1sensetime @tpoisonooo @grimoire @irexyc @RunningLeon @AllentDan @lzhangzz @lvhan028

    Source code(tar.gz)
    Source code(zip)
    mmdeploy-0.8.0-linux-x86_64-cuda10.2-tensorrt8.2.3.0.tar.gz(142.46 MB)
    mmdeploy-0.8.0-linux-x86_64-cuda11.1-tensorrt8.2.3.0.tar.gz(167.29 MB)
    mmdeploy-0.8.0-linux-x86_64-onnxruntime1.8.1.tar.gz(64.18 MB)
    mmdeploy-0.8.0-windows-amd64-cuda10.2-tensorrt8.2.3.0.zip(127.41 MB)
    mmdeploy-0.8.0-windows-amd64-cuda11.1-tensorrt8.2.3.0.zip(194.16 MB)
    mmdeploy-0.8.0-windows-amd64-onnxruntime1.8.1.zip(16.19 MB)
  • v0.7.0(Aug 4, 2022)

    Highlight

    • Support SNPE (#789)
    • Add C++ API for SDK (#831)

    Features

    • Support SNPE (#789)
    • Add C++ API for SDK (#831)
    • Support MMRotate model with le135 angle format (#788)
    • Support RoI Transformer and Gliding Vertex model deployment from MMRotate (#713, #650)
    • Add inference latency test script tools/profile.py (#655). Here is an example to profile TensorRT_fp32-resnet18 inference latency:
    python tools/profile.py \
        configs/mmcls/classification_tensorrt_dynamic-224x224-224x224.py \
        ../mmclassification/configs/resnet/resnet18_8xb32_in1k.py \
        ../mmdetection/demo \
        --model work-dirs/mmcls/resnet/trt/end2end.engine \
        --device cuda \
        --shape 224x224 \
        --num-iter 100 \
        --warmup 10
    

    Improvements

    • Optimize prebuilt process for Python SDK (#810)
    • Upgrade ppl.nn and ppl.cv to v0.8.1 and v0.7.0 respectively (#793, #564)
    • Support batch image test in test script test.py (#829)
    • Install onnx optimizer by setuptools instead of cmake build (#690, #811, #843)
    • Add SDK code coverage (#808)
    • Support kwargs in SDK Python bindings (#794, #844, #852)
    • Support building SDK into a single library by enabling MMDEPLOY_BUILD_SDK_MONOLITHIC (#806)
    • Add a new option MMDEPLOY_BUILD_EXAMPLES to build and install SDK examples (#822)
    • Reduce log verbosity and improve error reporting (#755)
    • Upgrade GPU Dockerfile to use TensorRT 8.2.4.2 (#706)
    • Optimize ONNX graph
      • Add a function rewriter to torch.Tensor.__setitem__, eliminating almost 80% nodes for x[:,:,:H,:W] = y onnx export (#704)
      • Add CommonSubgraphElimination onnx pass (#647)
    • [BC Breaking] Standardize C API (#634)
      • Prefix all structs with mmdeploy_ and move all header files into the mmdeploy folder.
    • Rename onnx2ncnn to mmdeploy_onnx2ncnn (#694)

    Bug fixes

    • Fix build error on macOS platform (#762)
    • Fix torch.triu function rewriter error when exporting to onnx (#792)
    • Resolve Cascade R-CNN, YOLOX and SATRN deployment failure (#787, #758, #753)
    • Fix check_env.py about checking whether custom ops are available (#785)
    • Fix export for TopK operator in PyTorch 1.12 (#715)
    • Fix export for padding operators in PyTorch<1.10 (#754)
    • Add default topk in SDK model meta info when it is not explicitly specified in mmclassification model configs (#702)
    • Fix SingleRoIExtractor for TorchScript backend (#724)
    • Fix export for DistancePointBBoxCoder.decode (#687)
    • Fix wrong backend type when doing calibration (#719)
    • Set exit code to 1 when an error happens (#715)
    • Fix build error on android platform (#698)
    • Pass img_metas while exporting to onnx (#681, #700, #707)

    Document

    • Update build document for android platform (#817)
    • Fix rendering issues of get_started documents in readthedocs (#740)
    • Add prebuilt package usage on Windows platform (#816)
    • Simplify get_started guide (#813)

    Contributors

    @nijkah @dwSun @lvhan028 @lzhangzz @irexyc @RunningLeon @grimoire @tpoisonooo @AllentDan @hanrui1sensetime

    Source code(tar.gz)
    Source code(zip)
    mmdeploy-0.7.0-linux-x86_64-cuda10.2-tensorrt8.2.3.0.tar.gz(141.68 MB)
    mmdeploy-0.7.0-linux-x86_64-cuda11.1-tensorrt8.2.3.0.tar.gz(166.79 MB)
    mmdeploy-0.7.0-linux-x86_64-onnxruntime1.8.1.tar.gz(63.89 MB)
    mmdeploy-0.7.0-windows-amd64-cuda10.2-tensorrt8.2.3.0.zip(281.51 MB)
    mmdeploy-0.7.0-windows-amd64-cuda11.1-tensorrt8.2.3.0.zip(299.99 MB)
    mmdeploy-0.7.0-windows-amd64-onnxruntime1.8.1.zip(157.12 MB)
  • v0.6.0(Jun 30, 2022)

    Highlight

    • Support Swin Transformer deployment with TensorRT and ONNX Runtime (#652)
    • Support Segmenter deployment with all backends (#587)
    • Add Java API for SDK (#563)

    Features

    • Support Swin Transformer deployment with TensorRT and ONNX Runtime (#652)
    • Add Java API for SDK (#563)
    • Support Segmenter deployment with all backends (#587)
    • Support two-stage rotated detector deployment with TensorRT (#530)

    Improvements

    • Add onnx pass to fuse select-assign graph pattern (#589)
    • Add more CircleCI workflows on Linux, Windows and Linux-GPU platforms (#368)
    • Add documentation and sample code for model partitioning (#599)
    • Add GridPriorsTRT plugin to speed up TensorRT anchor generation from 155us to 13us (#646)
    • Add MMDEPLOY_TASKS variable in cmake scripts to remove duplicated code (#606)
    • Improve ncnn patch embed (#592)
    • Support compute capability 87 for Jetson Orin (#601)
    • Adjust csrc structure (#594)

    Bug fixes

    • Add build to TensorRT plugin candidate path list (#672)
    • Fix missing "image shape" when exporting mmpose models (#667)
    • Fix ncnn unittest error (#626)
    • Fix bugs when deploying ShuffleNetV2 with TensorRT (#645)
    • Relax mmcls version constraint (#653)
    • Eliminate illegal memory access for object detector C# API (#613)
    • Add dim param for Tensor::Squeeze (#603)
    • Fix link missed issue in index.rst (#607)
    • Add support for MMOCR 0.5+ (#604)
    • Fix output tensor shape of ncnn backend (#605)

    Documentation

    • Fix errors and typos in user documents (#676, #675, #655, #654, #621, #588, #586)
    • Update deployment benchmark for ViT (#624)
    • Replace markdown lint with mdformat and configure myst-parser (#610)

    Contributors

    @zambranohally @bgsuello @triple-Mu @DrRyanHuang @liuqc11 @Yosshi999 @zytx121 @RunningLeon @AllentDan @lzhangzz @irexyc @grimoire @lvhan028 @hanrui1sensetime @tpoisonooo

    Source code(tar.gz)
    Source code(zip)
    mmdeploy-0.6.0-linux-x86_64-cuda11.1-tensorrt8.2.3.0.tar.gz(147.20 MB)
    mmdeploy-0.6.0-linux-x86_64-onnxruntime1.8.1.tar.gz(48.19 MB)
    mmdeploy-0.6.0-windows-amd64-cuda10.2-tensorrt8.2.3.0.zip(197.04 MB)
    mmdeploy-0.6.0-windows-amd64-cuda11.1-tensorrt8.2.3.0.zip(221.60 MB)
    mmdeploy-0.6.0-windows-amd64-onnxruntime1.8.1.zip(69.23 MB)
  • v0.5.0(Jun 9, 2022)

    Highlight

    • Provide prebuilt packages since v0.5.0
    • Decouple pytorch2onnx and onnx2backends
    • Support text detection models PANet, PSENet and DBNet, with CUDA accelerated postprocessing in SDK
    • Support MMRotate

    Features

    • Add prebuild tools (#545, #347)
    • Experimental executor support in SDK (#497)
    • Support ViT on ncnn (#477, #403)
    • Support LiteHRNet on ncnn (#316)
    • Support more text detection models PANet, PSENet and DBNet, with CUDA accelerated postprocessing in SDK (#446, #526, #534)
    • Add C# API for SDK (#388, #535)
    • Support ncnn quantization (#476)
    • Support RepPoints on TensorRT (#457)
    • Support MMRotate on ONNX Runtime and TensorRT (#277, #312, #422, #450, #428, #473)
    • Support MMRazor (#220, #467)

    Improvements

    • Remove spdlog manual installation but still keep it as an option (#423, #544). Users can turn on the following option to use the external spdlog:
    cmake .. -DMMDEPLOY_SPDLOG_EXTERNAL=ON
    
    • Add SDK python demos (#554)
    • Add ONNX passes support (#390)
    • Decouple pytorch2onnx and onnx2backends (#529, #540)
    • Add scripts and configs to test metrics of deployed model with all inference backend (#425, #302, #551, #542)
    • Support MDCN and DeformConv TensorRT FP16 (#503, #468)
    • Add interactive build script for Linux and NVIDIA platform (#399)
    • Optimize global average pooling when exporting ONNX (#478)
    • Refactor onnx2ncnn, add test cases and simplify code (#436)
    • Remove expand operation from mmdet rewrite (#371)

    Bug fixes

    • Update CMake scripts to fix building problems (#544, #553)
    • Make ONNXRuntime wrapper work both for cpu and cuda execution (#438, #532)
    • Fix PSPNet-TorchScript conversion error (#538)
    • Resolve the incompatible issue when upgrading MMPose from v0.25.0 to v0.26.0 (#518, #527)
    • Fix mismatched device issue when testing Mask R-CNN deployed model (#511)
    • Remove redundant resize in mmseg EncoderDecoder rewrite (#480)
    • Fix display bugs on headless devices (#451)
    • Fix MMDet3D pillarencode deployment failure (#331)
    • Make the latest spdlog compatible (#423)
    • Fix CI (#462, #447, #440, #426, #441)
    • Fix a bug that caused ONNX export to fail with static shape and batch size > 1 (#501)
    • Make --work-dir default to $pwd in tools/deploy.py (#483)

    Documentation

    • Fix user document errors, reorganize them, update README and rewrite the GET_STARTED chapters (#418, #482, #509, #531, #547, #543)
    • Rewrite the get_started for Jetson platforms (#484, #449, #415, #381)
    • Fix APIs rendering failure in readthedocs (#443)
    • Remove '' in API docstring (#495)
    • More tutorials in Chinese are checked in - Tutorial 05: ONNX Model Editing and Tutorial 04: onnx custom op (#508, #517)

    Contributors

    @sanjaypavo @PeterH0323 @tehkillerbee @zytx121 @triple-Mu @zhiqwang @gyf304 @lakshanthad @Dchaoqun @zhouzaida @NagatoYuki0943 @VVsssssk @irexyc @RunningLeon @hanrui1sensetime @lzhangzz @grimoire @tpoisonooo @AllentDan @SingleZombie

    Source code(tar.gz)
    Source code(zip)
    mmdeploy-0.5.0-linux-x86_64-cuda10.2-tensorrt8.2.3.0.tar.gz(95.71 MB)
    mmdeploy-0.5.0-linux-x86_64-cuda11.1-tensorrt8.2.3.0.tar.gz(105.07 MB)
    mmdeploy-0.5.0-linux-x86_64-onnxruntime1.8.1.tar.gz(49.96 MB)
    mmdeploy-0.5.0-windows-amd64-cuda10.2-tensorrt8.2.3.0.zip(173.17 MB)
    mmdeploy-0.5.0-windows-amd64-cuda11.1-tensorrt8.2.3.0.zip(219.82 MB)
    mmdeploy-0.5.0-windows-amd64-onnxruntime1.8.1.zip(69.20 MB)
  • v0.4.1(Apr 29, 2022)

    Improvements

    • Add IPython notebook tutorial (#234)
    • Support detecting TensorRT from CUDA_TOOLKIT_ROOT_DIR (#357)
    • Build onnxruntime backend in GPU dockerfile (#366)
    • Add CircleCI workflow for linting (#348)
    • Support saving results when testing the deployed model of MMEdit (#336)
    • Support GPU postprocessing for instance segmentation (#276)

    Bug fixes

    • Make empty bounding box list allowed in text recognizer and pose detector C API (#310, #396)
    • Fix the logic of extracting model name from config (#394)
    • Fix feature test for std::source_location (#416)
    • Add missing codegen for sm_53 to support Jetson Nano (#407)
    • Fix crash caused by accessing the wrong tensor in segmentor C API (#363)
    • Fix reading mat type from the wrong image in a batch (#362)
    • Fix missing binary flag when saving temp OpenVINO model (#353)
    • Fix Windows build for pose demo (#307)

    Documents

    • Refine documents by fixing typos, correcting build commands, and removing redundant doc tree (#352, #360, #378, #398)
    • Add a tutorial about torch2onnx in Chinese (#365)

    Contributors

    @irexyc @VVsssssk @AllentDan @lzhangzz @PeterH0323 @RunningLeon @zly19540609 @triple-Mu @grimoire @hanrui1sensetime @SingleZombie @Adenialzz @tpoisonooo @lvhan028 @xizi

    Source code(tar.gz)
    Source code(zip)
  • v0.4.0(Apr 1, 2022)

    Features

    • Support MMPose model inference in SDK: HRNet, LiteHRNet and MSPN
    • Support MMDetection3D: PointPillars and CenterPoint(pillar)
    • Support Android platform to benefit the development of Android apps
    • Support fcn_unet deployment with dynamic shape
    • Support TorchScript

    Improvements

    • Optimize TRTMultiLevelRoiAlign plugin
    • Remove RoiAlign plugin for ONNXRuntime
    • Add DCN TensorRT plugin
    • Update pad logic in detection heads
    • Refactor the rewriter module of Model Converter
    • Suppress CMAKE_CUDA_ARCHITECTURES warnings
    • Update cmake scripts to ensure that the thirdparty packages are relocatable

    Bug fixes

    • Fix the crash on the headless installation
    • Correct the deployment configs for MMSegmentation
    • Optimize preprocess module and fix the potential use-after-free issue
    • Resolve the compatibility with torch 1.11
    • Fix the errors when deploying yolox model
    • Fix the errors that occurred during docker build

    Documents

    • Reorganize the build documents. Add more details about how to build MMDeploy on Linux, Windows and Android platforms
    • Publish two chapters about the knowledge of model deployment
    • Update the supported model list, including MMSegmentation, MMPose and MMDetection3D
    • Translate the tutorial of "How to support new backends" into Chinese
    • Update the FAQ

    Contributors

    @irexyc @lvhan028 @RunningLeon @hanrui1sensetime @AllentDan @grimoire @lzhangzz @SemyonBevzuk @VVsssssk @SingleZombie @raykindle @yydc-0 @haofanwang @LJoson @PeterH0323

    Source code(tar.gz)
    Source code(zip)
  • v0.3.0(Feb 28, 2022)

    Features

    • Support Windows platform. (#106)
    • Support mmpose codebase. (#94)
    • Support GFL model from mmdetection. (#124)
    • Support exporting hardsigmoid in torch<=1.8. (#169)

    Improvements

    • Support mmocr v0.4+. (#115)
    • Upgrade isort in pre-commit config. (#141)
    • Optimize delta2bboxes. (#152)

    Bug fixes

    • Fix onnxruntime wrapper for gpu inference. (#123)
    • Fix CI. (#144)
    • Fix tests for OpenVINO with python 3.6. (#125)
    • Add TensorRT version check. (#133)
    • Fix a type error when computing scale_factor in rewriting interpolate. (#185)

    Documents

    • Add Chinese documents How_to_support_new_model.md and How_to_write_config.md (#147, #137)

    Contributors

    A total of 19 developers contributed to this release.

    @grimoire @RunningLeon @AllentDan @lvhan028 @hhaAndroid @SingleZombie @lzhangzz @hanrui1sensetime @Vvsssssk @SemyonBevzuk @ypwhs @TheSeriousProgrammer @matrixgame2018 @tehkillerbee @uniyushu @haofanwang @ypwhs @zhouzaida @q3394101

    Source code(tar.gz)
    Source code(zip)
  • v0.2.0(Jan 28, 2022)

    Features

    • Support Nvidia Jetson deployment. (Nano, TX2, Xavier)
    • Add Python interface for SDK inference. (#27)
    • Support yolox on ncnn. (#29)
    • Support segmentation model UNet. (#77)
    • Add docker files. (#67)

    Improvements

    • Add coverage report, CI to GitHub repository. (#16, #34, #35)
    • Refactor the config utilities. (#12, #36)
    • Remove redundant copy operation when converting model. (#61)
    • Simplify single batch NMS. (#99)

    Documents

    • Our English and Chinese (简体中文) documents are now available on readthedocs
    • Benchmark and tutorial for Nvidia Jetson Nano. (#71)
    • Fix docstring, links in documents. (#18, #32, #60, #84)
    • More documents for TensorRT and OpenVINO. (#96, #102)

    Bug fixes

    • Avoid outputting empty tensor in NMS for ONNX Runtime. (#42)
    • Fix TensorRT 7 SSD. (#49)
    • Fix mmseg dynamic shape. (#57)
    • Fix bugs about pplnn. (#40, #74)

    Contributors

    A total of 14 developers contributed to this release.

    @grimoire @RunningLeon @AllentDan @SemyonBevzuk @lvhan028 @hhaAndroid @Stephenfang51 @SingleZombie @lzhangzz @hanrui1sensetime @VVsssssk @zhiqwang @tehkillerbee @Echo-minn

    Source code(tar.gz)
    Source code(zip)
  • v0.1.0(Dec 27, 2021)

    Major Features

    • Fully support OpenMMLab models

      We provide a unified model deployment toolbox for the codebases in OpenMMLab. The supported codebases are listed below, and more will be added in the future

      • [x] MMClassification (== 0.19.0)
      • [x] MMDetection (== 2.19.0)
      • [x] MMSegmentation (== 0.19.0)
      • [x] MMEditing (== 0.11.0)
      • [x] MMOCR (== 0.3.0)
    • Multiple inference backends are available

      Models can be exported and run in different backends. The following ones are supported, and more will be taken into consideration

      • [x] ONNX Runtime (>= 1.8.0)
      • [x] TensorRT (>= 7.2)
      • [x] PPLNN (== 0.3.0)
      • [x] ncnn (== 20211208)
      • [x] OpenVINO (2021.4 LTS)
    • Efficient and highly scalable SDK Framework by C/C++

      All kinds of modules in the SDK are extensible, such as Transform for image processing, Net for neural network inference, Module for postprocessing and so on.

    Contributors

    A total of 11 developers contributed to this release.

    @grimoire @lvhan028 @AllentDan @VVsssssk @SemyonBevzuk @lzhangzz @RunningLeon @SingleZombie @del-zhenwu @zhouzaida @hanrui1sensetime

    Source code(tar.gz)
    Source code(zip)