利用yolov5和TensorRT从0到1实现目标检测的模型训练到模型部署全过程

Overview

写在前面

利用TensorRT加速推理速度是以时间换取精度的做法,意味着在推理速度上升的同时将会有精度的下降,不过不用太担心,精度下降微乎其微。此外,要有NVIDIA显卡,经测试,CUDA10.2可以支持20系列显卡及以下,30系列显卡需要CUDA11.x的支持,并且目前有bug。

默认你已经完成了 yolov5的训练过程并得到了.pt模型权值文件。

本文目的仅是带着走通流程。

注意要对应yolov5和tensorrtx的版本。

  • ./yolov5包含yolov5训练以及模型初转化阶段的代码
  • ./model_process是将.wts模型转化为.engine模型的代码
  • ./detector是利用.engine模型进行前向推理阶段的代码

我的运行环境(注意OpenCV要选择适合你的visual studio的版本等问题):

win10

Visual Studio 2019

NVIDIA GeForce RTX 2060

opencv-3.4.3-vc14_vc15

cuda_10.2.89_441.22_win10

cudnn-10.2-windows10-x64-v7.6.5.32

TensorRT-7.0.0.11.Windows10.x86_64.cuda-10.2.cudnn7.6

cmake-3.21.2-windows-x86_64

上述环境的百度云(测试10、20系列可用):

链接:https://pan.baidu.com/s/1AyaloTzLap8X2hsJBvyeBw
提取码:dwr7

其他版本下载地址:

CUDA cudnn TensorRT CMake OpenCV

环境安装:

1、安装OpenCV并配置好环境变量

2、安装CUDA

一路默认。一般的安装路径为:C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2

3、安装cudnn和TensorRT

cudnn和TensorRT的安装仅是将下载的对应版本的压缩包解压并复制*.h、*.lib、*.dll到CUDA的安装路径。

1 将cuDNN压缩包解压

2 将cuda\bin中的文件复制到 C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\bin

3 将cuda\include中的文件复制到 C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\include

4 将cuda\lib中的文件复制到 C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\lib

另外,

1 将TensorRT压缩包解压

2 将 TensorRT-7.0.0.11\include中头文件复制到C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\include

3 将TensorRT-7.0.0.11\lib中所有lib文件复制到C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\lib\x64

4 将TensorRT-7.0.0.11\lib中所有dll文件复制到C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\bin

4、安装CMake软件备用

一、将训练阶段得到的.pt模型转化为.wts中间模型

把tensorrtx里面的yolov5\gen_wts.py加入到yolov5里面,执行

python gen_wts.py -w [.pt权值文件路径] 

runs\train\exp\weights\best.pt为训练过程生成的.pt模型,生成的best.wts会保存到同目录下,此best.wts待会会用到。

cuda版本每个电脑不一样

配置好的tensorrtx,包括Cmakelist.txt的设定以及dirent.h的配置。

若使用原作者的请参照tensorrtx源码https://github.com/wang-xinyu/tensorrtx ,配置过程中会遇到一些问题,挨个解决,问题不大。

1、在yolov5目录下新建build文件夹

2、修改CMakelist.txt

add_definitions(-DAPI_EXPORTS)

3、打开CMake

​​ generate后关闭

4、yolov5/include/dirent.h

​​ 也可使用我的配置好的

二、利用Cmake软件创建VS工程

修改CMakeLists.txt中此处为你的opencv安装路径。

配置好上方两个目录之后,点击Configure,根据你的环境选择配置,

点击Gnerate,警告可忽视,

现在关闭Cmake即可。

三、wts转化为engine

VS打开刚刚在bulid目录下创建的工程。

build处vs打开,生成

问题:我的模型只识别一个类,需要更改


cd {tensorrtx}/yolov5/

// update CLASS_NUM in yololayer.h if your model is trained on custom dataset

为1

生成项目。

把之前生成的best.wts复制到build\release目录里面

cmd里面运行:

.\test.exe -s .\best.wts best.engine s

运行成功在同文件夹下面会得到best.engine转换后的文件。之后的推理过程使用的都是这个文件。

测试:

.\yolov5.exe -d best.engine .\samples

至此,流程走完。

如果想要进一步封装,可以按照我的示例。

注释掉yolov5.cpp,并取消 几个文件的注释。重新生成项目。按照你的需求更改。

Owner
Helium
Helium
Improving Deep Network Debuggability via Sparse Decision Layers

Improving Deep Network Debuggability via Sparse Decision Layers This repository contains the code for our paper: Leveraging Sparse Linear Layers for D

Madry Lab 35 Nov 14, 2022
This is the official pytorch implementation for the paper: Instance Similarity Learning for Unsupervised Feature Representation.

ISL This is the official pytorch implementation for the paper: Instance Similarity Learning for Unsupervised Feature Representation, which is accepted

19 May 04, 2022
object detection; robust detection; ACM MM21 grand challenge; Security AI Challenger Phase VII

赛题背景 在商品知识产权领域,知识产权体现为在线商品的设计和品牌。不幸的是,在每一天,存在着非法商户通过一些对抗手段干扰商标识别来逃避侵权,这带来了很高的知识产权风险和财务损失。为了促进先进的多媒体人工智能技术的发展,以保护企业来之不易的创作和想法免受恶意使用和剽窃,因此提出了鲁棒性标识检测挑战赛

65 Dec 22, 2022
Forecasting with Gradient Boosted Time Series Decomposition

ThymeBoost ThymeBoost combines time series decomposition with gradient boosting to provide a flexible mix-and-match time series framework for spicy fo

131 Jan 08, 2023
WHENet - ONNX, OpenVINO, TFLite, TensorRT, EdgeTPU, CoreML, TFJS, YOLOv4/YOLOv4-tiny-3L

HeadPoseEstimation-WHENet-yolov4-onnx-openvino ONNX, OpenVINO, TFLite, TensorRT, EdgeTPU, CoreML, TFJS, YOLOv4/YOLOv4-tiny-3L 1. Usage $ git clone htt

Katsuya Hyodo 49 Sep 21, 2022
A minimal implementation of Gaussian process regression in PyTorch

pytorch-minimal-gaussian-process In search of truth, simplicity is needed. There exist heavy-weighted libraries, but as you know, we need to go bare b

Sangwoong Yoon 38 Nov 25, 2022
Forest R-CNN: Large-Vocabulary Long-Tailed Object Detection and Instance Segmentation (ACM MM 2020)

Forest R-CNN: Large-Vocabulary Long-Tailed Object Detection and Instance Segmentation (ACM MM 2020) Official implementation of: Forest R-CNN: Large-Vo

Jialian Wu 54 Jan 06, 2023
Minimal diffusion models - Minimal code and simple experiments to play with Denoising Diffusion Probabilistic Models (DDPMs)

Minimal code and simple experiments to play with Denoising Diffusion Probabilist

Rithesh Kumar 16 Oct 06, 2022
OptaPlanner wrappers for Python. Currently significantly slower than OptaPlanner in Java or Kotlin.

OptaPy is an AI constraint solver for Python to optimize the Vehicle Routing Problem, Employee Rostering, Maintenance Scheduling, Task Assignment, School Timetabling, Cloud Optimization, Conference S

OptaPy 211 Jan 02, 2023
This repository contains the code for "Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based Bias in NLP".

Self-Diagnosis and Self-Debiasing This repository contains the source code for Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based

Timo Schick 62 Dec 12, 2022
This repository is the offical Pytorch implementation of ContextPose: Context Modeling in 3D Human Pose Estimation: A Unified Perspective (CVPR 2021).

Context Modeling in 3D Human Pose Estimation: A Unified Perspective (CVPR 2021) Introduction This repository is the offical Pytorch implementation of

37 Nov 21, 2022
LyaNet: A Lyapunov Framework for Training Neural ODEs

LyaNet: A Lyapunov Framework for Training Neural ODEs Provide the model type--config-name to train and test models configured as those shown in the pa

Ivan Dario Jimenez Rodriguez 21 Nov 21, 2022
Code of U2Fusion: a unified unsupervised image fusion network for multiple image fusion tasks, including multi-modal, multi-exposure and multi-focus image fusion.

U2Fusion Code of U2Fusion: a unified unsupervised image fusion network for multiple image fusion tasks, including multi-modal (VIS-IR, medical), multi

Han Xu 129 Dec 11, 2022
This is the offical website for paper ''Category-consistent deep network learning for accurate vehicle logo recognition''

The Pytorch Implementation of Category-consistent deep network learning for accurate vehicle logo recognition This is the offical website for paper ''

Wanglong Lu 28 Oct 29, 2022
Streamlit Tutorial (ex: stock price dashboard, cartoon-stylegan, vqgan-clip, stylemixing, styleclip, sefa)

Streamlit Tutorials Install pip install streamlit Run cd [directory] streamlit run app.py --server.address 0.0.0.0 --server.port [your port] # http:/

Jihye Back 30 Jan 06, 2023
CoANet: Connectivity Attention Network for Road Extraction From Satellite Imagery

CoANet: Connectivity Attention Network for Road Extraction From Satellite Imagery This paper (CoANet) has been published in IEEE TIP 2021. This code i

Jie Mei 53 Dec 03, 2022
Count GitHub Stars ⭐

Count GitHub Stars per Day ⭐ Track GitHub stars per day over a date range to measure the open-source popularity of different repositories. Requirement

Ultralytics 20 Nov 20, 2022
Code repo for "RBSRICNN: Raw Burst Super-Resolution through Iterative Convolutional Neural Network" (Machine Learning and the Physical Sciences workshop in NeurIPS 2021).

RBSRICNN: Raw Burst Super-Resolution through Iterative Convolutional Neural Network An official PyTorch implementation of the RBSRICNN network as desc

Rao Muhammad Umer 6 Nov 14, 2022
[CVPR 2022] Thin-Plate Spline Motion Model for Image Animation.

[CVPR2022] Thin-Plate Spline Motion Model for Image Animation Source code of the CVPR'2022 paper "Thin-Plate Spline Motion Model for Image Animation"

yoyo-nb 1.4k Dec 30, 2022
Code for project: "Learning to Minimize Remainder in Supervised Learning".

Learning to Minimize Remainder in Supervised Learning Code for project: "Learning to Minimize Remainder in Supervised Learning". Requirements and Envi

Yan Luo 0 Jul 18, 2021