PyTorch ,ONNX and TensorRT implementation of YOLOv4

Overview

Pytorch-YOLOv4

A minimal PyTorch implementation of YOLOv4.

├── README.md
├── dataset.py            dataset
├── demo.py               demo to run pytorch --> tool/darknet2pytorch
├── demo_darknet2onnx.py  tool to convert into onnx --> tool/darknet2pytorch
├── demo_pytorch2onnx.py  tool to convert into onnx
├── models.py             model for pytorch
├── train.py              train models.py
├── cfg.py                cfg.py for train
├── cfg                   cfg --> darknet2pytorch
├── data            
├── weight                --> darknet2pytorch
├── tool
│   ├── camera.py           a demo camera
│   ├── coco_annotation.py       coco dataset generator
│   ├── config.py
│   ├── darknet2pytorch.py
│   ├── region_loss.py
│   ├── utils.py
│   └── yolo_layer.py

image

0. Weights Download

0.1 darknet

0.2 pytorch

you can use darknet2pytorch to convert it yourself, or download my converted model.

1. Train

use yolov4 to train your own data

  1. Download weight

  2. Transform data

    For coco dataset,you can use tool/coco_annotation.py.

    # train.txt
    image_path1 x1,y1,x2,y2,id x1,y1,x2,y2,id x1,y1,x2,y2,id ...
    image_path2 x1,y1,x2,y2,id x1,y1,x2,y2,id x1,y1,x2,y2,id ...
    ...
    ...
    
  3. Train

    you can set parameters in cfg.py.

     python train.py -g [GPU_ID] -dir [Dataset direction] ...
    

2. Inference

2.1 Performance on MS COCO dataset (using pretrained DarknetWeights from https://github.com/AlexeyAB/darknet)

ONNX and TensorRT models are converted from Pytorch (TianXiaomo): Pytorch->ONNX->TensorRT. See following sections for more details of conversions.

  • val2017 dataset (input size: 416x416)
Model type AP AP50 AP75 APS APM APL
DarkNet (YOLOv4 paper) 0.471 0.710 0.510 0.278 0.525 0.636
Pytorch (TianXiaomo) 0.466 0.704 0.505 0.267 0.524 0.629
TensorRT FP32 + BatchedNMSPlugin 0.472 0.708 0.511 0.273 0.530 0.637
TensorRT FP16 + BatchedNMSPlugin 0.472 0.708 0.511 0.273 0.530 0.636
  • testdev2017 dataset (input size: 416x416)
Model type AP AP50 AP75 APS APM APL
DarkNet (YOLOv4 paper) 0.412 0.628 0.443 0.204 0.444 0.560
Pytorch (TianXiaomo) 0.404 0.615 0.436 0.196 0.438 0.552
TensorRT FP32 + BatchedNMSPlugin 0.412 0.625 0.445 0.200 0.446 0.564
TensorRT FP16 + BatchedNMSPlugin 0.412 0.625 0.445 0.200 0.446 0.563

2.2 Image input size for inference

Image input size is NOT restricted in 320 * 320, 416 * 416, 512 * 512 and 608 * 608. You can adjust your input sizes for a different input ratio, for example: 320 * 608. Larger input size could help detect smaller targets, but may be slower and GPU memory exhausting.

height = 320 + 96 * n, n in {0, 1, 2, 3, ...}
width  = 320 + 96 * m, m in {0, 1, 2, 3, ...}

2.3 Different inference options

  • Load the pretrained darknet model and darknet weights to do the inference (image size is configured in cfg file already)

    python demo.py -cfgfile <cfgFile> -weightfile <weightFile> -imgfile <imgFile>
  • Load pytorch weights (pth file) to do the inference

    python models.py <num_classes> <weightfile> <imgfile> <IN_IMAGE_H> <IN_IMAGE_W> <namefile(optional)>
  • Load converted ONNX file to do inference (See section 3 and 4)

  • Load converted TensorRT engine file to do inference (See section 5)

2.4 Inference output

There are 2 inference outputs.

  • One is locations of bounding boxes, its shape is [batch, num_boxes, 1, 4] which represents x1, y1, x2, y2 of each bounding box.
  • The other one is scores of bounding boxes which is of shape [batch, num_boxes, num_classes] indicating scores of all classes for each bounding box.

Until now, still a small piece of post-processing including NMS is required. We are trying to minimize time and complexity of post-processing.

3. Darknet2ONNX

  • This script is to convert the official pretrained darknet model into ONNX

  • Pytorch version Recommended:

    • Pytorch 1.4.0 for TensorRT 7.0 and higher
    • Pytorch 1.5.0 and 1.6.0 for TensorRT 7.1.2 and higher
  • Install onnxruntime

    pip install onnxruntime
  • Run python script to generate ONNX model and run the demo

    python demo_darknet2onnx.py <cfgFile> <weightFile> <imageFile> <batchSize>

3.1 Dynamic or static batch size

  • Positive batch size will generate ONNX model of static batch size, otherwise, batch size will be dynamic
    • Dynamic batch size will generate only one ONNX model
    • Static batch size will generate 2 ONNX models, one is for running the demo (batch_size=1)

4. Pytorch2ONNX

  • You can convert your trained pytorch model into ONNX using this script

  • Pytorch version Recommended:

    • Pytorch 1.4.0 for TensorRT 7.0 and higher
    • Pytorch 1.5.0 and 1.6.0 for TensorRT 7.1.2 and higher
  • Install onnxruntime

    pip install onnxruntime
  • Run python script to generate ONNX model and run the demo

    python demo_pytorch2onnx.py <weight_file> <image_path> <batch_size> <n_classes> <IN_IMAGE_H> <IN_IMAGE_W>

    For example:

    python demo_pytorch2onnx.py yolov4.pth dog.jpg 8 80 416 416

4.1 Dynamic or static batch size

  • Positive batch size will generate ONNX model of static batch size, otherwise, batch size will be dynamic
    • Dynamic batch size will generate only one ONNX model
    • Static batch size will generate 2 ONNX models, one is for running the demo (batch_size=1)

5. ONNX2TensorRT

  • TensorRT version Recommended: 7.0, 7.1

5.1 Convert from ONNX of static Batch size

  • Run the following command to convert YOLOv4 ONNX model into TensorRT engine

    trtexec --onnx=<onnx_file> --explicitBatch --saveEngine=<tensorRT_engine_file> --workspace=<size_in_megabytes> --fp16
    • Note: If you want to use int8 mode in conversion, extra int8 calibration is needed.

5.2 Convert from ONNX of dynamic Batch size

  • Run the following command to convert YOLOv4 ONNX model into TensorRT engine

    trtexec --onnx=<onnx_file> \
    --minShapes=input:<shape_of_min_batch> --optShapes=input:<shape_of_opt_batch> --maxShapes=input:<shape_of_max_batch> \
    --workspace=<size_in_megabytes> --saveEngine=<engine_file> --fp16
  • For example:

    trtexec --onnx=yolov4_-1_3_320_512_dynamic.onnx \
    --minShapes=input:1x3x320x512 --optShapes=input:4x3x320x512 --maxShapes=input:8x3x320x512 \
    --workspace=2048 --saveEngine=yolov4_-1_3_320_512_dynamic.engine --fp16

5.3 Run the demo

python demo_trt.py <tensorRT_engine_file> <input_image> <input_H> <input_W>
  • This demo here only works when batchSize is dynamic (1 should be within dynamic range) or batchSize=1, but you can update this demo a little for other dynamic or static batch sizes.

  • Note1: input_H and input_W should agree with the input size in the original ONNX file.

  • Note2: extra NMS operations are needed for the tensorRT output. This demo uses python NMS code from tool/utils.py.

6. ONNX2Tensorflow

7. ONNX2TensorRT and DeepStream Inference

  1. Compile the DeepStream Nvinfer Plugin
    cd DeepStream
    make 
  1. Build a TRT Engine.

For single batch,

trtexec --onnx= --explicitBatch --saveEngine= --workspace= --fp16

For multi-batch,

trtexec --onnx= --explicitBatch --shapes=input:Xx3xHxW --optShapes=input:Xx3xHxW --maxShapes=input:Xx3xHxW --minShape=input:1x3xHxW --saveEngine= --fp16

Note :The maxShapes could not be larger than model original shape.

  1. Write the deepstream config file for the TRT Engine.

Reference:

@article{yolov4,
  title={YOLOv4: YOLOv4: Optimal Speed and Accuracy of Object Detection},
  author={Alexey Bochkovskiy, Chien-Yao Wang, Hong-Yuan Mark Liao},
  journal = {arXiv},
  year={2020}
}
Owner
DL CV OCR and algorithm optimization
Reinforcement learning for self-driving in a 3D simulation

SelfDrive_AI Reinforcement learning for self-driving in a 3D simulation (Created using UNITY-3D) 1. Requirements for the SelfDrive_AI Gym You need Pyt

Surajit Saikia 17 Dec 14, 2021
Multiview 3D object detection on MultiviewC dataset through moft3d.

Multiview Orthographic Feature Transformation for 3D Object Detection Multiview 3D object detection on MultiviewC dataset through moft3d. Introduction

Jiahao Ma 20 Dec 21, 2022
Dense Passage Retriever - is a set of tools and models for open domain Q&A task.

Dense Passage Retrieval Dense Passage Retrieval (DPR) - is a set of tools and models for state-of-the-art open-domain Q&A research. It is based on the

Meta Research 1.1k Jan 03, 2023
GemNet model in PyTorch, as proposed in "GemNet: Universal Directional Graph Neural Networks for Molecules" (NeurIPS 2021)

GemNet: Universal Directional Graph Neural Networks for Molecules Reference implementation in PyTorch of the geometric message passing neural network

Data Analytics and Machine Learning Group 124 Dec 30, 2022
An executor that loads ONNX models and embeds documents using the ONNX runtime.

ONNXEncoder An executor that loads ONNX models and embeds documents using the ONNX runtime. Usage via Docker image (recommended) from jina import Flow

Jina AI 2 Mar 15, 2022
3D Avatar Lip Syncronization from speech (JALI based face-rigging)

visemenet-inference Inference Demo of "VisemeNet-tensorflow" VisemeNet is an audio-driven animator centric speech animation driving a JALI or standard

Junhwan Jang 17 Dec 20, 2022
Code for Deep Single-image Portrait Image Relighting

Deep Single-Image Portrait Relighting [Project Page] Hao Zhou, Sunil Hadap, Kalyan Sunkavalli, David W. Jacobs. In ICCV, 2019 Overview Test script for

438 Jan 05, 2023
Face Identity Disentanglement via Latent Space Mapping [SIGGRAPH ASIA 2020]

Face Identity Disentanglement via Latent Space Mapping Description Official Implementation of the paper Face Identity Disentanglement via Latent Space

150 Dec 07, 2022
「PyTorch Implementation of AnimeGANv2」を用いて、生成した顔画像を元の画像に上書きするデモ

AnimeGANv2-Face-Overlay-Demo PyTorch Implementation of AnimeGANv2を用いて、生成した顔画像を元の画像に上書きするデモです。

KazuhitoTakahashi 21 Oct 18, 2022
This application explain how we can easily integrate Deepface framework with Python Django application

deepface_suite This application explain how we can easily integrate Deepface framework with Python Django application install redis cache install requ

Mohamed Naji Aboo 3 Apr 18, 2022
DSEE: Dually Sparsity-embedded Efficient Tuning of Pre-trained Language Models

DSEE Codes for [Preprint] DSEE: Dually Sparsity-embedded Efficient Tuning of Pre-trained Language Models Xuxi Chen, Tianlong Chen, Yu Cheng, Weizhu Ch

VITA 4 Dec 27, 2021
Alleviating Over-segmentation Errors by Detecting Action Boundaries

Alleviating Over-segmentation Errors by Detecting Action Boundaries Forked from ASRF offical code. This repo is the a implementation of replacing orig

13 Dec 12, 2022
NeWT: Natural World Tasks

NeWT: Natural World Tasks This repository contains resources for working with the NeWT dataset. ❗ At this time the binary tasks are not publicly avail

Visipedia 26 Oct 18, 2022
DeLiGAN - This project is an implementation of the Generative Adversarial Network

This project is an implementation of the Generative Adversarial Network proposed in our CVPR 2017 paper - DeLiGAN : Generative Adversarial Net

Video Analytics Lab -- IISc 110 Sep 13, 2022
yolov5目标检测模型的知识蒸馏(基于响应的蒸馏)

代码地址: https://github.com/Sharpiless/yolov5-knowledge-distillation 教师模型: python train.py --weights weights/yolov5m.pt \ --cfg models/yolov5m.ya

52 Dec 04, 2022
A Haskell kernel for IPython.

IHaskell You can now try IHaskell directly in your browser at CoCalc or mybinder.org. Alternatively, watch a talk and demo showing off IHaskell featur

Andrew Gibiansky 2.4k Dec 29, 2022
基于DouZero定制AI实战欢乐斗地主

DouZero_For_Happy_DouDiZhu: 将DouZero用于欢乐斗地主实战 本项目基于DouZero 环境配置请移步项目DouZero 模型默认为WP,更换模型请修改start.py中的模型路径 运行main.py即可 SL (baselines/sl/): 基于人类数据进行深度学习

1.5k Jan 08, 2023
A distributed deep learning framework that supports flexible parallelization strategies.

FlexFlow FlexFlow is a deep learning framework that accelerates distributed DNN training by automatically searching for efficient parallelization stra

528 Dec 25, 2022
PECOS - Prediction for Enormous and Correlated Spaces

PECOS - Predictions for Enormous and Correlated Output Spaces PECOS is a versatile and modular machine learning (ML) framework for fast learning and i

Amazon 387 Jan 04, 2023
Source code of all the projects of Udacity Self-Driving Car Engineer Nanodegree.

self-driving-car In this repository I will share the source code of all the projects of Udacity Self-Driving Car Engineer Nanodegree. Hope this might

Andrea Palazzi 2.4k Dec 29, 2022