YOLTv4 builds upon YOLT and SIMRDWN, and updates these frameworks to use the most performant version of YOLO, YOLOv4

Last update: Jan 06, 2023

YOLTv4

YOLTv4 builds upon YOLT and SIMRDWN, and updates these frameworks to use the most performant version of YOLO, YOLOv4. YOLTv4 is designed to detect objects in aerial or satellite imagery in arbitrarily large images that far exceed the ~600×600 pixel size typically ingested by deep learning object detection frameworks.
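As a rough illustration of how an arbitrarily large image can be reduced to detector-sized inputs, the sketch below computes overlapping tile positions. It is a minimal sketch and not the slicing code shipped in this repository: the function names and the 20% overlap are assumptions made for illustration, and the 416-pixel window simply mirrors the slice size used in the post-processing command later in this document.

```python
def window_origins(length, window, overlap):
    """Return start offsets of windows of size `window` that cover [0, length)."""
    if length <= window:
        return [0]
    # Step between consecutive windows; overlap is a fraction in [0, 1).
    stride = max(1, int(window * (1.0 - overlap)))
    starts = list(range(0, length - window, stride))
    # Final window sits flush with the far edge so no pixels are missed.
    starts.append(length - window)
    return starts


def slice_windows(width, height, window=416, overlap=0.2):
    """Return (x0, y0, w, h) tiles covering a width x height image."""
    tiles = []
    for y0 in window_origins(height, window, overlap):
        for x0 in window_origins(width, window, overlap):
            tiles.append((x0, y0, min(window, width), min(window, height)))
    return tiles


if __name__ == "__main__":
    for tile in slice_windows(1000, 1000):
        print(tile)
```

Placing the last window flush with the image edge avoids padding at the cost of a larger overlap in the final row and column.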

This repository is built upon the impressive work of AlexeyAB's YOLOv4 implementation, which improves both speed and detection performance compared to YOLOv3 (which is implemented in SIMRDWN). We use YOLOv4 instead of "YOLOv5", since YOLOv4 is endorsed by the original creators of YOLO, whereas "YOLOv5" is not; furthermore, YOLOv4 appears to have superior performance.

Below, we provide examples of how to use this repository with the open-source Rareplanes dataset.

Running YOLTv4

0. Installation

YOLTv4 is built to execute within a Docker container on a GPU-enabled machine. The docker build command below creates an Ubuntu 16.04 image with CUDA 9.2, Python 3.6, and conda.

Clone this repository (e.g. to /yoltv4/).
Download model weights to yoltv4/darknet/weights. See:
 https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v3_optimal/yolov4.conv.137
 https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v3_optimal/yolov4.weights
 https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v4_pre/yolov4-tiny.weights
 https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v4_pre/yolov4-csp.conv.142
Install nvidia-docker.

Build docker file.

 nvidia-docker build -t yoltv4_image /yoltv4/docker

Spin up the docker container (see the docker docs for options).

 NV_GPU=0 nvidia-docker run -it -v /local_data:/local_data -v /yoltv4:/yoltv4 --ipc=host --name yoltv4_gpu0 yoltv4_image

Compile the Darknet C program.

First set GPU=1, CUDNN=1, CUDNN_HALF=1, and OPENCV=1 in /yoltv4/darknet/Makefile, then run make:
```
 cd /yoltv4/darknet
 make
```
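If you would rather script the Makefile edit than do it by hand, something like the following could work. This is a minimal sketch: it assumes each flag appears in the Makefile on its own line in the form `NAME=value`, and the helper name `set_makefile_flags` is ours, not part of Darknet or this repository.

```python
import re


def set_makefile_flags(makefile_text, settings):
    """Return makefile_text with each `NAME=value` line rewritten per settings."""
    for name, value in settings.items():
        # Anchor at line start so CUDNN does not match CUDNN_HALF.
        pattern = re.compile(r"^%s=.*$" % re.escape(name), re.MULTILINE)
        makefile_text, count = pattern.subn(
            "%s=%s" % (name, value), makefile_text, count=1)
        if count == 0:
            raise KeyError("flag %r not found in Makefile text" % name)
    return makefile_text


if __name__ == "__main__":
    path = "/yoltv4/darknet/Makefile"  # path from the installation steps above
    with open(path) as f:
        text = f.read()
    text = set_makefile_flags(
        text, {"GPU": 1, "CUDNN": 1, "CUDNN_HALF": 1, "OPENCV": 1})
    with open(path, "w") as f:
        f.write(text)
```

The function raises if a flag is missing rather than silently leaving the build unchanged.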

1. Train

A. Prepare Data

Make YOLO images and labels (see yoltv4/notebooks/train_test_pipeline.ipynb for further details).
Create a txt file listing the training images.
Create an obj.names file with each desired object name on its own line.

Create an obj.data file in the directory yoltv4/darknet/data that points to the necessary files. For example:

/yoltv4/darknet/data/rareplanes_train.data

 classes = 30
 train =  /local_data/cosmiq/wdata/rareplanes/train/txt/train.txt
 valid =  /local_data/cosmiq/wdata/rareplanes/train/txt/valid.txt
 names =  /yoltv4/darknet/data/rareplanes.name
 backup = backup/
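The files described above are simple text, so they can be generated programmatically. The sketch below shows the standard Darknet label line (class index followed by box center, width, and height, all normalized by the image size) and a writer for the .data file. The helper names are hypothetical and this is not the code in train_test_pipeline.ipynb; see that notebook for the pipeline actually used.

```python
def yolo_label_line(class_id, xmin, ymin, xmax, ymax, im_w, im_h):
    """Convert a pixel-coordinate box to a Darknet label line."""
    xc = (xmin + xmax) / 2.0 / im_w   # normalized box center x
    yc = (ymin + ymax) / 2.0 / im_h   # normalized box center y
    w = (xmax - xmin) / float(im_w)   # normalized box width
    h = (ymax - ymin) / float(im_h)   # normalized box height
    return "%d %.6f %.6f %.6f %.6f" % (class_id, xc, yc, w, h)


def data_file_text(n_classes, train_list, valid_list, names_path,
                   backup_dir="backup/"):
    """Build the contents of a Darknet .data file."""
    lines = [
        "classes = %d" % n_classes,
        "train = %s" % train_list,
        "valid = %s" % valid_list,
        "names = %s" % names_path,
        "backup = %s" % backup_dir,
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(yolo_label_line(3, 100, 50, 200, 150, 400, 400))
    print(data_file_text(
        30,
        "/local_data/cosmiq/wdata/rareplanes/train/txt/train.txt",
        "/local_data/cosmiq/wdata/rareplanes/train/txt/valid.txt",
        "/yoltv4/darknet/data/rareplanes.name"))
```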

Prepare config files.

See instructions here, or tweak /yoltv4/darknet/cfg/yoltv4_rareplanes.cfg.
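When tweaking the cfg for a new number of classes, a few values have to stay consistent with each other. The sketch below encodes the rules of thumb given in AlexeyAB's Darknet README for custom training (filters = (classes + 5) * 3 in the convolutional layer before each [yolo] layer; max_batches = classes * 2000, but at least 6000 and at least the number of training images; steps at 80% and 90% of max_batches). Whether yoltv4_rareplanes.cfg uses exactly these values is an assumption, so treat the output as a starting point and check it against the file.

```python
def darknet_cfg_values(n_classes, n_train_images=0):
    """Suggest cfg values that depend on the number of classes."""
    max_batches = max(n_classes * 2000, 6000, n_train_images)
    return {
        "classes": n_classes,                    # set in every [yolo] layer
        "filters": (n_classes + 5) * 3,          # conv layer before each [yolo]
        "max_batches": max_batches,
        # Learning-rate steps at 80% and 90% of max_batches (integer math).
        "steps": (max_batches * 8 // 10, max_batches * 9 // 10),
    }


if __name__ == "__main__":
    for key, value in darknet_cfg_values(30).items():
        print(key, "=", value)
```

For the 30 classes in the example .data file this gives filters = 105 and max_batches = 60000.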

B. Execute Training

Execute.

 cd /yoltv4/darknet
 time ./darknet detector train data/rareplanes_train.data  cfg/yoltv4_rareplanes.cfg weights/yolov4.conv.137  -dont_show -mjpeg_port 8090 -map

Review progress (plotted at: /yoltv4/darknet/chart_yoltv4_rareplanes.png).

2. Test

A. Prepare Data

Make sliced images (see yoltv4/notebooks/train_test_pipeline.ipynb for further details).
Create a txt file listing the test images.
Create an obj.data file in the directory yoltv4/darknet/data that points to the necessary files. For example:

/yoltv4/darknet/data/rareplanes_test.data

 classes = 30
 train =
 valid =  /local_data/cosmiq/wdata/rareplanes/test/txt/test.txt
 names =  /yoltv4/darknet/data/rareplanes.name
 backup = backup/

B. Execute Testing

Execute (proceeds at >80 frames per second on a Tesla P100):

 cd /yoltv4/darknet
 time ./darknet detector valid data/rareplanes_test.data cfg/yoltv4_rareplanes.cfg backup/yoltv4_rareplanes_best.weights

Post-process detections:

A. Move detections into results directory

 mkdir /yoltv4/darknet/results/rareplanes_preds_v0
 mkdir  /yoltv4/darknet/results/rareplanes_preds_v0/orig_txt
 mv /yoltv4/darknet/results/comp4_det_test_*  /yoltv4/darknet/results/rareplanes_preds_v0/orig_txt/
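For a quick look at the raw detections before stitching, the per-class text files can be parsed directly. This sketch assumes each line has the form `image_id score xmin ymin xmax ymax` and that the class name is the part of the filename after `comp4_det_test_`; both are assumptions about Darknet's output that you should verify against your own files. The function names are hypothetical.

```python
import os


def class_from_filename(path, prefix="comp4_det_test_"):
    """Extract the class name from a detection filename."""
    stem = os.path.splitext(os.path.basename(path))[0]
    return stem[len(prefix):] if stem.startswith(prefix) else stem


def parse_detection_lines(lines, class_name, thresh=0.0):
    """Parse detection lines into dicts, skipping malformed or weak entries."""
    dets = []
    for line in lines:
        parts = line.split()
        if len(parts) != 6:
            continue  # blank or malformed line
        score = float(parts[1])
        if score < thresh:
            continue
        xmin, ymin, xmax, ymax = (float(v) for v in parts[2:])
        dets.append({"image": parts[0], "class": class_name,
                     "score": score, "box": (xmin, ymin, xmax, ymax)})
    return dets


if __name__ == "__main__":
    import glob
    pattern = ("/yoltv4/darknet/results/rareplanes_preds_v0/orig_txt/"
               "comp4_det_test_*")
    for path in sorted(glob.glob(pattern)):
        with open(path) as f:
            dets = parse_detection_lines(f, class_from_filename(path), 0.25)
        print(path, len(dets))
```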

B. Stitch detections back together and make plots

 time python /yoltv4/yoltv4/post_process.py \
     --pred_dir=/yoltv4/darknet/results/rareplanes_preds_v0/orig_txt/ \
     --raw_im_dir=/local_data/cosmiq/wdata/rareplanes/test/images/ \
     --sliced_im_dir=/local_data/cosmiq/wdata/rareplanes/test/yoltv4/images_slice/ \
     --out_dir=/yoltv4/darknet/results/rareplanes_preds_v0 \
     --detection_thresh=0.25 \
     --slice_size=416 \
     --n_plots=8
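Conceptually, stitching means shifting each box by the offset of the slice it came from, then removing duplicates where neighboring slices overlap. The sketch below illustrates that idea with a score threshold and greedy non-maximum suppression. It is not the code in post_process.py: the function names, the input layout, and the 0.5 IoU threshold are assumptions made for illustration.

```python
def to_global(box, x0, y0):
    """Shift a slice-local (xmin, ymin, xmax, ymax) box by the slice origin."""
    xmin, ymin, xmax, ymax = box
    return (xmin + x0, ymin + y0, xmax + x0, ymax + y0)


def iou(a, b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def nms(dets, iou_thresh=0.5):
    """Greedy NMS over (score, box) pairs; keeps the highest-scoring boxes."""
    keep = []
    for score, box in sorted(dets, key=lambda d: d[0], reverse=True):
        if all(iou(box, kept) <= iou_thresh for _, kept in keep):
            keep.append((score, box))
    return keep


def stitch(slice_dets, detection_thresh=0.25, iou_thresh=0.5):
    """slice_dets: iterable of (x0, y0, score, box_in_slice) tuples."""
    global_dets = [(score, to_global(box, x0, y0))
                   for x0, y0, score, box in slice_dets
                   if score >= detection_thresh]
    return nms(global_dets, iou_thresh)


if __name__ == "__main__":
    example = [
        (0, 0, 0.9, (300, 300, 400, 400)),    # object seen in the first slice
        (332, 0, 0.6, (0, 300, 68, 400)),     # same object, clipped in next slice
        (332, 0, 0.8, (200, 50, 260, 110)),   # a different object
        (0, 0, 0.1, (10, 10, 20, 20)),        # below the detection threshold
    ]
    for det in stitch(example):
        print(det)
```

In practice NMS would usually be applied per class, so that overlapping objects of different classes do not suppress each other.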

Outputs will look something like the figures below:

Owner

Adam Van Etten
