YOLOX Win10 Project

Overview

Introduction

这是一个用于Windows训练YOLOX的项目,相比于官方项目,做了一些适配和修改:

1、解决了Windows下import yolox失败,No such file or directory: 'xxx.xml'等路径问题

2、CUDA out of memory等显存不够问题

3、增加eval.txt,可以输出IoU=0.5-0.95的AP值,以及Map50和Map50:95

Benchmark

Model size mAPval
0.5:0.95
mAPtest
0.5:0.95
Speed V100
(ms)
Params
(M)
FLOPs
(G)
weights
YOLOX-s 640 40.5 40.5 9.8 9.0 26.8 github
YOLOX-m 640 46.9 47.2 12.3 25.3 73.8 github
YOLOX-l 640 49.7 50.1 14.5 54.2 155.6 github
YOLOX-x 640 51.1 51.5 17.3 99.1 281.9 github
YOLOX-Darknet53 640 47.7 48.0 11.1 63.7 185.3 github

Training on custom data

1、准备数据集

以VOC数据集为例,数据目录如下图所示,datasets/VOCdevkit/VOC2021/(不建议修改年份,如需要修改,则对应修改yolox_voc_s.py中的年份),该文件夹下有三个文件夹,分别为Annotations、JPEGImages、ImageSets,特别注意ImageSets文件夹下须新建Main文件夹,运行dataset_cls.py(注意切换到datasets路径下,可以修改训练集和测试集比例)会自动生成训练文件trainval.txttest.txt

2、修改配置文件

修改exps/example/yolox_voc/yolox_voc_s.py文件 self.num_classes和其他配置变量(自选)

class Exp(MyExp):
    def __init__(self):
        super(Exp, self).__init__()
        self.num_classes = 42         #修改成自己的类别
        self.depth = 0.33
        self.width = 0.50
        self.warmup_epochs = 1

此Exp类体继承MyExp类体,且可以对MyExp的变量重写(因此有更高的优先级),对按住ctrl点击MyExp跳转

class Exp(BaseExp):
    def __init__(self):
        super().__init__()

        # ---------------- model config ---------------- #
        self.num_classes = 80  #因为在yolox_voc_s.py中已经重新赋值,此处不用修改
        self.depth = 1.00
        self.width = 1.00
        self.act = 'silu'

        # ---------------- dataloader config ---------------- #
        # set worker to 4 for shorter dataloader init time
        self.data_num_workers = 1
        self.input_size = (640, 640)  # (height, width)
        # Actual multiscale ranges: [640-5*32, 640+5*32].
        # To disable multiscale training, set the
        # self.multiscale_range to 0.
        self.multiscale_range = 5 #五种输入大小随机调整
        # You can uncomment this line to specify a multiscale range
        # self.random_size = (14, 26)
        self.data_dir = None
        self.train_ann = "instances_train2017.json"
        self.val_ann = "instances_val2017.json"

        # --------------- transform config ----------------- #
        self.mosaic_prob = 1.0   #数据增强概率,可以根据需要调整
        self.mixup_prob = 1.0
        self.hsv_prob = 1.0
        self.flip_prob = 0.5
        self.degrees = 10.0
        self.translate = 0.1
        self.mosaic_scale = (0.1, 2)
        self.mixup_scale = (0.5, 1.5)
        self.shear = 2.0
        self.enable_mixup = True

        # --------------  training config --------------------- #
        self.warmup_epochs = 5
        self.max_epoch = 100  #设置训练轮数
        self.warmup_lr = 0
        self.basic_lr_per_img = 0.01 / 64.0
        self.scheduler = "yoloxwarmcos"
        self.no_aug_epochs = 15 #不适用数据增强轮数
        self.min_lr_ratio = 0.05
        self.ema = True

        self.weight_decay = 5e-4
        self.momentum = 0.9
        self.print_interval = 10 #每隔十步打印输出一次训练信息
        self.eval_interval = 1 #每隔1轮保存一次
        self.exp_name = os.path.split(os.path.realpath(__file__))[1].split(".")[0]

        # -----------------  testing config ------------------ #
        self.test_size = (640, 640)
        self.test_conf = 0.01
        self.nmsthre = 0.65

可以对上述类体变量进行调整,其中关键变量有input_size、max_epoch、eval_interval

3、开始训练

输入以下命令开始训练,-c 表示加载预训练权重

python tools/train.py  -c /path/to/yolox_s.pth

你也可以对其他参数进行调整,例如:

python tools/train.py  -d 1 -b 8 --fp16 -c /path/to/yolox_s.pth

-d 表示用几块显卡,-b 表示设置batch_size,--fp16 表示半精度训练,-c 表示加载预训练权重,如果在显存不足的情况下,谨慎输入 -o 参数,会占用较多显存

如果训练一半终止后,想继续断点训练,可以输入

python tools/train.py --resume

Evaluation

输入以下代码默认对精度最高模型评估,评估后,可以在YOLOX_outputs/yolox_voc_s/eval.txt中看到IoU=0.5-0.95的AP值,文件最后可以看到Map50Map50:95

python tools/eval.py

如需对设定其他参数,可以输入以下代码,参数意义同训练

python tools/eval.py -n  yolox-s -c yolox_s.pth -b 8 -d 1 --conf 0.001 
                         yolox-m
                         yolox-l
                         yolox-x

Reference

https://github.com/Megvii-BaseDetection/YOLOX

Exponential Graph is Provably Efficient for Decentralized Deep Training

Exponential Graph is Provably Efficient for Decentralized Deep Training This code repository is for the paper Exponential Graph is Provably Efficient

3 Apr 20, 2022
Kaggle Ultrasound Nerve Segmentation competition [Keras]

Ultrasound nerve segmentation using Keras (1.0.7) Kaggle Ultrasound Nerve Segmentation competition [Keras] #Install (Ubuntu {14,16}, GPU) cuDNN requir

179 Dec 28, 2022
Home for cuQuantum Python & NVIDIA cuQuantum SDK C++ samples

Welcome to the cuQuantum repository! This public repository contains two sets of files related to the NVIDIA cuQuantum SDK: samples: All C/C++ sample

NVIDIA Corporation 147 Dec 27, 2022
Artificial Intelligence playing minesweeper 🤖

AI playing Minesweeper ✨ Minesweeper is a single-player puzzle video game. The objective of the game is to clear a rectangular board containing hidden

Vaibhaw 8 Oct 17, 2022
PyTorch implementation for "Mining Latent Structures with Contrastive Modality Fusion for Multimedia Recommendation"

MIRCO PyTorch implementation for paper: Latent Structures Mining with Contrastive Modality Fusion for Multimedia Recommendation Dependencies Python 3.

Big Data and Multi-modal Computing Group, CRIPAC 9 Dec 08, 2022
PyTorch implementation for View-Guided Point Cloud Completion

PyTorch implementation for View-Guided Point Cloud Completion

22 Jan 04, 2023
O2O-Afford: Annotation-Free Large-Scale Object-Object Affordance Learning (CoRL 2021)

O2O-Afford: Annotation-Free Large-Scale Object-Object Affordance Learning Object-object Interaction Affordance Learning. For a given object-object int

Kaichun Mo 26 Nov 04, 2022
Implements a fake news detection program using classifiers.

Fake news detection Implements a fake news detection program using classifiers for Data Mining course at UoA. Description The project is the categoriz

Apostolos Karvelas 1 Jan 09, 2022
Official implementation of particle-based models (GNS and DPI-Net) on the Physion dataset.

Physion: Evaluating Physical Prediction from Vision in Humans and Machines [paper] Daniel M. Bear, Elias Wang, Damian Mrowca, Felix J. Binder, Hsiao-Y

Hsiao-Yu Fish Tung 18 Dec 19, 2022
A PyTorch implementation of deep-learning-based registration

DiffuseMorph Implementation A PyTorch implementation of deep-learning-based registration. Requirements OS : Ubuntu / Windows Python 3.6 PyTorch 1.4.0

24 Jan 03, 2023
Technical Indicators implemented in Python only using Numpy-Pandas as Magic - Very Very Fast! Very tiny! Stock Market Financial Technical Analysis Python library . Quant Trading automation or cryptocoin exchange

MyTT Technical Indicators implemented in Python only using Numpy-Pandas as Magic - Very Very Fast! to Stock Market Financial Technical Analysis Python

dev 34 Dec 27, 2022
QICK: Quantum Instrumentation Control Kit

QICK: Quantum Instrumentation Control Kit The QICK is a kit of firmware and software to use the Xilinx RFSoC to control quantum systems. It consists o

81 Dec 15, 2022
codes for paper Combining Dynamic Local Context Focus and Dependency Cluster Attention for Aspect-level sentiment classification

DLCF-DCA codes for paper Combining Dynamic Local Context Focus and Dependency Cluster Attention for Aspect-level sentiment classification. submitted t

15 Aug 30, 2022
Official Implementation of "Third Time's the Charm? Image and Video Editing with StyleGAN3" https://arxiv.org/abs/2201.13433

Third Time's the Charm? Image and Video Editing with StyleGAN3 Yuval Alaluf*, Or Patashnik*, Zongze Wu, Asif Zamir, Eli Shechtman, Dani Lischinski, Da

531 Dec 20, 2022
Unified Pre-training for Self-Supervised Learning and Supervised Learning for ASR

UniSpeech The family of UniSpeech: UniSpeech (ICML 2021): Unified Pre-training for Self-Supervised Learning and Supervised Learning for ASR UniSpeech-

Microsoft 282 Jan 09, 2023
Accurate 3D Face Reconstruction with Weakly-Supervised Learning: From Single Image to Image Set (CVPRW 2019). A PyTorch implementation.

Accurate 3D Face Reconstruction with Weakly-Supervised Learning: From Single Image to Image Set —— PyTorch implementation This is an unofficial offici

Sicheng Xu 833 Dec 28, 2022
Hierarchical Cross-modal Talking Face Generation with Dynamic Pixel-wise Loss (ATVGnet)

Hierarchical Cross-modal Talking Face Generation with Dynamic Pixel-wise Loss (ATVGnet) By Lele Chen , Ross K Maddox, Zhiyao Duan, Chenliang Xu. Unive

Lele Chen 218 Dec 27, 2022
PiCIE: Unsupervised Semantic Segmentation using Invariance and Equivariance in clustering (CVPR2021)

PiCIE: Unsupervised Semantic Segmentation using Invariance and Equivariance in Clustering Jang Hyun Cho1, Utkarsh Mall2, Kavita Bala2, Bharath Harihar

Jang Hyun Cho 164 Dec 30, 2022
[CVPR 2021] MetaSAug: Meta Semantic Augmentation for Long-Tailed Visual Recognition

MetaSAug: Meta Semantic Augmentation for Long-Tailed Visual Recognition (CVPR 2021) arXiv Prerequisite PyTorch = 1.2.0 Python3 torchvision PIL argpar

51 Nov 11, 2022
Pytorch Implementation of "Diagonal Attention and Style-based GAN for Content-Style disentanglement in image generation and translation" (ICCV 2021)

DiagonalGAN Official Pytorch Implementation of "Diagonal Attention and Style-based GAN for Content-Style Disentanglement in Image Generation and Trans

32 Dec 06, 2022