This project deals with the detection of skin lesions within the ISICs dataset using YOLOv3 Object Detection with Darknet.

Overview

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

Skin Lesion detection using YOLO

This project deals with the detection of skin lesions within the ISICs dataset using YOLOv3 Object Detection with Darknet.

Predictions

YOLOv3

YOLO stands for "You Only Look Once" which uses Convolutional Neural Networks for Object Detection. On a single image, YOLO may detect multiple objects. It implies that, in addition to predicting object classes, YOLO also recognises its positions in the image. The entire image is processed by a single Neural Network in YOLO. The picture is divided into regions using this Neural Network, which generates probabilities for each region. YOLO predicts multiple bounding boxes that cover some regions of the image and then based on the probabilities, picks the best one.

Architecture of YOLOv3:

Architecture of YOLOv3

  • YOLOv3 has a total of 106 layers where detections are made at 82, 94 and 106 layers.
  • It consists of a residual blocks, skip connections and up-sampling.
  • Each convolutional layer is followed by batch normalization layer and Leaky ReLU activation function.
  • There are no pooling layers, but instead, additional convolutional layers with stride 2, are used to down-sample feature maps.

Input:

Input images themselves can be of any size, there is no need to resize them before feeding to the network. However, all the images must be stored in a single folder. In the same folder, there should be a text file, one for each image(with the same file name), containing the "true" annotations of the bounding box in YOLOv3 format i.e.,


    
     
      
       
        
       
      
     
    
   

where,

  • class id = label index of the class to be annotated
  • Xo = X coordinate of the bounding box’s centre
  • Yo = Y coordinate of the bounding box’s centre
  • W = Width of the bounding box
  • H = Height of the bounding box
  • X = Width of the image
  • Y = Height of the image

For multiple objects in the same image, this annotation is saved line-by-line for each object.

Steps:

Note: I have used Google Colab which supports Linux commands. The steps for running it on local windows computer is different.

  1. Create "true" annotations using Annotate_YOLO.py which takes in segmented(binary) images as input and returns text file for every image, labelled in the YOLO format.

  2. Clone darknet from AlexeyAB's GitHub repository, adjust the Makefile to enable OPENCV and GPU and then build darknet.

    # Clone darknet repo
    !git clone https://github.com/AlexeyAB/darknet
    
    # Change makefile to have GPU and OPENCV enabled
    %cd darknet
    !chmod +x ./darknet
    !sed -i 's/OPENCV=0/OPENCV=1/' Makefile
    !sed -i 's/GPU=0/GPU=1/' Makefile
    !sed -i 's/CUDNN=0/CUDNN=1/' Makefile
    !sed -i 's/CUDNN_HALF=0/CUDNN_HALF=1/' Makefile
    
    # To use the darknet executable file
    !make
  3. Download the pre-trained YOLO weights from darknet. It is trained on a coco dataset consisting of 80 classes.

    !wget https://pjreddie.com/media/files/darknet53.conv.74
    
  4. Define the helper function as in Helper.py that is used to display images.

  5. Split the dataset into train, validation and test set (including the labels) and store it in darknet/data folder with filenames "obj", "valid", and "test" repectively. In my case, total images = 2594 out of which,

    Train = 2094 images; Validation = 488 images; Test = 12 images.

  6. Create obj.names consisting of class names (one class name per line) and also create obj.data that points to the file paths of train data, validation data and backup folder which will store the weights of the model trained on our custom dataset.

  7. Tune hyper-parameters by creating custom config file in "cfg" folder which is inside "darknet" folder. Change the following parameters in yolov3.clf and save it as yolov3-custom.cfg:

    The parameters are chosen by considering the following:

    - max_batches = (# of classes) * 2000 --> [but no less than 4000]
    
    - steps = (80% of max_batches), (90% of max_batches)
    
    - filters = (# of classes + 5) * 3
    
    - random = 1 to random = 0 --> [to speed up training but slightly reduce accuracy]
    

    I chose the following:

    • Training batch = 64
    • Training subdivisions = 16
    • max_batches = 4000, steps = 3200, 3600
    • classes = 1 in the three YOLO layers
    • filters = 18 in the three convolutional layers just before the YOLO layers.
  8. Create "train.txt" and "test.txt" using Generate_Train_Test.py

  9. Train the Custom Object Detector

    !./darknet detector train data/obj.data cfg/yolov3-custom.cfg darknet53.conv.74 -dont_show -map
    
    # Show the graph to review the performance of the custom object detector
    imShow('chart.png')

    The new weights will be stored in backup folder with the name yolov3-custom_best.weights after training.

  10. Check the Mean Average Precision(mAP) of the model

    !./darknet detector map data/obj.data cfg/yolov3-custom.cfg /content/drive/MyDrive/darknet/backup/yolov3-custom_best.weights
  11. Run Your Custom Object Detector

    For testing, set batch and subdivisions to 1.

    # To set our custom cfg to test mode
    %cd cfg
    !sed -i 's/batch=64/batch=1/' yolov3-custom.cfg
    !sed -i 's/subdivisions=16/subdivisions=1/' yolov3-custom.cfg
    %cd ..
    
    # To run custom detector
    # thresh flag sets threshold probability for detection
    !./darknet detector test data/obj.data cfg/yolov3-custom.cfg /content/drive/MyDrive/darknet/backup/yolov3-custom_best.weights /content/drive/MyDrive/Test_Lesion/ISIC_0000000.jpg -thresh 0.3
    
    # Show the predicted image with bounding box and its probability
    imShow('predictions.jpg')

Conclusion

The whole process looks like this:

Summary

Chart (Loss in mAP vs Iteration number):

Loss Chart

Loss Capture

The model was supposed to take 4,000 iterations to complete, however, the rate of decrease in loss is not very significant after 1000 iterations. The model is performing with similar precision even with 3000 fewer iterations which resulted in low training time (saving over 9 hours of compute time) and also there is less chance of it overfitting the data. Hence, the model was stopped pre-maturely just after 1100 iterations.

From the above chart, we can see that the average loss is 0.2545 and the mean average precision(mAP) is over 95% which is extremely good.

Assumption and dependencies

The user is assumed to have access to the ISICs dataset with colored images required for training, as well as its corresponding binary segmentation images.

Dependencies:

  • Google Colab Notebook
  • Python 3.7 on local machine
  • Python libraries: matplotlib, OpenCV2, glob
  • Darknet which is an open source neural network framework written in C and CUDA

References

Redmon, J., & Farhadi, A. (2018). YOLO: Real-Time Object Detection. Retrieved October 28, 2021, from https://pjreddie.com/darknet/yolo/

About the Author

Lalith Veerabhadrappa Badiger
The University of Queensland, Brisbane, Australia
Master of Data Science
Student ID: 46557829
Email ID: [email protected]

Owner
Lalith Veerabhadrappa Badiger
Master of Data Science
Lalith Veerabhadrappa Badiger
YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone

YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone In our recent paper we propose the YourTTS model. YourTTS bri

Edresson Casanova 390 Dec 29, 2022
Tackling Obstacle Tower Challenge using PPO & A2C combined with ICM.

Obstacle Tower Challenge using Deep Reinforcement Learning Unity Obstacle Tower is a challenging realistic 3D, third person perspective and procedural

Zhuoyu Feng 5 Feb 10, 2022
RTS3D: Real-time Stereo 3D Detection from 4D Feature-Consistency Embedding Space for Autonomous Driving

RTS3D: Real-time Stereo 3D Detection from 4D Feature-Consistency Embedding Space for Autonomous Driving (AAAI2021). RTS3D is efficiency and accuracy s

71 Nov 29, 2022
Demonstrates iterative FGSM on Apple's NeuralHash model.

apple-neuralhash-attack Demonstrates iterative FGSM on Apple's NeuralHash model. TL;DR: It is possible to apply noise to CSAM images and make them loo

Lim Swee Kiat 11 Jun 23, 2022
Official Implementation of CVPR 2022 paper: "Mimicking the Oracle: An Initial Phase Decorrelation Approach for Class Incremental Learning"

(CVPR 2022) Mimicking the Oracle: An Initial Phase Decorrelation Approach for Class Incremental Learning ArXiv This repo contains Official Implementat

Yujun Shi 24 Nov 01, 2022
BDDM: Bilateral Denoising Diffusion Models for Fast and High-Quality Speech Synthesis

Bilateral Denoising Diffusion Models (BDDMs) This is the official PyTorch implementation of the following paper: BDDM: BILATERAL DENOISING DIFFUSION M

172 Dec 23, 2022
MoveNet Single Pose on OpenVINO

MoveNet Single Pose tracking on OpenVINO Running Google MoveNet Single Pose models on OpenVINO. A convolutional neural network model that runs on RGB

35 Nov 11, 2022
Official Implementation of VAT

Semantic correspondence Few-shot segmentation Cost Aggregation Is All You Need for Few-Shot Segmentation For more information, check out project [Proj

Hamacojr 114 Dec 27, 2022
The Ludii general game system, developed as part of the ERC-funded Digital Ludeme Project.

The Ludii General Game System Ludii is a general game system being developed as part of the ERC-funded Digital Ludeme Project (DLP). This repository h

Digital Ludeme Project 50 Jan 04, 2023
ShuttleNet: Position-aware Fusion of Rally Progress and Player Styles for Stroke Forecasting in Badminton (AAAI 2022)

ShuttleNet: Position-aware Rally Progress and Player Styles Fusion for Stroke Forecasting in Badminton (AAAI 2022) Official code of the paper ShuttleN

Wei-Yao Wang 11 Nov 30, 2022
An SE(3)-invariant autoencoder for generating the periodic structure of materials

Crystal Diffusion Variational AutoEncoder This software implementes Crystal Diffusion Variational AutoEncoder (CDVAE), which generates the periodic st

Tian Xie 94 Dec 10, 2022
GPU implementation of $k$-Nearest Neighbors and Shared-Nearest Neighbors

GPU implementation of kNN and SNN GPU implementation of $k$-Nearest Neighbors and Shared-Nearest Neighbors Supported by numba cuda and faiss library E

Hyeon Jeon 7 Nov 23, 2022
This repo includes our code for evaluating and improving transferability in domain generalization (NeurIPS 2021)

Transferability for domain generalization This repo is for evaluating and improving transferability in domain generalization (NeurIPS 2021), based on

gordon 9 Nov 29, 2022
Repository for the paper "PoseAug: A Differentiable Pose Augmentation Framework for 3D Human Pose Estimation", CVPR 2021.

PoseAug: A Differentiable Pose Augmentation Framework for 3D Human Pose Estimation Code repository for the paper: PoseAug: A Differentiable Pose Augme

Pyjcsx 328 Dec 17, 2022
End-to-End Speech Processing Toolkit

ESPnet: end-to-end speech processing toolkit system/pytorch ver. 1.3.1 1.4.0 1.5.1 1.6.0 1.7.1 1.8.1 1.9.0 ubuntu20/python3.9/pip ubuntu20/python3.8/p

ESPnet 5.9k Jan 04, 2023
Impelmentation for paper Feature Generation and Hypothesis Verification for Reliable Face Anti-Spoofing

FGHV Impelmentation for paper Feature Generation and Hypothesis Verification for Reliable Face Anti-Spoofing Requirements Python 3.6 Pytorch 1.5.0 Cud

5 Jun 02, 2022
Optimizing synthesizer parameters using gradient approximation

Optimizing synthesizer parameters using gradient approximation NASH 2021 Hackathon! These are some experiments I conducted during NASH 2021, the Neura

Jordie Shier 10 Feb 10, 2022
基于PaddleOCR搭建的OCR server... 离线部署用

开头说明 DangoOCR 是基于大家的 CPU处理器 来运行的,CPU处理器 的好坏会直接影响其速度, 但不会影响识别的精度 ,目前此版本识别速度可能在 0.5-3秒之间,具体取决于大家机器的配置,可以的话尽量不要在运行时开其他太多东西。需要配合团子翻译器 Ver3.6 及其以上的版本才可以使用!

胖次团子 131 Dec 25, 2022
Empowering journalists and whistleblowers

Onymochat Empowering journalists and whistleblowers Onymochat is an end-to-end encrypted, decentralized, anonymous chat application. You can also host

Samrat Dutta 19 Sep 02, 2022
Code to use Augmented Shapiro Wilks Stopping, as well as code for the paper "Statistically Signifigant Stopping of Neural Network Training"

This codebase is being actively maintained, please create and issue if you have issues using it Basics All data files are included under losses and ea

J K Terry 32 Nov 09, 2021