A state-of-the-art lightweight YOLO model implemented in TensorFlow 2.

Last update: Dec 21, 2022

Overview

CSL-YOLO: A New Lightweight Object Detection System for Edge Computing

This project provides a state-of-the-art lightweight YOLO called "Cross-Stage Lightweight YOLO" (CSL-YOLO). It achieves better detection performance than Tiny-YOLOv4 while using only 43% of its FLOPs and 52% of its parameters.

Paper Link: https://arxiv.org/abs/2107.04829

Requirements

How to Get Started?

#Predict
python3 main.py -p cfg/predict_coco.cfg

#Train
python3 main.py -t cfg/train_coco.cfg

#Eval
python3 main.py -ce cfg/eval_coco.cfg

WebCam DEMO(on CPU)

This DEMO runs in a CPU-only environment. The CPU is an i7-6600U (2.6 GHz to 3.4 GHz), the model scale is 224x224, and the frame rate is about 10 FPS.

Run the following script to start the DEMO. The "camera_idx" entry in the cfg file selects which camera to use.

#Camera DEMO
python3 main.py -d cfg/demo_coco.cfg

More Info

Change Model Scale

The model's default scale is 224x224. To change the scale to a value in the 320 to 512 range, open cfg/XXXX.cfg and change the following two parts: the uncommented input_shape/out_hw_list pair, and weight_path.

# input_shape=[512,512,3]
# out_hw_list=[[64,64],[48,48],[32,32],[24,24],[16,16]]
# input_shape=[416,416,3]
# out_hw_list=[[52,52],[39,39],[26,26],[20,20],[13,13]]
# input_shape=[320,320,3]
# out_hw_list=[[40,40],[30,30],[20,20],[15,15],[10,10]]
input_shape=[224,224,3]
out_hw_list=[[28,28],[21,21],[14,14],[10,10],[7,7]]

weight_path=weights/224_nolog.hdf5

                         |
                         | 224 to 320
                         V
                         
# input_shape=[512,512,3]
# out_hw_list=[[64,64],[48,48],[32,32],[24,24],[16,16]]
# input_shape=[416,416,3]
# out_hw_list=[[52,52],[39,39],[26,26],[20,20],[13,13]]
input_shape=[320,320,3]
out_hw_list=[[40,40],[30,30],[20,20],[15,15],[10,10]]
# input_shape=[224,224,3]
# out_hw_list=[[28,28],[21,21],[14,14],[10,10],[7,7]]

weight_path=weights/320_nolog.hdf5
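
The switch shown above can also be generated rather than edited by hand. The following is a minimal sketch and is not part of the repository: it copies the four presets from the cfg comments into a table and returns the cfg lines to activate for a chosen scale. The names `SCALE_PRESETS`, `KNOWN_WEIGHTS`, `format_list`, and `cfg_lines_for_scale` are hypothetical. Only the 224 and 320 weight files are named in this README, so no weight_path line is produced for 416 or 512.

```python
# Presets copied from the cfg comments above: input side length -> out_hw_list.
SCALE_PRESETS = {
    224: [[28, 28], [21, 21], [14, 14], [10, 10], [7, 7]],
    320: [[40, 40], [30, 30], [20, 20], [15, 15], [10, 10]],
    416: [[52, 52], [39, 39], [26, 26], [20, 20], [13, 13]],
    512: [[64, 64], [48, 48], [32, 32], [24, 24], [16, 16]],
}

# Weight files named in this README; other scales are not listed here.
KNOWN_WEIGHTS = {
    224: "weights/224_nolog.hdf5",
    320: "weights/320_nolog.hdf5",
}


def format_list(value):
    """Render a nested list without spaces, matching the cfg style above."""
    return str(value).replace(" ", "")


def cfg_lines_for_scale(scale):
    """Return the cfg lines to activate for the given scale (hypothetical helper)."""
    if scale not in SCALE_PRESETS:
        raise ValueError(
            f"unsupported scale {scale}; choose from {sorted(SCALE_PRESETS)}"
        )
    lines = [
        f"input_shape={format_list([scale, scale, 3])}",
        f"out_hw_list={format_list(SCALE_PRESETS[scale])}",
    ]
    # Add weight_path only when this README names a weight file for the scale.
    if scale in KNOWN_WEIGHTS:
        lines.append(f"weight_path={KNOWN_WEIGHTS[scale]}")
    return lines


if __name__ == "__main__":
    print("\n".join(cfg_lines_for_scale(320)))
```

The other input_shape/out_hw_list pairs in the cfg file should stay commented out so that only one pair is active at a time.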

Full Dataset

The entire MS-COCO dataset is too large to include here, so only a few pictures are stored for the DEMO. If you need the complete data, please download it on this page.

Our Data Format

We did not use the official MS-COCO format. Instead, we express a bounding box as follows:

[ left_top_x<float>, left_top_y<float>, w<float>, h<float>, confidence<float>, class<str> ]

The bounding boxes contained in one picture are stored in a single JSON file.

For the detailed format, please refer to the JSON files in "data/coco/train/json".
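
As an illustration of the box layout described above, here is a minimal sketch that converts one box from [left_top_x, left_top_y, w, h, confidence, class] into corner coordinates. It is not code from the repository. The function names `box_to_corners` and `parse_boxes` are hypothetical, the sample values are made up, and `parse_boxes` assumes the JSON file is a plain list of boxes, which may not match the actual files in "data/coco/train/json". Whether the coordinates are in pixels or normalized is not stated here; the conversion is the same either way.

```python
import json


def box_to_corners(box):
    """Convert [left_top_x, left_top_y, w, h, confidence, class] to corner form."""
    x, y, w, h, confidence, cls = box
    return {
        "x_min": float(x),
        "y_min": float(y),
        "x_max": float(x) + float(w),
        "y_max": float(y) + float(h),
        "confidence": float(confidence),
        "class": str(cls),
    }


def parse_boxes(json_text):
    """Parse one picture's boxes, ASSUMING the JSON is a plain list of boxes."""
    return [box_to_corners(box) for box in json.loads(json_text)]


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    sample = '[[10.0, 20.0, 30.0, 40.0, 1.0, "person"]]'
    print(parse_boxes(sample))
```

If the real files wrap the boxes in an outer object, only `parse_boxes` would need to change; `box_to_corners` depends solely on the six-element layout given above.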

AP Performance on MS-COCO

For the detailed COCO report, please refer to "mscoco_result".

TODOs

Improve the FLOPs calculator script.
Using Focal Loss causes overfitting; we need to explore the reasons.

Owner

Miles Zhang
