This repo is customed for VisDrone.

Overview

Object Detection for VisDrone(无人机航拍图像目标检测)

My environment

1、Windows10 (Linux available)
2、tensorflow >= 1.12.0
3、python3.6 (anaconda)
4、cv2
5、ensemble-boxes(pip install ensemble-boxes)

Datasets(XML format for training set)

(1).Datasets is available on https://github.com/VisDrone/VisDrone-Dataset
(2).Please download xml annotations on Baidu Yun (提取码: ia3f), or Google Drive, and configure it in ./core/config/cfgs.py
(3).You can also use ./data/visdrone2xml.py to generate your visdrone xml files, modify the path information.

training-set format:

├── VisDrone2019-DET-train
│     ├── Annotation(xml format)
│     ├── JPEGImages

Pretrained Models(ResNet50vd, 101vd)

Please download pretrained models on Baidu Yun (提取码: krce), or Google Drive, then put it into ./data/pretrained_weights

Train

Modify the parameters in ./core/config/cfgs.py
python train_step.py

Eval

Modify the parameters in ./core/config/cfgs.py
python eval_visdrone.py, it will get txt format file, then use official matlab tools to eval the final results.
python eval_model_ensemble.py. Before the running of this file, you should set NORMALIZED_RESULTS_FOR_MODEL_ENSEMBLE=True in cfgs.py and then run eval_visdrone.py to get normalized txt result.

Visualization

Modify the parameters in ./core/config/cfgs.py
python image_demo.py, it will get visualized results.

Visualized Result (multi-scale training+multi-scale testing) 1

Test Result(Validation set):

1. ResNet50-vd

Name maxDets Result(s/m)
Average Precision (AP) @( IoU=0.50:0.95) maxDets=500 31.26%/35.1%
Average Precision (AP) @( IoU=0.50 ) maxDets=500 56.44%/60.29%
Average Precision (AP) @( IoU=0.75 ) maxDets=500 30.13%/35.42%
Average Recall (AR) @( IoU=0.50:0.95) maxDets= 1 0.78%/0.58%
Average Recall (AR) @( IoU=0.50:0.95) maxDets= 10 6.62%/6.05%
Average Recall (AR) @( IoU=0.50:0.95) maxDets=100 38.21%/40.99%
Average Recall (AR) @( IoU=0.50:0.95) maxDets=500 48.41%/53%
"s" means single-scale training + single-scale testing; "m"means multi-scale training + multi-scale testing

2. ResNet101-vd

Name maxDets Result(s/m)
Average Precision (AP) @( IoU=0.50:0.95) maxDets=500 31.7%/35.98%
Average Precision (AP) @( IoU=0.50 ) maxDets=500 56.94%/61.64%
Average Precision (AP) @( IoU=0.75 ) maxDets=500 30.59%/36.13%
Average Recall (AR) @( IoU=0.50:0.95) maxDets= 1 0.67%/0.61%
Average Recall (AR) @( IoU=0.50:0.95) maxDets= 10 6.29%/6.13%
Average Recall (AR) @( IoU=0.50:0.95) maxDets=100 38.66%/42.33%
Average Recall (AR) @( IoU=0.50:0.95) maxDets=500 49.29%/53.68%

3. Model Ensemble (ResNet101-vd+ResNet50-vd)

Name maxDets Result
Average Precision (AP) @( IoU=0.50:0.95) maxDets=500 36.76%
Average Precision (AP) @( IoU=0.50 ) maxDets=500 62.33%
Average Precision (AP) @( IoU=0.75 ) maxDets=500 37.41%
Average Recall (AR) @( IoU=0.50:0.95) maxDets= 1 0.59%
Average Recall (AR) @( IoU=0.50:0.95) maxDets= 10 6.06%
Average Recall (AR) @( IoU=0.50:0.95) maxDets=100 42.57%
Average Recall (AR) @( IoU=0.50:0.95) maxDets=500 54.53%
You can download trained weights(ResNet50vd, 101vd) on Baidu Yun (提取码: 9u9m), or Google Drive, then put it into ./saved_weights

Reference

1、https://github.com/DetectionTeamUCAS/Faster-RCNN_Tensorflow
2、https://github.com/open-mmlab/mmdetection
3、https://github.com/ZFTurbo/Weighted-Boxes-Fusion
4、https://github.com/kobiso/CBAM-tensorflow-slim
5、https://github.com/SJTU-Thinklab-Det/DOTA-DOAI
6、https://github.com/Viredery/tf-eager-fasterrcnn
7、https://github.com/VisDrone/VisDrone2018-DET-toolkit
8、https://github.com/YunYang1994/tensorflow-yolov3
9、https://github.com/zhpmatrix/VisDrone2018

Learning a mapping from images to psychological similarity spaces with neural networks.

LearningPsychologicalSpaces v0.1: v1.1: v1.2: v1.3: v1.4: v1.5: The code in this repository explores learning a mapping from images to psychological s

Lucas Bechberger 8 Dec 12, 2022
Stacked Hourglass Network with a Multi-level Attention Mechanism: Where to Look for Intervertebral Disc Labeling

⚠️ ‎‎‎ A more recent and actively-maintained version of this code is available in ivadomed Stacked Hourglass Network with a Multi-level Attention Mech

Reza Azad 14 Oct 24, 2022
A New Approach to Overgenerating and Scoring Abstractive Summaries

We provide the source code for the paper "A New Approach to Overgenerating and Scoring Abstractive Summaries" accepted at NAACL'21. If you find the code useful, please cite the following paper.

Kaiqiang Song 4 Apr 03, 2022
Framework for training options with different attention mechanism and using them to solve downstream tasks.

Using Attention in HRL Framework for training options with different attention mechanism and using them to solve downstream tasks. Requirements GPU re

5 Nov 03, 2022
Benchmarking the robustness of Spatial-Temporal Models

Benchmarking the robustness of Spatial-Temporal Models This repositery contains the code for the paper Benchmarking the Robustness of Spatial-Temporal

Yi Chenyu Ian 15 Dec 16, 2022
[ICCV 2021 (oral)] Planar Surface Reconstruction from Sparse Views

Planar Surface Reconstruction From Sparse Views Linyi Jin, Shengyi Qian, Andrew Owens, David F. Fouhey University of Michigan ICCV 2021 (Oral) This re

Linyi Jin 89 Jan 05, 2023
a basic code repository for basic task in CV(classification,detection,segmentation)

basic_cv a basic code repository for basic task in CV(classification,detection,segmentation,tracking) classification generate dataset train predict de

1 Oct 15, 2021
git《Pseudo-ISP: Learning Pseudo In-camera Signal Processing Pipeline from A Color Image Denoiser》(2021) GitHub: [fig5]

Pseudo-ISP: Learning Pseudo In-camera Signal Processing Pipeline from A Color Image Denoiser Abstract The success of deep denoisers on real-world colo

Yue Cao 51 Nov 22, 2022
Objax Apache-2Objax (🥉19 · ⭐ 580) - Objax is a machine learning framework that provides an Object.. Apache-2 jax

Objax Tutorials | Install | Documentation | Philosophy This is not an officially supported Google product. Objax is an open source machine learning fr

Google 729 Jan 02, 2023
Regulatory Instruments for Fair Personalized Pricing.

Fair pricing Source code for WWW 2022 paper Regulatory Instruments for Fair Personalized Pricing. Installation Requirements Linux with Python = 3.6 p

Renzhe Xu 6 Oct 26, 2022
This code is an implementation for Singing TTS.

MLP Singer This code is an implementation for Singing TTS. The algorithm is based on the following papers: Tae, J., Kim, H., & Lee, Y. (2021). MLP Sin

Heejo You 22 Dec 23, 2022
Image reconstruction done with untrained neural networks.

PyTorch Deep Image Prior An implementation of image reconstruction methods from Deep Image Prior (Ulyanov et al., 2017) in PyTorch. The point of the p

Atiyo Ghosh 192 Nov 30, 2022
Polyp-PVT: Polyp Segmentation with Pyramid Vision Transformers (arXiv2021)

Polyp-PVT by Bo Dong, Wenhai Wang, Deng-Ping Fan, Jinpeng Li, Huazhu Fu, & Ling Shao. This repo is the official implementation of "Polyp-PVT: Polyp Se

Deng-Ping Fan 102 Jan 05, 2023
Fast Style Transfer in TensorFlow

Fast Style Transfer in TensorFlow Add styles from famous paintings to any photo in a fraction of a second! You can even style videos! It takes 100ms o

Jefferson 5 Oct 24, 2021
[CVPR 2021] Semi-Supervised Semantic Segmentation with Cross Pseudo Supervision

TorchSemiSeg [CVPR 2021] Semi-Supervised Semantic Segmentation with Cross Pseudo Supervision by Xiaokang Chen1, Yuhui Yuan2, Gang Zeng1, Jingdong Wang

Chen XiaoKang 387 Jan 08, 2023
DyNet: The Dynamic Neural Network Toolkit

The Dynamic Neural Network Toolkit General Installation C++ Python Getting Started Citing Releases and Contributing General DyNet is a neural network

Chris Dyer's lab @ LTI/CMU 3.3k Jan 06, 2023
PAthological QUpath Obsession - QuPath and Python conversations

PAQUO: PAthological QUpath Obsession Welcome to paquo 👋 , a library for interacting with QuPath from Python. paquo's goal is to provide a pythonic in

Bayer AG 60 Dec 31, 2022
BalaGAN: Image Translation Between Imbalanced Domains via Cross-Modal Transfer

BalaGAN: Image Translation Between Imbalanced Domains via Cross-Modal Transfer Project Page | Paper | Video State-of-the-art image-to-image translatio

47 Dec 06, 2022
Human Dynamics from Monocular Video with Dynamic Camera Movements

Human Dynamics from Monocular Video with Dynamic Camera Movements Ri Yu, Hwangpil Park and Jehee Lee Seoul National University ACM Transactions on Gra

215 Jan 01, 2023
This repo contains the code and data used in the paper "Wizard of Search Engine: Access to Information Through Conversations with Search Engines"

Wizard of Search Engine: Access to Information Through Conversations with Search Engines by Pengjie Ren, Zhongkun Liu, Xiaomeng Song, Hongtao Tian, Zh

19 Oct 27, 2022