The project was to detect traffic signs, based on the Megengine framework.

Overview

trafficsign

赛题

旷视AI智慧交通开源赛道,初赛1/177,复赛1/12。
本赛题为复杂场景的交通标志检测,对五种交通标志进行识别。

框架

megengine

算法方案

  • 网络框架

    • atss + resnext101_32x8d
  • 训练阶段

    • 图片尺寸
      最终提交版本输入图片尺寸为(1500,2100)

    • 多尺度训练(最终提交版本未采用)
      起初我们将短边设为(1024, 1056, 1088, 1120, 1152, 1184, 1216, 1248, 1280, 1312, 1344, 1376, 1408),随机选取短边后,长边按比例缩放,并使长边长度小于1800,从而进行多尺度训练,取得了很好的效果。 不过后期的mosaic和mixup在增强时对图片进行了缩放,实则隐含了多尺度训练,且效果优于上述方法,所以我们最终去掉了多尺度训练。

    • 数据增强

      • mosaic增强

        随机选择四张图片,对图片进行随机平移10%,尺度缩放(0.5,2.0),shear 0.1,最后将四张图片进行组合。

      • mixup增强

        随机选取两张图进行叠加,我们最终选用的比例是0.5 * 原图+0.5 * 新图片,同时其进行缩放(0.5,2.0)。

        下图为mosaic+mixup示例图:

        mosaic+mixup

      • 随机水平翻转

        直接对图片进行翻转,会导致第三个类别“arr_l”(左转线)和右转线混淆,故我们添加了class-aware的翻转,遇到有“arr_l”类的图片则不进行翻转。

      • 基于Albumentations库的各种增强(最终提交版本未采用)

        我们尝试了ShiftScaleRotate(验证集+0.5)、CLANE(验证集+1.0)、RandomBrightnessContrast等,但组合起来测试集提点欠佳,所以最后没用。

      • gridmask增强(最终提交版本未采用)

        生成一个和原图相同分辨率的mask(每个grid上全为0或全为1),然后将该mask与原图相乘得到一个图像。提点欠佳,所以没采用。

      • 类别平衡采样(最终提交版本未采用)

        使用类别平衡采样后,效果不是很好,这可能是因为数据集本身没有严重的类别不均衡。下面是我们统计的每个类别在图片中出现的频率。

        红灯 直行线 左转线 禁止行驶 禁止停车
        频率 0.356 0.228 0.201 0.257 0.485
  • 多尺度测试

    • 多尺度测试图片尺寸

      最后提交版本(2100,2700),(2100,2800),(2400,3200),如果继续增加尺度,map还会继续提高。

    • topk—nms

      对上述三个尺度生成的结果先进行nms,再将得到的结果框与剩下所有框进行topk—nms(保留与当前结果框iou大于0.85的topk的框,把这些框的坐标进行融合),参数设置vote_thresh=0.85, k=5。

  • 网络结构

    • 加上增强后,backbone从res50到res101再到resx101有稳定涨点。

    • 我们还在backbone部分尝试了dcn和gcnet,验证集收效甚微,最终没有采用。

模型训练与测试

  • 数据集位置
/path/to/ 
    |->traffic   
    |    |images     
    |    |annotations->|train.json     
    |    |             |val.json     
    |    |             |test.json      
  • 训练测试

在加上增强后,我们训练了36个epoch。

pip3 install --user -r requirements.txt

export PYTHONPATH=your_path/trafficsign:$PYTHONPATH

cd weights && wget https://data.megengine.org.cn/models/weights/atss_resx101_coco_2x_800size_45dot6_b3a91b36.pkl

python3 tools/train.py -n 4 -b 2 -f configs/atss_resx101_final.py -d your_datasetpath -w weights/atss_resx101_coco_2x_800size_45dot6_b3a91b36.pkl

python3 tools/test_final.py -n 4 -se 35 -f configs/atss_resx101_final.py -d your_datasetpath 

(-n 能抢到几张卡就写几吧qaq)

备注

以上提到的所有方法,无论最终是否采用,代码中均有实现。

感谢

https://github.com/MegEngine/Models/tree/master/official/vision/detection

https://github.com/MegEngine/YOLOX

Deep Learning Pipelines for Apache Spark

Deep Learning Pipelines for Apache Spark The repo only contains HorovodRunner code for local CI and API docs. To use HorovodRunner for distributed tra

Databricks 2k Jan 08, 2023
Use evolutionary algorithms instead of gridsearch in scikit-learn

sklearn-deap Use evolutionary algorithms instead of gridsearch in scikit-learn. This allows you to reduce the time required to find the best parameter

rsteca 709 Jan 03, 2023
EfficientDet (Scalable and Efficient Object Detection) implementation in Keras and Tensorflow

EfficientDet This is an implementation of EfficientDet for object detection on Keras and Tensorflow. The project is based on the official implementati

1.3k Dec 19, 2022
The Most Efficient Temporal Difference Learning Framework for 2048

moporgic/TDL2048+ TDL2048+ is a highly optimized temporal difference (TD) learning framework for 2048. Features Many common methods related to 2048 ar

Hung Guei 5 Nov 23, 2022
The NEOSSat is a dual-mission microsatellite designed to detect potentially hazardous Earth-orbit-crossing asteroids and track objects that reside in deep space

The NEOSSat is a dual-mission microsatellite designed to detect potentially hazardous Earth-orbit-crossing asteroids and track objects that reside in deep space

John Salib 2 Jan 30, 2022
This repository contains the source code for the paper Tutorial on amortized optimization for learning to optimize over continuous domains by Brandon Amos

Tutorial on Amortized Optimization This repository contains the source code for the paper Tutorial on amortized optimization for learning to optimize

Meta Research 144 Dec 26, 2022
Deep learning PyTorch library for time series forecasting, classification, and anomaly detection

Deep learning for time series forecasting Flow forecast is an open-source deep learning for time series forecasting framework. It provides all the lat

AIStream 1.2k Jan 04, 2023
Official PyTorch implementation of the paper: DeepSIM: Image Shape Manipulation from a Single Augmented Training Sample

DeepSIM: Image Shape Manipulation from a Single Augmented Training Sample (ICCV 2021 Oral) Project | Paper Official PyTorch implementation of the pape

Eliahu Horwitz 393 Dec 22, 2022
Biomarker identification for COVID-19 Severity in BALF cells Single-cell RNA-seq data

scBALF Covid-19 dataset Analysis Here is the Github page that has the codes for the bioinformatics pipeline described in the paper COVID-Datathon: Bio

Nami Niyakan 2 May 21, 2022
Code for GNMR in ICDE 2021

GNMR Code for GNMR in ICDE 2021 Please unzip data files in Datasets/MultiInt-ML10M first. Run labcode_preSamp.py (with graph sampling) for ECommerce-c

7 Oct 27, 2022
PixelPick This is an official implementation of the paper "All you need are a few pixels: semantic segmentation with PixelPick."

PixelPick This is an official implementation of the paper "All you need are a few pixels: semantic segmentation with PixelPick." [Project page] [Paper

Gyungin Shin 59 Sep 25, 2022
The Malware Open-source Threat Intelligence Family dataset contains 3,095 disarmed PE malware samples from 454 families

MOTIF Dataset The Malware Open-source Threat Intelligence Family (MOTIF) dataset contains 3,095 disarmed PE malware samples from 454 families, labeled

Booz Allen Hamilton 112 Dec 13, 2022
A model that attempts to learn and benefit from data collected on card counting.

A model that attempts to learn and benefit from data collected on card counting. A decision tree like model is built to win more often than loose and increase the bet of the player appropriately to c

1 Dec 17, 2021
StarGANv2-VC: A Diverse, Unsupervised, Non-parallel Framework for Natural-Sounding Voice Conversion

StarGANv2-VC: A Diverse, Unsupervised, Non-parallel Framework for Natural-Sounding Voice Conversion Yinghao Aaron Li, Ali Zare, Nima Mesgarani We pres

Aaron (Yinghao) Li 282 Jan 01, 2023
Brax is a differentiable physics engine that simulates environments made up of rigid bodies, joints, and actuators

Brax is a differentiable physics engine that simulates environments made up of rigid bodies, joints, and actuators. It's also a suite of learning algorithms to train agents to operate in these enviro

Google 1.5k Jan 02, 2023
Official source code of paper 'IterMVS: Iterative Probability Estimation for Efficient Multi-View Stereo'

IterMVS official source code of paper 'IterMVS: Iterative Probability Estimation for Efficient Multi-View Stereo' Introduction IterMVS is a novel lear

Fangjinhua Wang 127 Jan 04, 2023
A very simple baseline to estimate 2D & 3D SMPL-compatible keypoints from a single color image.

Minimal Body A very simple baseline to estimate 2D & 3D SMPL-compatible keypoints from a single color image. The model file is only 51.2 MB and runs a

Yuxiao Zhou 49 Dec 05, 2022
Unofficial PyTorch code for BasicVSR

Dependencies and Installation The code is based on BasicSR, Please install the BasicSR framework first. Pytorch=1.51 Training cd ./code CUDA_VISIBLE_

Long 59 Dec 06, 2022
A Human-in-the-Loop workflow for creating HD images from text

A Human-in-the-Loop? workflow for creating HD images from text DALL·E Flow is an interactive workflow for generating high-definition images from text

Jina AI 2.5k Jan 02, 2023