NanoDet-Plus⚡Super fast and lightweight anchor-free object detection model. 🔥Only 980 KB (INT8) / 1.8 MB (FP16) and runs at 97 FPS on a cellphone🔥

Overview

NanoDet-Plus

Super fast and high accuracy lightweight anchor-free object detection model. Real-time on mobile devices.


  • Super lightweight: Model file is only 980KB (INT8) or 1.8MB (FP16).
  • Super fast: 97 FPS (10.23ms) on a mobile ARM CPU.
  • 👍 High accuracy: Up to 34.3 mAP (val, 0.5:0.95) while still real-time on CPU.
  • 🤗 Training friendly: Much lower GPU memory cost than other models. Batch size 80 fits on a GTX 1060 6G.
  • 😎 Easy to deploy: Supports various backends including ncnn, MNN and OpenVINO. An Android demo based on the ncnn inference framework is also provided.
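
The frame rate quoted above follows directly from the per-image latency. A minimal Python sketch of that conversion (the function name is illustrative and not part of NanoDet):

```python
def latency_ms_to_fps(latency_ms: float) -> float:
    """Convert a per-image latency in milliseconds to frames per second."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms


# 10.23 ms per image lands between 97 and 98 frames per second,
# which the feature list rounds down to 97 FPS.
fps = latency_ms_to_fps(10.23)
print(f"{fps:.2f} FPS")
```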

Introduction

NanoDet is an FCOS-style one-stage anchor-free object detection model that uses Generalized Focal Loss as its classification and regression loss.
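
As a rough illustration of the classification half of Generalized Focal Loss, the sketch below evaluates Quality Focal Loss for a single class score, following the formula from the Generalized Focal Loss paper: the cross entropy against a soft quality target, scaled by |target - prediction|^beta. It is a scalar teaching example, not NanoDet's implementation; the function name is made up, and beta=2.0 is taken from the loss_qfl entries in the sample configs.

```python
import math


def quality_focal_loss(pred_prob: float, quality: float, beta: float = 2.0) -> float:
    """Quality Focal Loss for one class score.

    pred_prob: sigmoid output of the classifier, in (0, 1).
    quality:   soft target in [0, 1] (IoU of the predicted box for a
               positive sample, 0 for a negative sample).
    beta:      exponent of the modulating factor.
    """
    eps = 1e-12  # keep log() away from 0
    p = min(max(pred_prob, eps), 1.0 - eps)
    cross_entropy = -((1.0 - quality) * math.log(1.0 - p) + quality * math.log(p))
    return abs(quality - p) ** beta * cross_entropy


# The loss vanishes when the score matches the target quality and grows
# as the score moves away from it.
for score in (0.9, 0.5, 0.1):
    print(score, quality_focal_loss(score, quality=0.9))
```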

In NanoDet-Plus, we propose a novel label assignment strategy with a simple assign guidance module (AGM) and a dynamic soft label assigner (DSLA) to solve the optimal label assignment problem in lightweight model training. We also introduce a light feature pyramid called Ghost-PAN to enhance multi-layer feature fusion. These improvements boost the previous NanoDet's detection accuracy by 7 mAP on the COCO dataset.

NanoDet-Plus introduction in Chinese (Zhihu)

NanoDet introduction in Chinese (Zhihu)

QQ discussion group: 908606542 (answer to the join question: 炼丹)


Benchmarks

Model                Resolution  mAP val 0.5:0.95  CPU Latency (i7-8700)  ARM Latency (4xA76)  FLOPS  Params  Model Size
NanoDet-m            320*320     20.6              4.98ms                 10.23ms              0.72G  0.95M   1.8MB(FP16) | 980KB(INT8)
NanoDet-Plus-m       320*320     27.0              5.25ms                 11.97ms              0.9G   1.17M   2.3MB(FP16) | 1.2MB(INT8)
NanoDet-Plus-m       416*416     30.4              8.32ms                 19.77ms              1.52G  1.17M   2.3MB(FP16) | 1.2MB(INT8)
NanoDet-Plus-m-1.5x  320*320     29.9              7.21ms                 15.90ms              1.75G  2.44M   4.7MB(FP16) | 2.3MB(INT8)
NanoDet-Plus-m-1.5x  416*416     34.1              11.50ms                25.49ms              2.97G  2.44M   4.7MB(FP16) | 2.3MB(INT8)
YOLOv3-Tiny          416*416     16.6              -                      37.6ms               5.62G  8.86M   33.7MB
YOLOv4-Tiny          416*416     21.7              -                      32.81ms              6.96G  6.06M   23.0MB
YOLOX-Nano           416*416     25.8              -                      23.08ms              1.08G  0.91M   1.8MB(FP16)
YOLOv5-n             640*640     28.4              -                      44.39ms              4.5G   1.9M    3.8MB(FP16)
FBNetV5              320*640     30.4              -                      -                    1.8G   -       -
MobileDet            320*320     25.6              -                      -                    0.9G   -       -

Download pre-trained models and find more models in Model Zoo or in Release Files

Notes
  • ARM performance is measured on a Kirin 980 (4xA76+4xA55) ARM CPU using ncnn. You can test latency on your own phone with ncnn_android_benchmark.

  • Intel CPU performance is measured on an Intel Core i7-8700 using OpenVINO.

  • NanoDet mAP (0.5:0.95) is validated on the COCO val2017 dataset with no test-time augmentation.

  • YOLOv3 and YOLOv4 mAP values are taken from Scaled-YOLOv4: Scaling Cross Stage Partial Network.
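
The model sizes in the benchmark table are close to what the parameter counts predict at 2 bytes per parameter for FP16 and 1 byte for INT8. The sketch below is only a back-of-the-envelope estimate under that assumption (the function name is illustrative); real model files also carry headers and may keep some tensors at higher precision, so the published sizes differ slightly.

```python
def estimated_weight_size_mb(num_params: float, bytes_per_param: int) -> float:
    """Estimate weight storage in megabytes (1 MB = 1e6 bytes)."""
    return num_params * bytes_per_param / 1e6


# NanoDet-Plus-m is listed with 1.17M parameters.
params = 1.17e6
fp16_mb = estimated_weight_size_mb(params, bytes_per_param=2)
int8_mb = estimated_weight_size_mb(params, bytes_per_param=1)
print(f"FP16: about {fp16_mb:.2f} MB, INT8: about {int8_mb:.2f} MB")
```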


NEWS!!!

  • [2021.12.25] NanoDet-Plus released! It adds AGM (Assign Guidance Module) and DSLA (Dynamic Soft Label Assigner), improving accuracy by 7 mAP at only a small cost.

Find more update notes in Update notes.

Demo

Android demo

The Android demo project is in the demo_android_ncnn folder. Please refer to the Android demo guide.

Here is a better implementation 👉 ncnn-android-nanodet

NCNN C++ demo

The C++ demo based on ncnn is in the demo_ncnn folder. Please refer to the Cpp demo guide.

MNN demo

Inference using Alibaba's MNN framework is in the demo_mnn folder. Please refer to the MNN demo guide.

OpenVINO demo

Inference using OpenVINO is in the demo_openvino folder. Please refer to the OpenVINO demo guide.

Web browser demo

https://nihui.github.io/ncnn-webassembly-nanodet/

Pytorch demo

First, install the requirements and set up NanoDet following the installation guide. Then download the COCO pre-trained weight:

👉 COCO pretrain checkpoint

The pre-trained weight was trained with the config config/nanodet-plus-m_416.yml.

  • Inference on images
python demo/demo.py image --config CONFIG_PATH --model MODEL_PATH --path IMAGE_PATH
  • Inference on video
python demo/demo.py video --config CONFIG_PATH --model MODEL_PATH --path VIDEO_PATH
  • Inference from a webcam
python demo/demo.py webcam --config CONFIG_PATH --model MODEL_PATH --camid YOUR_CAMERA_ID

In addition, we provide a notebook here to demonstrate how to make it work with PyTorch.
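
When the demo is driven from another script, the three command forms above can be assembled programmatically. The helper below is a hypothetical convenience wrapper, not part of NanoDet; it only rebuilds the documented arguments, and the weight and image file names are placeholders.

```python
import shlex
import sys
from typing import List, Optional


def build_demo_command(mode: str, config: str, model: str,
                       path: Optional[str] = None,
                       camid: Optional[int] = None) -> List[str]:
    """Build the argument list for demo/demo.py in image, video or webcam mode."""
    if mode not in ("image", "video", "webcam"):
        raise ValueError(f"unknown mode: {mode!r}")
    cmd = [sys.executable, "demo/demo.py", mode, "--config", config, "--model", model]
    if mode == "webcam":
        if camid is None:
            raise ValueError("webcam mode needs camid")
        cmd += ["--camid", str(camid)]
    else:
        if path is None:
            raise ValueError(f"{mode} mode needs path")
        cmd += ["--path", path]
    return cmd


cmd = build_demo_command(
    "image",
    config="config/nanodet-plus-m_416.yml",
    model="nanodet-plus-m_416.pth",  # placeholder weight file name
    path="street.jpg",               # placeholder image
)
print(shlex.join(cmd))
# To actually run it from the repository root:
#   import subprocess; subprocess.run(cmd, check=True)
```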


Install

Requirements

  • Linux or macOS
  • CUDA >= 10.0
  • Python >= 3.6
  • PyTorch >= 1.7
  • Windows is experimentally supported (note: Windows does not support distributed training before PyTorch 1.7)

Steps

  1. Create a conda virtual environment and then activate it.
 conda create -n nanodet python=3.8 -y
 conda activate nanodet
  2. Install PyTorch
conda install pytorch torchvision cudatoolkit=11.1 -c pytorch -c conda-forge
  3. Install requirements
pip install Cython termcolor numpy tensorboard pycocotools matplotlib pyaml opencv-python tqdm pytorch-lightning torchmetrics
  4. Set up NanoDet
git clone https://github.com/RangiLyu/nanodet.git
cd nanodet
python setup.py develop
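
After installing, a quick check against the requirements listed above can catch version mismatches early. This is a minimal sketch rather than an official NanoDet tool; the helper names are made up, and the PyTorch check is skipped if PyTorch cannot be imported.

```python
import re
import sys
from typing import Tuple


def version_tuple(version: str) -> Tuple[int, ...]:
    """Parse the leading numeric part of a version, e.g. '1.7.1+cu110' -> (1, 7, 1)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"cannot parse version: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed: str, minimum: str) -> bool:
    """Return True if `installed` is at least `minimum`."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '1.7' and '1.7.0' compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


print("Python >= 3.6:", sys.version_info >= (3, 6))
try:
    import torch
    print("PyTorch >= 1.7:", meets_minimum(torch.__version__, "1.7"))
    print("CUDA available:", torch.cuda.is_available())
except ImportError:
    print("PyTorch is not installed")
```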

Model Zoo

NanoDet supports a variety of backbones. Go to the config folder to see the sample training config files.

Model Backbone Resolution COCO mAP FLOPS Params Pre-train weight
NanoDet-m ShuffleNetV2 1.0x 320*320 20.6 0.72G 0.95M Download
NanoDet-Plus-m-320 (NEW) ShuffleNetV2 1.0x 320*320 27.0 0.9G 1.17M Weight | Checkpoint
NanoDet-Plus-m-416 (NEW) ShuffleNetV2 1.0x 416*416 30.4 1.52G 1.17M Weight | Checkpoint
NanoDet-Plus-m-1.5x-320 (NEW) ShuffleNetV2 1.5x 320*320 29.9 1.75G 2.44M Weight | Checkpoint
NanoDet-Plus-m-1.5x-416 (NEW) ShuffleNetV2 1.5x 416*416 34.1 2.97G 2.44M Weight | Checkpoint

Notice: A Weight file contains only the parameters needed at inference time, while a Checkpoint also contains the training-time parameters.

Legacy Model Zoo

Model Backbone Resolution COCO mAP FLOPS Params Pre-train weight
NanoDet-m-416 ShuffleNetV2 1.0x 416*416 23.5 1.2G 0.95M Download
NanoDet-m-1.5x ShuffleNetV2 1.5x 320*320 23.5 1.44G 2.08M Download
NanoDet-m-1.5x-416 ShuffleNetV2 1.5x 416*416 26.8 2.42G 2.08M Download
NanoDet-m-0.5x ShuffleNetV2 0.5x 320*320 13.5 0.3G 0.28M Download
NanoDet-t ShuffleNetV2 1.0x 320*320 21.7 0.96G 1.36M Download
NanoDet-g Custom CSP Net 416*416 22.9 4.2G 3.81M Download
NanoDet-EfficientLite EfficientNet-Lite0 320*320 24.7 1.72G 3.11M Download
NanoDet-EfficientLite EfficientNet-Lite1 416*416 30.3 4.06G 4.01M Download
NanoDet-EfficientLite EfficientNet-Lite2 512*512 32.6 7.12G 4.71M Download
NanoDet-RepVGG RepVGG-A0 416*416 27.8 11.3G 6.75M Download

How to Train

  1. Prepare dataset

    If your dataset annotations are in Pascal VOC XML format, refer to config/nanodet_custom_xml_dataset.yml

    Otherwise, convert your dataset annotations to MS COCO format (COCO annotation format details).

  2. Prepare config file

    Copy and modify an example yml config file in the config/ folder.

    Change save_dir to where you want to save the model.

    Change num_classes in model->arch->head.

    Change the image path and annotation path in both data->train and data->val.

    Set the GPU ids, number of workers and batch size in device to fit your hardware.

    Set total_epochs, lr and lr_schedule according to your dataset and batch size.

    If you want to modify the network, data augmentation or other settings, please refer to Config File Detail

  3. Start training

    NanoDet now uses PyTorch Lightning for training.

    For both single-GPU and multi-GPU training, run:

    python tools/train.py CONFIG_FILE_PATH
  4. Visualize Logs

    TensorBoard logs are saved in the save_dir set in your config file.

    To visualize the TensorBoard logs, run:

    cd <YOUR_SAVE_DIR>
    tensorboard --logdir ./
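
The config edits from step 2 can also be applied in code once the yml file has been loaded into a nested dictionary (for example with a YAML parser). The sketch below works on a plain dict and is not part of NanoDet: the function name and the tiny base config are illustrative, while the keys mirror the sample configs (save_dir, model->arch->head->num_classes, the training-only aux_head used by NanoDet-Plus, data->train, data->val and class_names).

```python
import copy
from typing import Any, Dict, List


def customize_config(base: Dict[str, Any], save_dir: str, class_names: List[str],
                     train_images: str, train_ann: str,
                     val_images: str, val_ann: str) -> Dict[str, Any]:
    """Return a copy of `base` adapted to a custom dataset."""
    cfg = copy.deepcopy(base)  # leave the template untouched
    cfg["save_dir"] = save_dir
    arch = cfg["model"]["arch"]
    arch["head"]["num_classes"] = len(class_names)
    if "aux_head" in arch:  # present in NanoDet-Plus configs
        arch["aux_head"]["num_classes"] = len(class_names)
    cfg["data"]["train"].update(img_path=train_images, ann_path=train_ann)
    cfg["data"]["val"].update(img_path=val_images, ann_path=val_ann)
    cfg["class_names"] = list(class_names)
    return cfg


# A stripped-down stand-in for a loaded config file (paths are placeholders).
base = {
    "save_dir": "workspace/nanodet-plus-m_416",
    "model": {"arch": {"head": {"num_classes": 80}, "aux_head": {"num_classes": 80}}},
    "data": {
        "train": {"img_path": "coco/train2017", "ann_path": "coco/train.json"},
        "val": {"img_path": "coco/val2017", "ann_path": "coco/val.json"},
    },
    "class_names": [],
}
cfg = customize_config(
    base, save_dir="workspace/person", class_names=["person"],
    train_images="person/train", train_ann="person/train.json",
    val_images="person/val", val_ann="person/val.json",
)
print(cfg["model"]["arch"]["head"]["num_classes"], cfg["class_names"])
```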

How to Deploy

NanoDet provides multi-backend C++ demos for ncnn, OpenVINO and MNN. There is also an Android demo based on the ncnn library.

Export model to ONNX

To convert a NanoDet PyTorch model to ncnn, go through ONNX: pytorch->onnx->ncnn

To export the ONNX model, run tools/export_onnx.py.

python tools/export_onnx.py --cfg_path ${CONFIG_PATH} --model_path ${PYTORCH_MODEL_PATH}
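
The C++ demos address the network's input and output blobs by name, so after the ONNX to ncnn conversion it can be worth checking that the expected names survived. The sketch below is an illustration, not part of NanoDet: it assumes ncnn's plain-text .param layout (a magic number line, a count line, then one layer per line as type, name, input count, output count, blob names, parameters), and the helper names and the tiny sample graph are made up for the example.

```python
from typing import Iterable, List, Set


def ncnn_blob_names(param_text: str) -> Set[str]:
    """Collect every blob name referenced in a plain-text ncnn .param file."""
    rows = [line.split() for line in param_text.splitlines() if line.strip()]
    names: Set[str] = set()
    # rows[0] is the magic number, rows[1] holds the layer and blob counts.
    for fields in rows[2:]:
        num_inputs, num_outputs = int(fields[2]), int(fields[3])
        names.update(fields[4:4 + num_inputs + num_outputs])
    return names


def missing_blobs(param_text: str, required: Iterable[str]) -> List[str]:
    """Return the required blob names that do not appear in the .param text."""
    present = ncnn_blob_names(param_text)
    return [name for name in required if name not in present]


# Made-up three-layer graph standing in for a converted model.
sample_param = """7767517
3 3
Input input.1 0 1 input.1
Convolution conv1 1 1 input.1 feat 0=16 1=3
Sigmoid cls_pred_stride_8 1 1 feat cls_pred_stride_8
"""
missing = missing_blobs(sample_param, ["input.1", "cls_pred_stride_8", "dis_pred_stride_8"])
print("missing blobs:", missing)
```

For a real model, read the text of the generated .param file and pass the blob names your C++ code requests.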

Run NanoDet in C++ with inference libraries

ncnn

Please refer to demo_ncnn.

OpenVINO

Please refer to demo_openvino.

MNN

Please refer to demo_mnn.

Run NanoDet on Android

Please refer to android_demo.


Citation

If you find this project useful in your research, please consider citing:

@misc{nanodet,
    title={NanoDet-Plus: Super fast and high accuracy lightweight anchor-free object detection model.},
    author={RangiLyu},
    howpublished = {\url{https://github.com/RangiLyu/nanodet}},
    year={2021}
}

Thanks

https://github.com/Tencent/ncnn

https://github.com/open-mmlab/mmdetection

https://github.com/implus/GFocal

https://github.com/cmdbug/YOLOv5_NCNN

https://github.com/rbgirshick/yacs

Comments
  • Error when testing starts after 10 training epochs: list object has no attribute cpu

    File "nanodet-main/nanodet/trainer/trainer.py", line 89, in run_epoch
      results[meta['img_info']['id'].cpu().numpy()[0]] = dets
    AttributeError: 'list' object has no attribute 'cpu'

    opened by DL-Practise 16
  • Training nanodet from scratch

    Hi, I'm training the NanoDet-m model (ShuffleNetV2 1.0x | 320*320) from scratch on the COCO dataset with 4 GeForce RTX 2080 Ti GPUs. Convergence seems pretty slow; it could take 1-2 weeks.

    May I ask how long it took you to reach 20.6 mAP, and which setup you used?

    Thank you.

    bug help wanted 
    opened by Cloudz333 10
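  • Questions about project deployment

    Hello, I have two questions:

    1. In the function NanoDet::detect(cv::Mat image, float score_threshold, float nms_threshold) in nanodet.cpp, data is fed to the model with ex.input("input.1", input);. What does input.1 mean here? Is it the name of the input layer? How can I find this name from PyTorch? I did not see any layer names after print(model). Most of the examples in Tencent/ncnn/tree/master/examples use ex.input("input", input);. If I load a model I trained myself, how should this name be matched?
    2. In nanodet.h there is a std::vector<HeadInfo> heads_info. What exactly do its values mean? Are they related to the network outputs?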
        std::vector<HeadInfo> heads_info{
            // cls_pred|dis_pred|stride
                {"792", "795",    8},
                {"814", "817",   16},
                {"836", "839",   32},
        };
    

    I am not very familiar with PyTorch or the NanoDet network, so please bear with me.

    opened by busyyang 8
  • A small problem when running demo.py

    My environment: cuda==10.1 pytorch==1.7 torchvision==0.8.0. When I run "python demo/demo.py image --config CONFIG_PATH --model MODEL_PATH --path IMAGE_PATH" to run inference on an image, this error occurs: RuntimeError: Could not run 'torchvision::nms' with arguments from the 'CUDA' backend. 'torchvision::nms' is only available for these backends: [CPU, BackendSelect, Named, AutogradOther, AutogradCPU, AutogradCUDA, AutogradXLA, Tracer, Autocast, Batched, VmapMode].

    CPU: registered at /root/project/torchvision/csrc/vision.cpp:59 [kernel]
    BackendSelect: fallthrough registered at /pytorch/aten/src/ATen/core/BackendSelectFallbackKernel.cpp:3 [backend fallback]
    Named: registered at /pytorch/aten/src/ATen/core/NamedRegistrations.cpp:7 [backend fallback]
    AutogradOther: fallthrough registered at /pytorch/aten/src/ATen/core/VariableFallbackKernel.cpp:35 [backend fallback]
    AutogradCPU: fallthrough registered at /pytorch/aten/src/ATen/core/VariableFallbackKernel.cpp:39 [backend fallback]
    AutogradCUDA: fallthrough registered at /pytorch/aten/src/ATen/core/VariableFallbackKernel.cpp:43 [backend fallback]
    AutogradXLA: fallthrough registered at /pytorch/aten/src/ATen/core/VariableFallbackKernel.cpp:47 [backend fallback]
    Tracer: fallthrough registered at /pytorch/torch/csrc/jit/frontend/tracer.cpp:967 [backend fallback]
    Autocast: fallthrough registered at /pytorch/aten/src/ATen/autocast_mode.cpp:254 [backend fallback]
    Batched: registered at /pytorch/aten/src/ATen/BatchingRegistrations.cpp:511 [backend fallback]
    VmapMode: fallthrough registered at /pytorch/aten/src/ATen/VmapModeRegistrations.cpp:33 [backend fallback]

    However, after I modified the function batched_nms(boxes, scores, idxs, nms_cfg, class_agnostic=False) in /nanodet/nanodet/model/module/nms.py as follows:

    boxes_for_nms = boxes_for_nms.cpu()
    scores = scores.cpu()
    boxes = boxes.cpu()
    split_thr = nms_cfg_.pop('split_thr', 10000)
    if len(boxes_for_nms) < split_thr:
        # dets, keep = nms_op(boxes_for_nms, scores, **nms_cfg_)
        keep = nms(boxes_for_nms, scores, **nms_cfg_)
        boxes = boxes[keep]
        # scores = dets[:, -1]
        scores = scores[keep]
    

    demo.py runs normally.

    opened by lidongliang666 8
  • Results got worse after adding mosaic augmentation. What could be the reason?

    coco.py

    if self.load_mosaic and not isval:
                img4, labels4, bbox4 = load_mosaic(self, idx)
                meta['img_info']['height'] = img4.shape[0]
                meta['img_info']['width'] = img4.shape[1]
                meta['img'] = img4
                meta['gt_labels'] = labels4
                meta['gt_bboxes'] = bbox4
    
    
            meta = self.pipeline(self, meta, input_size)
    
            meta["img"] = torch.from_numpy(meta["img"].transpose(2, 0, 1))
            return meta
    

    The bboxes printed while testing inside ShapeTransform look correct:

    meta_data["img"] = img
            meta_data["warp_matrix"] = M
            if "gt_bboxes" in meta_data:
                boxes = meta_data["gt_bboxes"]
                meta_data["gt_bboxes"] = warp_boxes(boxes, M, dst_shape[0], dst_shape[1])
            if "gt_masks" in meta_data:
                for i, mask in enumerate(meta_data["gt_masks"]):
                    meta_data["gt_masks"][i] = cv2.warpPerspective(
                        mask, M, dsize=tuple(dst_shape)
                    )
            for i in range(meta_data["gt_bboxes"].shape[0]):
                cv2.rectangle(img, (int(meta_data["gt_bboxes"][i][0]), int(meta_data["gt_bboxes"][i][1])), (int(meta_data["gt_bboxes"][i][2]), int(meta_data["gt_bboxes"][i][3])), (255,0,0), 2)
            cv2.imwrite('./%d.jpg' % int(meta_data["gt_bboxes"][0][0]), img)
    

    What could be causing this?

    opened by Rokuki 6
  • Cannot find blob with name: dis_pred_stride_8

    I converted the pre-trained model and tested it with demo_ncnn and demo_openvino. The conversion went fine in both cases, but prediction fails. How can this be solved?

    # demo_ncnn
    find_blob_index_by_name input.1 failed
    Try
    find_blob_index_by_name dis_pred_stride_8 failed
    Try
    find_blob_index_by_name cls_pred_stride_8 failed
    
    # demo_openvino
    start init model
    success
    terminate called after throwing an instance of 'InferenceEngine::details::InferenceEngineException'
    what(): Cannot find blob with name: dis_pred_stride_8
    

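    I found that the ONNX model contains nodes such as dis_pred_stride_8, but these nodes are missing from the converted ncnn model. ONNX network structure: [image: onnx]. ncnn network structure: [image: ncnn].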

    opened by TTMRonald 6
  • Cannot find blob with name: 795

    I converted the NanoDet-EfficientLite 512x512 model with OpenVINO version 2021.3.394. The conversion succeeds and the model loads in the program, but inference fails with the following log:

    start init model
    success
    terminate called after throwing an instance of 'InferenceEngine::details::InferenceEngineException'
    what(): Cannot find blob with name: 795

    Has anyone else run into this?

    opened by deep-practice 6
  • CoreML export failure: 'ConvModule' object has no attribute 'norm'

    Hi, I tried to convert nanodet-m.pth to CoreML for iOS. I used coremltools following the guide and got the error "CoreML export failure: 'ConvModule' object has no attribute 'norm'". I read the NanoDet source code and found that the norm in the head is BN, which should be supported by CoreML, so I do not know why the error happens. Has anyone tried CoreML? Thanks!

    opened by ghoshaw 6
  • No result while using single-class nano model in ncnn

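    Hi, I trained a NanoDet model for the person class, converted it to ONNX with tool/export.py, and then converted it to an ncnn model. However, the ncnn model produces no output. I changed the classes and the image size in the cpp code. I don't know whether the problem is in the ONNX export or in the ONNX->ncnn conversion. Below is the cfg I used for training: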

    #Config File example
    save_dir: workspace/nanodet_m
    model:
      arch:
        name: GFL
        backbone:
          name: ShuffleNetV2
          model_size: 1.0x
          out_stages: [2,3,4]
          activation: LeakyReLU
        fpn:
          name: PAN
          in_channels: [116, 232, 464]
          out_channels: 96
          start_level: 0
          num_outs: 3
        head:
          name: NanoDetHead
          num_classes: 1
          input_channel: 96
          feat_channels: 96
          stacked_convs: 2
          share_cls_reg: True
          octave_base_scale: 5
          scales_per_octave: 1
          strides: [8, 16, 32]
          reg_max: 7
          norm_cfg:
            type: BN
          loss:
            loss_qfl:
              name: QualityFocalLoss
              use_sigmoid: True
              beta: 2.0
              loss_weight: 1.0
            loss_dfl:
              name: DistributionFocalLoss
              loss_weight: 0.25
            loss_bbox:
              name: GIoULoss
              loss_weight: 2.0
    data:
      train:
        name: coco
        img_path: ../data/yoga_coco/images/train2017
        ann_path: ../data/yoga_coco/annotations/instances_train2017.json
        input_size: [416,416] #[w,h]
        keep_ratio: True
        pipeline:
          perspective: 0.0
          scale: [0.6, 1.4]
          stretch: [[1, 1], [1, 1]]
          rotation: 0
          shear: 0
          translate: 0.2
          flip: 0.5
          brightness: 0.2
          contrast: [0.8, 1.2]
          saturation: [0.8, 1.2]
          normalize: [[103.53, 116.28, 123.675], [57.375, 57.12, 58.395]]
      val:
        name: coco
        img_path: ../data/yoga_coco/images/val2017
        ann_path: ../data/yoga_coco/annotations/instances_val2017.json
        input_size: [416,416] #[w,h]
        keep_ratio: True
        pipeline:
          normalize: [[103.53, 116.28, 123.675], [57.375, 57.12, 58.395]]
    device:
      gpu_ids: [0]
      workers_per_gpu: 6
      batchsize_per_gpu: 40
    schedule:
    #  resume:
    #  load_model: YOUR_MODEL_PATH
      optimizer:
        name: SGD
        lr: 0.14
        momentum: 0.9
        weight_decay: 0.0001
      warmup:
        name: linear
        steps: 300
        ratio: 0.1
      total_epochs: 50
      lr_schedule:
        name: MultiStepLR
        milestones: [130,160,175,185]
        gamma: 0.1
      val_intervals: 10
    evaluator:
      name: CocoDetectionEvaluator
      save_key: mAP
    
    log:
      interval: 10
    
    class_names: ['person',]
    

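    When I use the 80-class model, the converted ncnn model produces results, so I would like to ask whether any other configuration needs to be changed when converting a single-class model.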

    opened by Sean-hku 6
  • Problem converting pth to ONNX to ncnn

    Hello, I converted the PyTorch model to ONNX and then to an ncnn model, but the detection results from the ncnn model are wrong. I made these changes: I set the val input size in the config to 64x64, and I set the input size in tools/export.py to 64x64. The commands I ran were:

    python tools/export.py
    python -m onnxsim output.onnx output-sim.onnx
    build/tools/onnx/onnx2ncnn output-sim.onnx output-sim.param output-sim.bin
    build/tools/ncnnoptimize output-sim.param output-sim.bin new-output-sim.param new-output-sim.bin 0

    Versions: pytorch 1.7.1, onnx 1.8.0, onnx-simplifier 0.2.19, onnxoptimizer 0.1.1, onnxruntime 1.6.0

    Is there anything wrong with these steps?

    opened by yhl41001 6
  • original pytorch or onnx model

    Could you please provide pretrained pytorch or onnx model weights also? I noticed you only shared converted ncnn models, but I would like to see the speed of inference on gpu/npu accelerated systems

    opened by kadirbeytorun 6
  •  python tools/train.py  config/nanodet-plus-m_320.yml

    Tried to run: python tools/train.py config/nanodet-plus-m_320.yml

    Error:

    pytorch_lightning.utilities.cloud_io.get_filesystem has been deprecated in v1.8.0 and will be"
    [NanoDet][01-04 10:28:00]INFO:Setting up data...
    loading annotations into memory...
    Done (t=18.55s)
    creating index...
    index created!
    loading annotations into memory...
    Done (t=0.56s)
    creating index...
    index created!
    [NanoDet][01-04 10:28:21]INFO:Creating model...
    model size is 1.0x
    init weights...
    => loading pretrained model https://download.pytorch.org/models/shufflenetv2_x1-5666bf0f80.pth
    Finish initialize NanoDet-Plus Head.
    GPU available: True (cuda), used: True
    TPU available: False, using: 0 TPU cores
    IPU available: False, using: 0 IPUs
    HPU available: False, using: 0 HPUs
    /root/anaconda3/envs/nanodet/lib/python3.7/site-packages/torch/cuda/__init__.py:143: UserWarning: NVIDIA GeForce RTX 3090 with CUDA capability sm_86 is not compatible with the current PyTorch installation. The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_70. If you want to use the NVIDIA GeForce RTX 3090 GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/
      warnings.warn(incompatible_device_warn.format(device_name, capability, " ".join(arch_list), device_name))
    LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1]

      | Name      | Type        | Params
    0 | model     | NanoDetPlus | 4.3 M
    1 | avg_model | NanoDetPlus | 4.3 M

    8.7 M Trainable params
    0 Non-trainable params
    8.7 M Total params
    34.647 Total estimated model params size (MB)
    [NanoDet][01-04 10:28:21]INFO:Weight Averaging is enabled
    /root/anaconda3/envs/nanodet/lib/python3.7/site-packages/pytorch_lightning/trainer/connectors/data_connector.py:229: PossibleUserWarning: The dataloader, train_dataloader, does not have many workers which may be a bottleneck. Consider increasing the value of the num_workers argument (try 40 which is the number of cpus on this machine) in the `DataLoader` init to improve performance.
      category=PossibleUserWarning,
    Traceback (most recent call last):
      File "tools/train.py", line 146, in <module>
        main(args)
      File "tools/train.py", line 141, in main
        trainer.fit(task, train_dataloader, val_dataloader)
      File "/root/anaconda3/envs/nanodet/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 604, in fit
        self, self._fit_impl, model, train_dataloaders, val_dataloaders, datamodule, ckpt_path
      File "/root/anaconda3/envs/nanodet/lib/python3.7/site-packages/pytorch_lightning/trainer/call.py", line 38, in _call_and_handle_interrupt
        return trainer_fn(*args, **kwargs)
      File "/root/anaconda3/envs/nanodet/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 645, in _fit_impl
        self._run(model, ckpt_path=self.ckpt_path)
      File "/root/anaconda3/envs/nanodet/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 1098, in _run
        results = self._run_stage()
      File "/root/anaconda3/envs/nanodet/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 1177, in _run_stage
        self._run_train()
      File "/root/anaconda3/envs/nanodet/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 1200, in _run_train
        self.fit_loop.run()
      File "/root/anaconda3/envs/nanodet/lib/python3.7/site-packages/pytorch_lightning/loops/loop.py", line 194, in run
        self.on_run_start(*args, **kwargs)
      File "/root/anaconda3/envs/nanodet/lib/python3.7/site-packages/pytorch_lightning/loops/fit_loop.py", line 206, in on_run_start
        self.trainer.reset_train_dataloader(self.trainer.lightning_module)
      File "/root/anaconda3/envs/nanodet/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 1552, in reset_train_dataloader
        if has_len_all_ranks(self.train_dataloader, self.strategy, module)
      File "/root/anaconda3/envs/nanodet/lib/python3.7/site-packages/pytorch_lightning/utilities/data.py", line 110, in has_len_all_ranks
        if total_length == 0:
    RuntimeError: CUDA error: no kernel image is available for execution on the device
    CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
    For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

    Environment: python3.7 cuda==10.2 gpu==RTX 3090 Ubuntu 20.04

    Thanks

    opened by molyswu 0
  • Fails to train a model on a dataset with single class.

    I used a converted COCO 2017 dataset with only persons labeled. Here is my config:

    save_dir: workspace/nanodet-plus-m_416
    model:
      weight_averager:
        name: ExpMovingAverager
        decay: 0.9998
      arch:
        name: NanoDetPlus
        detach_epoch: 10
        backbone:
          name: ShuffleNetV2
          model_size: 1.0x
          out_stages: [2,3,4]
          activation: LeakyReLU
        fpn:
          name: GhostPAN
          in_channels: [116, 232, 464]
          out_channels: 96
          kernel_size: 5
          num_extra_level: 1
          use_depthwise: True
          activation: LeakyReLU
        head:
          name: NanoDetPlusHead
          num_classes: 1
          input_channel: 96
          feat_channels: 96
          stacked_convs: 2
          kernel_size: 5
          strides: [8, 16, 32, 64]
          activation: LeakyReLU
          reg_max: 1
          norm_cfg:
            type: BN
          loss:
            loss_qfl:
              name: QualityFocalLoss
              use_sigmoid: True
              beta: 2.0
              loss_weight: 1.0
            loss_dfl:
              name: DistributionFocalLoss
              loss_weight: 0.25
            loss_bbox:
              name: GIoULoss
              loss_weight: 2.0
        # Auxiliary head, only use in training time.
        aux_head:
          name: SimpleConvHead
          num_classes: 1
          input_channel: 192
          feat_channels: 192
          stacked_convs: 4
          strides: [8, 16, 32, 64]
          activation: LeakyReLU
          reg_max: 1
    data:
      train:
        name: CocoDataset
        img_path: /home/mosminin/fiftyone/coco_person/train/data
        ann_path: /home/mosminin/fiftyone/coco_person/train/labels.json
        input_size: [416,416] #[w,h]
        keep_ratio: False
        pipeline:
          perspective: 0.0
          scale: [0.6, 1.4]
          stretch: [[0.8, 1.2], [0.8, 1.2]]
          rotation: 0
          shear: 0
          translate: 0.2
          flip: 0.5
          brightness: 0.2
          contrast: [0.6, 1.4]
          saturation: [0.5, 1.2]
          normalize: [[103.53, 116.28, 123.675], [57.375, 57.12, 58.395]]
      val:
        name: CocoDataset
        img_path: /home/mosminin/fiftyone/coco_person/validation/data
        ann_path: /home/mosminin/fiftyone/coco_person/validation/labels.json
        input_size: [416,416] #[w,h]
        keep_ratio: False
        pipeline:
          normalize: [[103.53, 116.28, 123.675], [57.375, 57.12, 58.395]]
    device:
      gpu_ids: [0]
      workers_per_gpu: 6
      batchsize_per_gpu: 16
    schedule:
    #  resume:
    #  load_model:
      optimizer:
        name: AdamW
        lr: 0.001
        weight_decay: 0.05
      warmup:
        name: linear
        steps: 500
        ratio: 0.0001
      total_epochs: 300
      lr_schedule:
        name: CosineAnnealingLR
        T_max: 300
        eta_min: 0.00005
      val_intervals: 10
    grad_clip: 35
    evaluator:
      name: CocoDetectionEvaluator
      save_key: mAP
    log:
      interval: 50
    
    class_names: ['person']
    

    I also changed train.py to use the CPU instead of the GPU so that the errors would be easier to understand.

        # if cfg.device.gpu_ids == -1:
        #     logger.info("Using CPU training")
        #     accelerator, devices, strategy = "cpu", None, None
        # else:
        #     accelerator, devices, strategy = "gpu", cfg.device.gpu_ids, None
    
        accelerator, devices, strategy = "cpu", None, None # CPU training
    
    

    After running it, I get the following errors.

    (.venv) [email protected]:~/dev/nanodet$ python tools/train.py /home/mosminin/dev/nanodet/config/nanodet-plus-m_416_person.yml
    /home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/utilities/cloud_io.py:33: LightningDeprecationWarning: `pytorch_lightning.utilities.cloud_io.get_filesystem` has been deprecated in v1.8.0 and will be removed in v1.10.0. Please use `lightning_lite.utilities.cloud_io.get_filesystem` instead.
      rank_zero_deprecation(
    [NanoDet][12-18 14:05:30]INFO:Setting up data...
    loading annotations into memory...
    Done (t=4.35s)
    creating index...
    index created!
    loading annotations into memory...
    Done (t=0.16s)
    creating index...
    index created!
    [NanoDet][12-18 14:05:35]INFO:Creating model...
    model size is  1.0x
    init weights...
    => loading pretrained model https://download.pytorch.org/models/shufflenetv2_x1-5666bf0f80.pth
    Finish initialize NanoDet-Plus Head.
    GPU available: True (cuda), used: False
    TPU available: False, using: 0 TPU cores
    IPU available: False, using: 0 IPUs
    HPU available: False, using: 0 HPUs
    /home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/trainer/setup.py:175: PossibleUserWarning: GPU available but not used. Set `accelerator` and `devices` using `Trainer(accelerator='gpu', devices=1)`.
      rank_zero_warn(
    
      | Name      | Type        | Params
    ------------------------------------------
    0 | model     | NanoDetPlus | 4.1 M 
    1 | avg_model | NanoDetPlus | 4.1 M 
    ------------------------------------------
    8.2 M     Trainable params
    0         Non-trainable params
    8.2 M     Total params
    32.903    Total estimated model params size (MB)
    [NanoDet][12-18 14:05:35]INFO:Weight Averaging is enabled
    /home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/torch/functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:3190.)
      return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
    Traceback (most recent call last):
      File "/home/mosminin/dev/nanodet/tools/train.py", line 147, in <module>
        main(args)
      File "/home/mosminin/dev/nanodet/tools/train.py", line 142, in main
        trainer.fit(task, train_dataloader, val_dataloader)
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 603, in fit
        call._call_and_handle_interrupt(
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/trainer/call.py", line 38, in _call_and_handle_interrupt
        return trainer_fn(*args, **kwargs)
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 645, in _fit_impl
        self._run(model, ckpt_path=self.ckpt_path)
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1098, in _run
        results = self._run_stage()
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1177, in _run_stage
        self._run_train()
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1200, in _run_train
        self.fit_loop.run()
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/loops/loop.py", line 199, in run
        self.advance(*args, **kwargs)
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/loops/fit_loop.py", line 267, in advance
        self._outputs = self.epoch_loop.run(self._data_fetcher)
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/loops/loop.py", line 199, in run
        self.advance(*args, **kwargs)
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/loops/epoch/training_epoch_loop.py", line 214, in advance
        batch_output = self.batch_loop.run(kwargs)
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/loops/loop.py", line 199, in run
        self.advance(*args, **kwargs)
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/loops/batch/training_batch_loop.py", line 88, in advance
        outputs = self.optimizer_loop.run(optimizers, kwargs)
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/loops/loop.py", line 199, in run
        self.advance(*args, **kwargs)
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 200, in advance
        result = self._run_optimization(kwargs, self._optimizers[self.optim_progress.optimizer_position])
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 247, in _run_optimization
        self._optimizer_step(optimizer, opt_idx, kwargs.get("batch_idx", 0), closure)
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 357, in _optimizer_step
        self.trainer._call_lightning_module_hook(
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1342, in _call_lightning_module_hook
        output = fn(*args, **kwargs)
      File "/home/mosminin/dev/nanodet/nanodet/trainer/task.py", line 281, in optimizer_step
        optimizer.step(closure=optimizer_closure)
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/core/optimizer.py", line 169, in step
        step_output = self._strategy.optimizer_step(self._optimizer, self._optimizer_idx, closure, **kwargs)
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/strategies/strategy.py", line 234, in optimizer_step
        return self.precision_plugin.optimizer_step(
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/plugins/precision/precision_plugin.py", line 121, in optimizer_step
        return optimizer.step(closure=closure, **kwargs)
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/torch/optim/lr_scheduler.py", line 68, in wrapper
        return wrapped(*args, **kwargs)
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/torch/optim/optimizer.py", line 140, in wrapper
        out = func(*args, **kwargs)
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
        return func(*args, **kwargs)
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/torch/optim/adamw.py", line 120, in step
        loss = closure()
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/plugins/precision/precision_plugin.py", line 107, in _wrap_closure
        closure_result = closure()
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 147, in __call__
        self._result = self.closure(*args, **kwargs)
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 133, in closure
        step_output = self._step_fn()
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 406, in _training_step
        training_step_output = self.trainer._call_strategy_hook("training_step", *kwargs.values())
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1480, in _call_strategy_hook
        output = fn(*args, **kwargs)
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/strategies/strategy.py", line 378, in training_step
        return self.model.training_step(*args, **kwargs)
      File "/home/mosminin/dev/nanodet/nanodet/trainer/task.py", line 78, in training_step
        preds, loss, loss_states = self.model.forward_train(batch)
      File "/home/mosminin/dev/nanodet/nanodet/model/arch/nanodet_plus.py", line 56, in forward_train
        loss, loss_states = self.head.loss(head_out, gt_meta, aux_preds=aux_head_out)
      File "/home/mosminin/dev/nanodet/nanodet/model/head/nanodet_plus_head.py", line 198, in loss
        batch_assign_res = multi_apply(
      File "/home/mosminin/dev/nanodet/nanodet/util/misc.py", line 24, in multi_apply
        return tuple(map(list, zip(*map_results)))
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
        return func(*args, **kwargs)
      File "/home/mosminin/dev/nanodet/nanodet/model/head/nanodet_plus_head.py", line 314, in target_assign_single_img
        assign_result = self.assigner.assign(
      File "/home/mosminin/dev/nanodet/nanodet/model/head/assigner/dsl_assigner.py", line 86, in assign
        F.one_hot(gt_labels.to(torch.int64), pred_scores.shape[-1])
    RuntimeError: Class values must be smaller than num_classes.
    
    

    What am I doing wrong?

    opened by Octopusmode 0
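
The traceback above ends in `F.one_hot(gt_labels.to(torch.int64), pred_scores.shape[-1])`. That call raises this error whenever a ground-truth label is greater than or equal to the number of classes the head predicts. The following pure-Python sketch shows a sanity check that could be run over the annotations before training. `check_labels` and the example values are hypothetical, and it assumes labels are zero-based class indices:

```python
def check_labels(gt_labels, num_classes):
    """Return the labels that one-hot encoding with num_classes would reject."""
    # Valid class indices are 0 .. num_classes - 1.
    return sorted({lbl for lbl in gt_labels if lbl < 0 or lbl >= num_classes})


# Hypothetical example: a config with 3 classes but a label of 3 in the data.
bad = check_labels([0, 2, 1, 3, 3], num_classes=3)
print(bad)  # [3]
```

If the check reports out-of-range labels, likely causes are a `num_classes` value that does not match the length of `class_names`, or annotation category ids that are not mapped to a contiguous zero-based range.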
  • Adapting the code to output a center x, y instead of bounding boxes (x1, y1, x2, y2)

    Hey, I'm not too familiar with machine learning and the like, and I'm not exactly ready to spend the next 2 months (yet) learning how TensorFlow works and such, so I'm hoping someone can assist me with this.

    So far, my experience with nanodet has been great, but manually annotating images takes a lot of time that I don't have. Since I don't really need the bounding box information anyway, I'm looking for a way to provide only the center of each object rather than the top-left and bottom-right corners.

    Help would be highly appreciated 😄

    opened by icecreamnotallowed 0
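
One option that leaves the model untouched, sketched here as an assumption rather than a NanoDet feature, is to keep the box outputs and convert each predicted (x1, y1, x2, y2) box to its center in post-processing. `box_to_center` is a hypothetical helper:

```python
def box_to_center(box):
    """Convert an (x1, y1, x2, y2) box to its (cx, cy) center."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


# Hypothetical detections in pixel coordinates.
boxes = [(10, 20, 50, 60), (0, 0, 8, 4)]
centers = [box_to_center(b) for b in boxes]
print(centers)  # [(30.0, 40.0), (4.0, 2.0)]
```

This only changes what is reported at inference time; it does not remove the need for box annotations during training.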
  • The ONNX model (exported by export_onnx.py) produces output that differs from the PyTorch model

    import cv2
    import numpy as np
    import onnxruntime as rt


    def image_preprocess(img_path):
        # Load the image (BGR order from OpenCV) and scale to [0, 1].
        img = cv2.imread(img_path).astype("float32") / 255
        # mean = [103.53, 116.28, 123.675]  # ImageNet values
        # std = [57.375, 57.12, 58.395]
        mean = [113.533554, 118.14172, 123.63607]
        std = [21.405144, 21.405144, 21.405144]
        mean = np.array(mean, dtype=np.float32).reshape(1, 1, 3) / 255
        std = np.array(std, dtype=np.float32).reshape(1, 1, 3) / 255
        img = (img - mean) / std
        img = np.transpose(img, (2, 0, 1))  # HWC -> CHW
        img = np.expand_dims(img, axis=0)  # add the batch dimension
        return img


    def test_onnx_model(onnx_model, img_path=None):
        if img_path is None:
            img_path = "path for img"
        imgdata = image_preprocess(img_path)
        sess = rt.InferenceSession(onnx_model)
        input_name = sess.get_inputs()[0].name
        output_detect_name = sess.get_outputs()[0].name
        # Run only the first output of the exported model.
        pred_onnx0 = sess.run([output_detect_name], {input_name: imgdata})
        print("outputs:")
        print(np.array(pred_onnx0))

    opened by Genlk 0
  • Fixes a couple of issues to add fp16 training support

    There were a couple of issues when trying to use fp16 training. One was that fp16 precision was not exposed through the configuration system. The other was that the DynamicSoftLabelAssigner used binary_cross_entropy instead of binary_cross_entropy_with_logits. This PR changes where sigmoid is called on the predictions so that the more stable binary_cross_entropy_with_logits can be used, and so that the Trainer can be configured to use fp16 precision.

    opened by crisp-snakey 0
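
To illustrate why the logits form is considered more stable, here is a pure-Python sketch comparing the two formulations. It is not the code from this PR: the function names are hypothetical, and it uses 64-bit Python floats, whereas fp16 saturates at much smaller logits. The stable form uses the standard identity max(x, 0) - x*t + log(1 + exp(-|x|)):

```python
import math


def sigmoid(x):
    """Sigmoid that avoids overflow in exp for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def bce_from_probability(logit, target):
    """Naive form: apply sigmoid first, then take logs of the probability."""
    p = sigmoid(logit)
    loss = 0.0
    # Skip zero-weight terms so 0 * log(0) never appears.
    if target > 0.0:
        loss -= target * (math.log(p) if p > 0.0 else -math.inf)
    if target < 1.0:
        loss -= (1.0 - target) * (math.log(1.0 - p) if p < 1.0 else -math.inf)
    return loss


def bce_with_logits(logit, target):
    """Stable form: max(x, 0) - x * t + log(1 + exp(-|x|))."""
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))


# The two forms agree for moderate logits.
print(bce_from_probability(0.5, 1.0), bce_with_logits(0.5, 1.0))
# For a confident wrong prediction the sigmoid rounds to exactly 1.0,
# so the naive form hits log(0) while the logits form stays finite.
print(bce_from_probability(40.0, 0.0))  # inf
print(bce_with_logits(40.0, 0.0))       # 40.0
```

Library implementations may clamp the log term instead of returning infinity, but the loss of precision once the sigmoid saturates is the same.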
Releases(v1.0.0-alpha-1)
  • v1.0.0-alpha-1(Dec 26, 2021)

    NanoDet-Plus v1.0.0-alpha

    In NanoDet-Plus, we propose a novel label assignment strategy with a simple assign guidance module (AGM) and a dynamic soft label assigner (DSLA) to solve the optimal label assignment problem in lightweight model training. We also introduce a light feature pyramid called Ghost-PAN to enhance multi-layer feature fusion. These improvements boost previous NanoDet's detection accuracy by 7 mAP on COCO dataset.

    | Model | Resolution | mAP val 0.5:0.95 | CPU Latency (i7-8700) | ARM Latency (4xA76) | FLOPS | Params | Model Size |
    |:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
    | NanoDet-m | 320*320 | 20.6 | 4.98ms | 10.23ms | 0.72G | 0.95M | 1.8MB(FP16) / 980KB(INT8) |
    | NanoDet-Plus-m | 320*320 | 27.0 | 5.25ms | 11.97ms | 0.9G | 1.17M | 2.3MB(FP16) / 1.2MB(INT8) |
    | NanoDet-Plus-m | 416*416 | 30.4 | 8.32ms | 19.77ms | 1.52G | 1.17M | 2.3MB(FP16) / 1.2MB(INT8) |
    | NanoDet-Plus-m-1.5x | 320*320 | 29.9 | 7.21ms | 15.90ms | 1.75G | 2.44M | 4.7MB(FP16) / 2.3MB(INT8) |
    | NanoDet-Plus-m-1.5x | 416*416 | 34.1 | 11.50ms | 25.49ms | 2.97G | 2.44M | 4.7MB(FP16) / 2.3MB(INT8) |
    | YOLOv3-Tiny | 416*416 | 16.6 | - | 37.6ms | 5.62G | 8.86M | 33.7MB |
    | YOLOv4-Tiny | 416*416 | 21.7 | - | 32.81ms | 6.96G | 6.06M | 23.0MB |
    | YOLOX-Nano | 416*416 | 25.8 | - | 23.08ms | 1.08G | 0.91M | 1.8MB(FP16) |
    | YOLOv5-n | 640*640 | 28.4 | - | 44.39ms | 4.5G | 1.9M | 3.8MB(FP16) |
    | FBNetV5 | 320*640 | 30.4 | - | - | 1.8G | - | - |
    | MobileDet | 320*320 | 25.6 | - | - | 0.9G | - | - |

    Model checkpoints and weights

    Download in the release files.

    Source code(tar.gz)
    Source code(zip)
    nanodet-plus-m-1.5x_320.onnx(9.43 MB)
    nanodet-plus-m-1.5x_320_checkpoint.ckpt(61.63 MB)
    nanodet-plus-m-1.5x_416.onnx(9.43 MB)
    nanodet-plus-m-1.5x_416_checkpoint.ckpt(61.63 MB)
    nanodet-plus-m-1.5x_416_ncnn.zip(4.40 MB)
    nanodet-plus-m-1.5x_416_openvino.zip(4.39 MB)
    nanodet-plus-m_320.onnx(4.57 MB)
    nanodet-plus-m_320_checkpoint.ckpt(33.82 MB)
    nanodet-plus-m_416.onnx(4.57 MB)
    nanodet-plus-m_416_checkpoint.ckpt(33.82 MB)
    nanodet-plus-m_416_mnn.mnn(4.59 MB)
    nanodet-plus-m_416_ncnn.zip(2.11 MB)
    nanodet-plus-m_416_openvino.zip(2.11 MB)
  • v0.4.2(Aug 22, 2021)

    v0.4.2

    Fix some compatibility issues in NanoDet v0.4:

    1. Fix pytorch-lightning compatibility (#304, #309).
    2. Fix pytorch1.9 compatibility (#308).
    3. Support not raising an error when evaluating with empty results (#310).

    I'm doing a lot of refactoring. NanoDet v1.x is coming soon.

    Download pretrained models

    | Model | Backbone | Resolution | COCO mAP | FLOPS | Params | Pre-train weight | ncnn model | ncnn-int8 |
    |:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
    | NanoDet-m | ShuffleNetV2 1.0x | 320*320 | 20.6 | 0.72B | 0.95M | Download | Download | Download |
    | NanoDet-m-416 | ShuffleNetV2 1.0x | 416*416 | 23.5 | 1.2B | 0.95M | Download | Download | Download |
    | NanoDet-m-1.5x | ShuffleNetV2 1.5x | 320*320 | 23.5 | 1.44B | 2.08M | Download | Download | Download |
    | NanoDet-m-1.5x-416 | ShuffleNetV2 1.5x | 416*416 | 26.8 | 2.42B | 2.08M | Download | Download | Download |
    | NanoDet-t | ShuffleNetV2 1.0x | 320*320 | 21.7 | 0.96B | 1.36M | Download | | |
    | NanoDet-g | Custom CSP Net | 416*416 | 22.9 | 4.2B | 3.81M | Download | | |
    | NanoDet-EfficientLite | EfficientNet-Lite0 | 320*320 | 24.7 | 1.72B | 3.11M | Download | | |
    | NanoDet-EfficientLite | EfficientNet-Lite1 | 416*416 | 30.3 | 4.06B | 4.01M | Download | | |
    | NanoDet-EfficientLite | EfficientNet-Lite2 | 512*512 | 32.6 | 7.12B | 4.71M | Download | | |
    | NanoDet-RepVGG | RepVGG-A0 | 416*416 | 27.8 | 11.3B | 6.75M | Download | | |

    Source code(tar.gz)
    Source code(zip)
  • v0.4.1(Jul 17, 2021)

    v0.4.1

    This is a final release of NanoDet v0.x.

    I'm doing a lot of refactoring. NanoDet v1.x is coming soon.

    Download pretrained models

    | Model | Backbone | Resolution | COCO mAP | FLOPS | Params | Pre-train weight | ncnn model | ncnn-int8 |
    |:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
    | NanoDet-m | ShuffleNetV2 1.0x | 320*320 | 20.6 | 0.72B | 0.95M | Download | Download | Download |
    | NanoDet-m-416 | ShuffleNetV2 1.0x | 416*416 | 23.5 | 1.2B | 0.95M | Download | Download | Download |
    | NanoDet-m-1.5x | ShuffleNetV2 1.5x | 320*320 | 23.5 | 1.44B | 2.08M | Download | Download | Download |
    | NanoDet-m-1.5x-416 | ShuffleNetV2 1.5x | 416*416 | 26.8 | 2.42B | 2.08M | Download | Download | Download |
    | NanoDet-t | ShuffleNetV2 1.0x | 320*320 | 21.7 | 0.96B | 1.36M | Download | | |
    | NanoDet-g | Custom CSP Net | 416*416 | 22.9 | 4.2B | 3.81M | Download | | |
    | NanoDet-EfficientLite | EfficientNet-Lite0 | 320*320 | 24.7 | 1.72B | 3.11M | Download | | |
    | NanoDet-EfficientLite | EfficientNet-Lite1 | 416*416 | 30.3 | 4.06B | 4.01M | Download | | |
    | NanoDet-EfficientLite | EfficientNet-Lite2 | 512*512 | 32.6 | 7.12B | 4.71M | Download | | |
    | NanoDet-RepVGG | RepVGG-A0 | 416*416 | 27.8 | 11.3B | 6.75M | Download | | |

    Source code(tar.gz)
    Source code(zip)
  • v0.4.0(Jun 8, 2021)

    What's new in v0.4.0

    1. Fix a little bug in demo.py by BlainWu (#210)
    2. Add script to export TorchScript model by strawberrypie (#211)
    3. Use fixed output names when exporting ONNX (#218)
    4. Use scale_factor instead of fixed size in resize to support dynamic shape inference (#218)
    5. Ensure num_classes equals len(class_names) by ZHEQIUSHUI (#221)
    6. Fix a bug in mnn demo while using GPU device by AcherStyx (#234)
    7. Fix with_last_conv bug in shufflenet (#239)
    8. Support batch eval (#241)
    9. Add nanodet-m-1.5x models (#242)
    10. Update model benchmark (#246)
    11. Prevent lightning Trainer from disabling cudnn.benchmark (#249)
    12. Fix multi-GPU evaluation bug with pytorch-lightning (#254)

    Download pretrained models

    | Model | Backbone | Resolution | COCO mAP | FLOPS | Params | Pre-train weight |
    |:---:|:---:|:---:|:---:|:---:|:---:|:---:|
    | NanoDet-m | ShuffleNetV2 1.0x | 320*320 | 20.6 | 0.72B | 0.95M | Download |
    | NanoDet-m-416 | ShuffleNetV2 1.0x | 416*416 | 23.5 | 1.2B | 0.95M | Download |
    | NanoDet-m-1.5x | ShuffleNetV2 1.5x | 320*320 | 23.5 | 1.44B | 2.08M | Download |
    | NanoDet-m-1.5x-416 | ShuffleNetV2 1.5x | 416*416 | 26.8 | 2.42B | 2.08M | Download |
    | NanoDet-t | ShuffleNetV2 1.0x | 320*320 | 21.7 | 0.96B | 1.36M | Download |
    | NanoDet-g | Custom CSP Net | 416*416 | 22.9 | 4.2B | 3.81M | Download |
    | NanoDet-EfficientLite | EfficientNet-Lite0 | 320*320 | 24.7 | 1.72B | 3.11M | Download |
    | NanoDet-EfficientLite | EfficientNet-Lite1 | 416*416 | 30.3 | 4.06B | 4.01M | Download |
    | NanoDet-EfficientLite | EfficientNet-Lite2 | 512*512 | 32.6 | 7.12B | 4.71M | Download |
    | NanoDet-RepVGG | RepVGG-A0 | 416*416 | 27.8 | 11.3B | 6.75M | Download |

    Download ncnn models below

    Source code(tar.gz)
    Source code(zip)
    ncnn-nanodet-m-1.5x-416-int8.zip(1.82 MB)
    ncnn-nanodet-m-1.5x-416.zip(3.67 MB)
    ncnn-nanodet-m-1.5x-int8.zip(1.82 MB)
    ncnn-nanodet-m-1.5x.zip(3.66 MB)
    ncnn-nanodet-m-416-int8.zip(882.58 KB)
    ncnn-nanodet-m-416.zip(1.64 MB)
    ncnn-nanodet-m-int8.zip(888.76 KB)
    ncnn-nanodet-m.zip(1.64 MB)
  • v0.3.0(Apr 11, 2021)

    What's new in v0.3.0

    1. Refactor training and testing code with pytorch-lightning.
    2. Fix ONNX inference AxisError by zshn25 (#198).

    Download pretrained models

    | Model | Backbone | Resolution | COCO mAP | FLOPS | Params | Pre-train weight |
    |:---:|:---:|:---:|:---:|:---:|:---:|:---:|
    | NanoDet-m | ShuffleNetV2 1.0x | 320*320 | 20.6 | 0.72B | 0.95M | Download |
    | NanoDet-m-416 | ShuffleNetV2 1.0x | 416*416 | 23.5 | 1.2B | 0.95M | Download |
    | NanoDet-t (NEW) | ShuffleNetV2 1.0x | 320*320 | 21.7 | 0.96B | 1.36M | Download |
    | NanoDet-g | Custom CSP Net | 416*416 | 22.9 | 4.2B | 3.81M | Download |
    | NanoDet-EfficientLite | EfficientNet-Lite0 | 320*320 | 24.7 | 1.72B | 3.11M | Download |
    | NanoDet-EfficientLite | EfficientNet-Lite1 | 416*416 | 30.3 | 4.06B | 4.01M | Download |
    | NanoDet-EfficientLite | EfficientNet-Lite2 | 512*512 | 32.6 | 7.12B | 4.71M | Download |
    | NanoDet-RepVGG | RepVGG-A0 | 416*416 | 27.8 | 11.3B | 6.75M | Download |

    Source code(tar.gz)
    Source code(zip)
    nanodet_m_ncnn_model.zip(1.64 MB)
  • v0.2.0(Mar 29, 2021)

    What's new in v0.2.0

    1. Add pyncnn demo by caishanli (#167).
    2. Fix ncnn demo build failure without vulkan by nihui (#168).
    3. Add NanoDet-t with Transformer Attention Network (#183).
    4. Add Notebook demo by zhiqwang (#188).
    5. Add feature of saving demo inference result by wwdok (#191).
    6. Fix utf-8 decode bug (#184).
    7. Fix test bug.

    Download pretrained models

    | Model | Backbone | Resolution | COCO mAP | FLOPS | Params | Pre-train weight |
    |:---:|:---:|:---:|:---:|:---:|:---:|:---:|
    | NanoDet-m | ShuffleNetV2 1.0x | 320*320 | 20.6 | 0.72B | 0.95M | Download |
    | NanoDet-m-416 | ShuffleNetV2 1.0x | 416*416 | 23.5 | 1.2B | 0.95M | Download |
    | NanoDet-t (NEW) | ShuffleNetV2 1.0x | 320*320 | 21.7 | 0.96B | 1.36M | Download |
    | NanoDet-g | Custom CSP Net | 416*416 | 22.9 | 4.2B | 3.81M | Download |
    | NanoDet-EfficientLite | EfficientNet-Lite0 | 320*320 | 24.7 | 1.72B | 3.11M | Download |
    | NanoDet-EfficientLite | EfficientNet-Lite1 | 416*416 | 30.3 | 4.06B | 4.01M | Download |
    | NanoDet-EfficientLite | EfficientNet-Lite2 | 512*512 | 32.6 | 7.12B | 4.71M | Download |
    | NanoDet-RepVGG | RepVGG-A0 | 416*416 | 27.8 | 11.3B | 6.75M | Download |

    Source code(tar.gz)
    Source code(zip)
  • v0.1.0(Mar 7, 2021)

    What's new in v0.1.0

    1. Support MNN python and cpp inference (#83).
    2. Support OpenVINO inference.
    3. Support libtorch inference experimentally.
    4. Add NanoDet-g.
    5. Add EfficientNet-Lite and Rep-VGG backbone.
    6. Add Model Zoo and provide more pre-trained models.
    7. Refactor GFL head (#154).

    Download pretrained models

    | Model | Backbone | Resolution | COCO mAP | FLOPS | Params | Pre-train weight |
    |:---:|:---:|:---:|:---:|:---:|:---:|:---:|
    | NanoDet-m | ShuffleNetV2 1.0x | 320*320 | 20.6 | 0.72B | 0.95M | Download |
    | NanoDet-m-416 | ShuffleNetV2 1.0x | 416*416 | 23.5 | 1.2B | 0.95M | Download |
    | NanoDet-g | Custom CSP Net | 416*416 | 22.9 | 4.2B | 3.81M | Download |
    | NanoDet-EfficientLite | EfficientNet-Lite0 | 320*320 | 24.7 | 1.72B | 3.11M | Download |
    | NanoDet-EfficientLite | EfficientNet-Lite1 | 416*416 | 30.3 | 4.06B | 4.01M | Download |
    | NanoDet-EfficientLite | EfficientNet-Lite2 | 512*512 | 32.6 | 7.12B | 4.71M | Download |
    | NanoDet-RepVGG | RepVGG-A0 | 416*416 | 27.8 | 11.3B | 6.75M | Download |

    Source code(tar.gz)
    Source code(zip)
  • v0.0.1(Nov 22, 2020)

Owner
Away From Keyboard