This is the source code for deeplabv3-plus-pytorch, which can be used to train your own models.

Last update: Dec 28, 2022

Overview

DeepLabv3+: a PyTorch implementation of the Encoder-Decoder with Atrous Separable Convolution semantic segmentation model

Performance

Training dataset	Weight file name	Test dataset	Input image size	mIOU
VOC12+SBD	deeplab_mobilenetv2.pth	VOC-Val12	512x512	72.59
VOC12+SBD	deeplab_xception.pth	VOC-Val12	512x512	76.95

Requirements

torch==1.2.0
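
One way to confirm the pinned version before training is to query the installed package metadata. This is a minimal sketch, not part of the repository; the helper name `installed_version` is hypothetical.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


REQUIRED = "1.2.0"  # version pinned in this README
found = installed_version("torch")
if found is None:
    print("torch is not installed")
elif found.split("+")[0] != REQUIRED:
    # Builds may carry a local suffix such as "+cu92"; compare the base version only.
    print(f"torch {found} found; this README pins torch=={REQUIRED}")
else:
    print(f"torch {found} matches the pinned version")
```

The check only reports a mismatch; whether a newer torch release also works is not stated in this README.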

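Notes

The deeplab_mobilenetv2.pth and deeplab_xception.pth weights used in the code were trained on the extended VOC dataset. Remember to set backbone accordingly for both training and prediction.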

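File downloads

The deeplab_mobilenetv2.pth and deeplab_xception.pth weights needed for training can be downloaded from Baidu Netdisk.
Link: https://pan.baidu.com/s/1KgLMbprQshlcpKgug9ECFg Extraction code: 4ir8

The extended VOC dataset is also available on Baidu Netdisk:
Link: https://pan.baidu.com/s/1BrR7AUM1XJvPWjKMIy2uEw Extraction code: vszf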

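Training steps

a. Training on the VOC dataset

1. Place the provided VOC dataset in VOCdevkit (there is no need to run voc_annotation.py).
2. Set the parameters in train.py. The defaults already match the VOC dataset, so only backbone and model_path need to be changed.
3. Run train.py to start training.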
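
Because backbone and model_path must agree, it can help to derive one from the other. The sketch below is illustrative only: `PRETRAINED` and `pick_weights` are hypothetical names, and the paths assume both weight files were placed in model_data.

```python
# Weight files named in this README, keyed by the backbone they were trained with.
# Assumption: both files have been downloaded into model_data/.
PRETRAINED = {
    "mobilenet": "model_data/deeplab_mobilenetv2.pth",
    "xception": "model_data/deeplab_xception.pth",
}


def pick_weights(backbone):
    """Return the weight path that matches `backbone` (hypothetical helper)."""
    if backbone not in PRETRAINED:
        raise ValueError(
            f"unknown backbone {backbone!r}; expected one of {sorted(PRETRAINED)}"
        )
    return PRETRAINED[backbone]


backbone = "xception"
model_path = pick_weights(backbone)
print(backbone, model_path)
```

The two resulting values are what would then be written into the backbone and model_path settings of train.py.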

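b. Training on your own dataset

1. Training uses the VOC format.
2. Before training, place the label files in SegmentationClass under VOCdevkit/VOC2007.
3. Before training, place the image files in JPEGImages under VOCdevkit/VOC2007.
4. Before training, run voc_annotation.py to generate the corresponding txt files.
5. In train.py, choose the backbone and the downsampling factor. The available backbones are mobilenet and xception, and the downsampling factor can be 8 or 16. Note that the pretrained weights must match the chosen backbone.
6. Set num_classes in train.py to the number of classes plus 1.
7. Run train.py to start training.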
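
The txt files from step 4 are lists of sample names for each split. The following sketch shows the general idea under stated assumptions: labels are .png files, the lists go to ImageSets/Segmentation (the usual VOC location), and the 90/10 ratio and the function name `write_split_lists` are hypothetical. It is not the repository's voc_annotation.py.

```python
import os
import random
import tempfile


def write_split_lists(voc_root, train_ratio=0.9, seed=0):
    """Split label file stems into train/val lists; returns (n_train, n_val)."""
    label_dir = os.path.join(voc_root, "SegmentationClass")
    out_dir = os.path.join(voc_root, "ImageSets", "Segmentation")  # assumed location
    os.makedirs(out_dir, exist_ok=True)

    # One entry per label image, without the file extension.
    stems = sorted(
        os.path.splitext(f)[0] for f in os.listdir(label_dir) if f.endswith(".png")
    )
    random.Random(seed).shuffle(stems)
    n_train = int(len(stems) * train_ratio)

    splits = {"train.txt": stems[:n_train], "val.txt": stems[n_train:]}
    for name, ids in splits.items():
        with open(os.path.join(out_dir, name), "w") as f:
            f.write("".join(i + "\n" for i in ids))
    return n_train, len(stems) - n_train


# Demo on a throwaway directory with ten empty placeholder label files.
with tempfile.TemporaryDirectory() as root:
    voc_root = os.path.join(root, "VOCdevkit", "VOC2007")
    os.makedirs(os.path.join(voc_root, "SegmentationClass"))
    for i in range(10):
        open(os.path.join(voc_root, "SegmentationClass", f"{i:03d}.png"), "w").close()
    counts = write_split_lists(voc_root)
    print(counts)
```

A fixed seed keeps the split reproducible, so repeated runs train and validate on the same samples.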

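Prediction steps

a. Using the pretrained weights

1. After downloading and unzipping the repository, run predict.py directly to predict with the mobilenet backbone. To predict with the xception backbone, download deeplab_xception.pth from Baidu Netdisk, place it in model_data, change backbone and model_path in deeplab.py, then run predict.py. When prompted, enter:

img/street.jpg

to run the prediction.
2. Settings in predict.py enable FPS testing, prediction over a whole folder, and video detection.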
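
An FPS test boils down to timing repeated inference calls on one image. The sketch below shows that idea with a stand-in function; `measure_fps` and `dummy_predict` are hypothetical names, and the real predict.py may measure it differently.

```python
import time


def measure_fps(predict_fn, image, n_runs=20):
    """Average frames per second of `predict_fn(image)` over `n_runs` calls."""
    predict_fn(image)  # warm-up call, excluded from the timing
    start = time.perf_counter()
    for _ in range(n_runs):
        predict_fn(image)
    elapsed = time.perf_counter() - start
    return n_runs / elapsed


def dummy_predict(image):
    """Stand-in for the model's inference call; it just sums the pixels."""
    return sum(sum(row) for row in image)


image = [[1] * 64 for _ in range(64)]
fps = measure_fps(dummy_predict, image)
print(f"{fps:.1f} FPS")
```

The warm-up call matters on a GPU, where the first inference typically includes one-off initialization cost.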

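b. Using your own trained weights

1. Train the model following the training steps.
2. In deeplab.py, edit model_path, num_classes, and backbone in the section below so that they match the trained file. model_path points to the weight file in the logs folder, num_classes is the number of classes to predict plus 1, and backbone is the backbone feature-extraction network used.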

_defaults = {
    #----------------------------------------#
    #   model_path points to the weight file
    #   in the logs folder
    #----------------------------------------#
    "model_path"        : 'model_data/deeplab_mobilenetv2.pth',
    #----------------------------------------#
    #   number of classes to distinguish + 1
    #----------------------------------------#
    "num_classes"       : 21,
    #----------------------------------------#
    #   backbone network to use
    #----------------------------------------#
    "backbone"          : "mobilenet",
    #----------------------------------------#
    #   input image size
    #----------------------------------------#
    "input_shape"       : [512, 512],
    #----------------------------------------#
    #   downsampling factor, usually 8 or 16
    #   use the same value as in training
    #----------------------------------------#
    "downsample_factor" : 16,
    #--------------------------------#
    #   blend controls whether the
    #   prediction is blended with
    #   the original image
    #--------------------------------#
    "blend"             : True,
    #-------------------------------#
    #   whether to use CUDA
    #   set to False if no GPU
    #   is available
    #-------------------------------#
    "cuda"              : True,
}

3. Run predict.py and, when prompted, enter:

img/street.jpg

to run the prediction.
4. Settings in predict.py enable FPS testing, prediction over a whole folder, and video detection.
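
The constraints on the settings in step 2 can be checked before any weights are loaded. This is a hedged sketch rather than code from deeplab.py: `build_config` is a hypothetical helper and logs/my_weights.pth is a placeholder file name.

```python
DEFAULTS = {
    "model_path": "model_data/deeplab_mobilenetv2.pth",
    "num_classes": 21,
    "backbone": "mobilenet",
    "input_shape": [512, 512],
    "downsample_factor": 16,
    "blend": True,
    "cuda": True,
}


def build_config(**overrides):
    """Merge overrides into DEFAULTS and check the values this README constrains."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise KeyError(f"unknown settings: {sorted(unknown)}")
    cfg = {**DEFAULTS, **overrides}
    if cfg["backbone"] not in ("mobilenet", "xception"):
        raise ValueError("backbone must be 'mobilenet' or 'xception'")
    if cfg["downsample_factor"] not in (8, 16):
        raise ValueError("downsample_factor must be 8 or 16")
    if cfg["num_classes"] < 2:
        raise ValueError("num_classes includes the background, so it must be at least 2")
    return cfg


# Example: a dataset with 2 object classes (so num_classes = 2 + 1), xception backbone.
# "logs/my_weights.pth" is a placeholder, not a file shipped with the repository.
cfg = build_config(model_path="logs/my_weights.pth", num_classes=3, backbone="xception")
print(cfg["backbone"], cfg["num_classes"], cfg["downsample_factor"])
```

Rejecting unknown keys catches typos such as a misspelled setting name, which would otherwise be silently ignored.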

Evaluation steps

1. Set num_classes in get_miou.py to the number of classes to predict plus 1.
2. Set name_classes in get_miou.py to the classes to be distinguished.
3. Run get_miou.py to obtain the mIoU.
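
For reference, mIoU is the mean over classes of TP / (TP + FP + FN), computed from a confusion matrix of ground-truth versus predicted pixels. The sketch below illustrates the metric on toy data in plain Python; it is not the code in get_miou.py, and the function names are hypothetical.

```python
def confusion_matrix(gt, pred, num_classes):
    """Count pixels by (ground-truth class, predicted class)."""
    hist = [[0] * num_classes for _ in range(num_classes)]
    for g, p in zip(gt, pred):
        if 0 <= g < num_classes:  # skip ignore labels such as 255
            hist[g][p] += 1
    return hist


def mean_iou(hist):
    """Mean of per-class IoU = TP / (TP + FP + FN), skipping absent classes."""
    ious = []
    for c in range(len(hist)):
        tp = hist[c][c]
        fn = sum(hist[c]) - tp
        fp = sum(row[c] for row in hist) - tp
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Flattened toy labels: class 0 is background and class 1 is the only object,
# so num_classes = 1 + 1 = 2.
gt = [0, 0, 1, 1, 1, 0]
pred = [0, 1, 1, 1, 0, 0]
hist = confusion_matrix(gt, pred, num_classes=2)
miou = mean_iou(hist)
print(hist, round(miou, 4))
```

The background counts as a class in the matrix, which is why num_classes is the number of object classes plus 1.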

Reference

https://github.com/ggyyzm/pytorch_segmentation
https://github.com/bonlime/keras-deeplab-v3-plus

Comments

Output coordinates

if self.mix_type == 0:
    seg_img = np.zeros((np.shape(pr)[0], np.shape(pr)[1], 3))
    for c in range(self.num_classes):
        seg_img[:, :, 0] += ((pr[:, :] == c ) * self.colors[c][0]).astype('uint8')
        seg_img[:, :, 1] += ((pr[:, :] == c ) * self.colors[c][1]).astype('uint8')
        seg_img[:, :, 2] += ((pr[:, :] == c ) * self.colors[c][2]).astype('uint8')

If I want to get the top-left and bottom-right coordinates of this cat, how should I print them?

opened by SSTato 3
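
One possible approach to the question above, sketched under assumptions: pr is a 2D map of per-pixel class indices, and the cat class has index 8 (its position in the standard 21-class VOC order). `class_bbox` is a hypothetical helper, not part of the repository.

```python
def class_bbox(pr, class_id):
    """Return ((x_min, y_min), (x_max, y_max)) of pixels equal to class_id, or None."""
    xs, ys = [], []
    for y, row in enumerate(pr):
        for x, value in enumerate(row):
            if value == class_id:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None  # the class does not appear in the prediction
    return (min(xs), min(ys)), (max(xs), max(ys))


CAT = 8  # assumed class index; check it against your own class list

# Toy 4x5 prediction map standing in for pr.
pr = [
    [0, 0, 0, 0, 0],
    [0, 8, 8, 0, 0],
    [0, 8, 8, 8, 0],
    [0, 0, 0, 0, 0],
]
bbox = class_bbox(pr, CAT)
print("top-left:", bbox[0], "bottom-right:", bbox[1])
```

The box covers every pixel of that class, so two separate cats in one image would share a single box; separating instances needs an extra step such as connected-component labeling.
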
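Performance gap between the TF/Keras and PyTorch versions

First of all, thank you for providing the code in several versions. I noticed that on the same VOC12 dataset, deeplabv3-plus-tf2/deeplabv3-plus-keras score significantly better on the test set than the deeplabv3-plus-pytorch version. Is this because the former train with dice_loss_with_CE while the latter trains with DiceLoss only? Or is there some other reason?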

opened by fyang93 3
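Problem when training on my own dataset

I switched to my own dataset for training, but I keep hitting this error: ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 256, 1, 1]). Everything I found on Baidu says the last batch probably has a leftover sample and that setting drop_last to true fixes it. But your code already sets it to true. How can I solve this? My setup is CUDA 10.1 and PyTorch 1.2.0.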

opened by Codingaworld 1
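Model output question

Hello, I have a question. After exporting to ONNX with the default settings, the model output is a three-channel RGB image. How can I change the output to 0/1 labels? For example, I segment only two classes (background and target) and want a binary image as output (background 0, target 1), just like the labels in the training dataset. After converting the model to ONNX, the output is a 3-channel float image. Looking closely, channels 1 and 2 both look a bit odd, while channel 3 is what I want (it needs a format conversion). How can I make the network output grayscale labels directly?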

opened by YuriGao 4

Releases(v3.0)

v3.0(Apr 22, 2022)
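Major updates

Support for step and cos learning-rate decay schedules.

Support for choosing between the adam and sgd optimizers.

Support for adjusting the learning rate automatically according to batch_size.

Support for selecting among prediction modes: single-image prediction, folder prediction, video prediction, and image cropping.

Updated summary.py, used to inspect the network structure.

Added multi-GPU training.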
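
To illustrate the learning-rate features listed above, here is a minimal sketch of a cosine schedule, a step schedule, and batch-size scaling. The formulas are common textbook forms and the function names, the halving factor, and the nominal batch size of 16 are assumptions; the repository's own implementation may differ (for example, by adding warm-up).

```python
import math


def cos_lr(base_lr, min_lr, step, total_steps):
    """Cosine decay from base_lr at step 0 to min_lr at the last step."""
    progress = step / max(1, total_steps - 1)
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * progress))


def step_lr(base_lr, step, step_size, gamma=0.5):
    """Multiply the rate by gamma once every step_size steps."""
    return base_lr * gamma ** (step // step_size)


def scale_lr(base_lr, batch_size, nominal_batch_size=16):
    """Scale the rate linearly with batch size (one common convention)."""
    return base_lr * batch_size / nominal_batch_size


base = scale_lr(0.01, batch_size=8)  # half the nominal batch, so half the rate
for epoch in (0, 50, 99):
    print(epoch, cos_lr(base, base * 0.01, epoch, 100), step_lr(base, epoch, 30))
```

Cosine decay lowers the rate smoothly, while the step schedule drops it abruptly at fixed intervals; which works better depends on the dataset and optimizer.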

Source code(tar.gz)
Source code(zip)
v2.0(Mar 4, 2022)
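Major updates

Updated train.py: added extensive comments and several adjustable parameters.

Updated predict.py: added extensive comments and features such as FPS testing, video prediction, and batch prediction.

Updated deeplab.py: added extensive comments and parameters such as prior-box selection, confidence, and non-maximum suppression.

Merged get_dr_txt.py, get_gt_txt.py, and get_map.py so that dataset evaluation runs from a single file.

Updated voc_annotation.py: added several adjustable parameters.

Updated summary.py, used to inspect the network structure.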

Source code(tar.gz)
Source code(zip)
v1.0(Sep 5, 2021)

Source code(tar.gz)
Source code(zip)
deeplab_mobilenetv2.pth(22.40 MB)
deeplab_xception.pth(209.61 MB)
mobilenet_v2.pth.tar(13.54 MB)
xception_pytorch_imagenet.pth(145.25 MB)

Owner

Bubbliiiing
