OpenMMLab Semantic Segmentation Toolbox and Benchmark.

Overview

Documentation: https://mmsegmentation.readthedocs.io/

Introduction

MMSegmentation is an open source semantic segmentation toolbox based on PyTorch. It is a part of the OpenMMLab project.

The master branch works with PyTorch 1.3+.

Major features

  • Unified Benchmark

    We provide a unified benchmark toolbox for various semantic segmentation methods.

  • Modular Design

    We decompose the semantic segmentation framework into different components and one can easily construct a customized semantic segmentation framework by combining different modules.

  • Support for multiple methods out of the box

    The toolbox directly supports popular and contemporary semantic segmentation frameworks, e.g. PSPNet, DeepLabV3, PSANet, DeepLabV3+, etc.

  • High efficiency

    The training speed is faster than or comparable to other codebases.
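
    As a rough illustration of the modular design described above, the sketch below assembles a segmentor config from interchangeable component configs. The component values are copied from a user config quoted further down this page; the helper make_model is hypothetical and only serves the illustration, it is not part of MMSegmentation.

```python
# Shared normalization-layer config reused by every component.
norm_cfg = dict(type='SyncBN', requires_grad=True)


def make_model(backbone, decode_head, auxiliary_head=None):
    """Hypothetical helper: combine component configs into one model config."""
    model = dict(type='EncoderDecoder', backbone=backbone, decode_head=decode_head)
    if auxiliary_head is not None:
        model['auxiliary_head'] = auxiliary_head
    return model


# Each component is an independent dict, so any of them can be swapped out.
resnet = dict(
    type='ResNetV1c',
    depth=101,
    num_stages=4,
    out_indices=(0, 1, 2, 3),
    norm_cfg=norm_cfg)
aspp_head = dict(
    type='DepthwiseSeparableASPPHead',
    in_channels=2048,
    in_index=3,
    channels=512,
    c1_in_channels=256,
    c1_channels=48,
    dilations=(1, 12, 24, 36),
    num_classes=21,
    norm_cfg=norm_cfg)
fcn_head = dict(
    type='FCNHead',
    in_channels=1024,
    in_index=2,
    channels=256,
    num_classes=21,
    norm_cfg=norm_cfg)

model = make_model(resnet, aspp_head, fcn_head)
print(model['backbone']['type'], model['decode_head']['type'])
```

    Swapping the backbone or a head then amounts to passing a different dict, provided the channel counts of the head match the outputs of the backbone.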

License

This project is released under the Apache 2.0 license.

Changelog

v0.12.0 was released on 04/03/2021. Please refer to changelog.md for details and release history.

Benchmark and model zoo

Results and models are available in the model zoo.

Supported backbones:

Supported methods:

Installation

Please refer to get_started.md for installation and dataset preparation.

Get Started

Please see train.md and inference.md for the basic usage of MMSegmentation. There are also tutorials for customizing dataset, designing data pipeline, customizing modules, and customizing runtime. We also provide many training tricks.

A Colab tutorial is also provided. You may preview the notebook here or directly run on Colab.

Citation

If you find this project useful in your research, please consider citing:

@misc{mmseg2020,
    title={{MMSegmentation}: OpenMMLab Semantic Segmentation Toolbox and Benchmark},
    author={MMSegmentation Contributors},
    howpublished = {\url{https://github.com/open-mmlab/mmsegmentation}},
    year={2020}
}

Contributing

We appreciate all contributions to improve MMSegmentation. Please refer to CONTRIBUTING.md for the contributing guideline.

Acknowledgement

MMSegmentation is an open source project that welcomes any contribution and feedback. We hope that the toolbox and benchmark can serve the growing research community by providing a flexible and standardized toolkit for reimplementing existing methods and developing new semantic segmentation methods.

Projects in OpenMMLab

  • MMCV: OpenMMLab foundational library for computer vision.
  • MMClassification: OpenMMLab image classification toolbox and benchmark.
  • MMDetection: OpenMMLab detection toolbox and benchmark.
  • MMDetection3D: OpenMMLab's next-generation platform for general 3D object detection.
  • MMSegmentation: OpenMMLab semantic segmentation toolbox and benchmark.
  • MMAction2: OpenMMLab's next-generation action understanding toolbox and benchmark.
  • MMTracking: OpenMMLab video perception toolbox and benchmark.
  • MMPose: OpenMMLab pose estimation toolbox and benchmark.
  • MMEditing: OpenMMLab image and video editing toolbox.
Comments
  • [Feature] Support Segmenter

    Motivation

    Support Segmenter. Copy of #952, not using the master branch.

    Modification

    I added configuration files to train segmenter on ADE20K and Cityscapes. I also added a script to convert the original ViT checkpoints in JAX to checkpoints compatible with the ViT class of mmsegmentation.

    To be done

    • use img_norm_cfg=[127.5, 127.5, 127.5] as default for ViT checkpoints
    • checkpoints: I reported the performance in the README for the ones I trained
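
To make the first to-do item concrete, here is a minimal sketch of what a normalization constant of 127.5 does to 8-bit pixel values. It assumes 127.5 is used as both the per-channel mean and the per-channel std, which the item above does not state explicitly; the function name normalize_pixel is hypothetical.

```python
def normalize_pixel(value, mean=127.5, std=127.5):
    """Map an 8-bit pixel value with (value - mean) / std.

    Assumption: mean and std are both 127.5, so [0, 255] maps to [-1, 1].
    """
    return (value - mean) / std


# Darkest, middle and brightest pixel values.
print([normalize_pixel(v) for v in (0, 127.5, 255)])  # [-1.0, 0.0, 1.0]
```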

    Update results (2022-01-06)

    | config | test_cfg | mIoU (SS) | Official Repo mIoU (SS/MS) | PaddleSeg results mIoU (SS) |
    | -- | -- | -- | -- | -- |
    | segmenter_vit-t_mask_8x1_512x512_160k_ade20k | mode='slide', crop_size=(512, 512), stride=(480, 480) | 39.99 | 38.1 / 38.8 | Not shown |
    | segmenter_vit-s_linear_8x1_512x512_160k_ade20k | mode='slide', crop_size=(512, 512), stride=(480, 480) | 45.75 | Not shown | 45.48 |
    | segmenter_vit-s_mask_8x1_512x512_160k_ade20k | mode='slide', crop_size=(512, 512), stride=(480, 480) | 46.19 | 45.3 / 46.9 | 45.15 |
    | segmenter_vit-b_mask_8x1_512x512_160k_ade20k | mode='slide', crop_size=(512, 512), stride=(480, 480) | 49.6 | 48.5 / 50.0 | 48.49 |
    | segmenter_vit-l_mask_8x1_512x512_160k_ade20k | mode='slide', crop_size=(640, 640), stride=(608, 608) | 52.16 | 51.8 / 53.6 | Not shown |
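
The results above were obtained with mode='slide', that is, sliding-window inference with a given crop_size and stride. The sketch below shows one common way to lay out such windows so that the last window in each direction is shifted back to stay inside the image. It is an illustration under that assumption, not a copy of the code used for these numbers, and the function name slide_windows is hypothetical.

```python
def slide_windows(img_h, img_w, crop_size=(512, 512), stride=(480, 480)):
    """Return (y1, x1, y2, x2) crop windows that cover an img_h x img_w image."""
    h_crop, w_crop = crop_size
    h_stride, w_stride = stride
    # Number of windows per axis: enough strides to reach the far border.
    h_grids = max(img_h - h_crop + h_stride - 1, 0) // h_stride + 1
    w_grids = max(img_w - w_crop + w_stride - 1, 0) // w_stride + 1
    windows = []
    for h_idx in range(h_grids):
        for w_idx in range(w_grids):
            # Clamp the far edge to the image, then shift the near edge back
            # so the window keeps the full crop size where possible.
            y2 = min(h_idx * h_stride + h_crop, img_h)
            x2 = min(w_idx * w_stride + w_crop, img_w)
            y1 = max(y2 - h_crop, 0)
            x1 = max(x2 - w_crop, 0)
            windows.append((y1, x1, y2, x2))
    return windows


# A 512 x 683 image needs two horizontally overlapping windows.
print(slide_windows(512, 683))  # [(0, 0, 512, 512), (0, 171, 512, 683)]
```

Predictions in overlapping regions are typically averaged, which is why a stride smaller than the crop size is used.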

    opened by rstrudel 45
  • [Feature] Add Cutout transform

    opened by lkm2835 41
  • TypeError: EncoderDecoder: SwinTransformer: __init__() got an unexpected keyword argument 'embed_dim'

    Checklist

    1. I have searched related issues but cannot get the expected help.
    2. The bug has not been fixed in the latest version.

    Describe the bug

    An error occurred when I ran the code, both in prediction and in training. I hope someone can help me with it. Thanks!

    Reproduction

    1. What command or script did you run?

      from mmseg.apis import init_segmentor, inference_segmentor, show_result_pyplot
      from mmseg.core.evaluation import get_palette
      config_file = './configs/swin/upernet_swin_base_patch4_window7_512x512_160k_ade20k.py'
      checkpoint_file = './checkpoints/upernet_swin_base_patch4_window7_512x512.pth'
      model = init_segmentor(config_file, checkpoint_file, device='cuda:0')
      

      error:

      TypeError                                 Traceback (most recent call last)
      /usr/local/lib/python3.7/dist-packages/mmcv/utils/registry.py in build_from_cfg(cfg, registry, default_args)
       50     try:
       ---> 51         return obj_cls(**args)
       52     except Exception as e:
       TypeError: __init__() got an unexpected keyword argument 'embed_dim'
       During handling of the above exception, another exception occurred
       TypeError                                 Traceback (most recent call last)
       11 frames
       TypeError: SwinTransformer: __init__() got an unexpected keyword argument 'embed_dim'
       During handling of the above exception, another exception occurred:
       TypeError                                 Traceback (most recent call last)
       /usr/local/lib/python3.7/dist-packages/mmcv/utils/registry.py in build_from_cfg(cfg, registry, default_args)
       52     except Exception as e:
       53         # Normal TypeError does not print class name.
       ---> 54         raise type(e)(f'{obj_cls.__name__}: {e}')
       55 
       56 
      TypeError: EncoderDecoder: SwinTransformer: __init__() got an unexpected keyword argument 'embed_dim'
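
The traceback shows build_from_cfg calling obj_cls(**args), so every key in the backbone config other than type must be a parameter of the installed class's __init__. A quick way to narrow down such a mismatch is to compare the config keys with the constructor signature. The sketch below does this with a hypothetical stand-in class; the parameter name embed_dims on the stand-in is an assumption for illustration, not a statement about any particular MMSegmentation release.

```python
import inspect


class InstalledBackbone:
    """Hypothetical stand-in for the backbone class that is actually installed."""

    def __init__(self, embed_dims=96, depths=(2, 2, 6, 2),
                 num_heads=(3, 6, 12, 24), window_size=7):
        self.embed_dims = embed_dims


def unexpected_keys(cls, cfg):
    """Return config keys (other than 'type') that cls.__init__ does not accept."""
    params = inspect.signature(cls.__init__).parameters
    # A **kwargs parameter would accept any keyword, so nothing is unexpected.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return []
    return sorted(k for k in cfg if k != 'type' and k not in params)


# A subset of the backbone config from the report above.
cfg = dict(
    type='SwinTransformer',
    embed_dim=128,
    depths=[2, 2, 18, 2],
    num_heads=[4, 8, 16, 32],
    window_size=7)

print(unexpected_keys(InstalledBackbone, cfg))  # ['embed_dim']
```

In practice the same check can be run against the class imported from the installed package, which shows whether the config and the installed code come from different versions or forks.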
      

    Environment

       ```shell
       fatal: not a git repository (or any of the parent directories): .git
       2021-08-03 08:12:00,889 - mmseg - INFO - Environment info:
       ------------------------------------------------------------
       sys.platform: linux
       Python: 3.7.11 (default, Jul  3 2021, 18:01:19) [GCC 7.5.0]
       CUDA available: True
       GPU 0: Tesla T4
       CUDA_HOME: /usr/local/cuda
       NVCC: Build cuda_11.0_bu.TC445_37.28845127_0
       GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
       PyTorch: 1.6.0+cu101
       PyTorch compiling details: PyTorch built with:
       - GCC 7.3
       - C++ Version: 201402
       - Intel(R) Math Kernel Library Version 2019.0.5 Product Build 20190808 for Intel(R) 64 architecture applications
       - Intel(R) MKL-DNN v1.5.0 (Git Hash e2ac1fac44c5078ca927cb9b90e1b3066a0b2ed0)
       - OpenMP 201511 (a.k.a. OpenMP 4.5)
       - NNPACK is enabled
       - CPU capability usage: AVX2
       - CUDA Runtime 10.1
       - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75
       - CuDNN 7.6.3
       - Magma 2.5.2
       - Build settings: BLAS=MKL, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DUSE_VULKAN_WRAPPER -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_STATIC_DISPATCH=OFF, 
       
       TorchVision: 0.7.0+cu101
       OpenCV: 4.1.2
       MMCV: 1.3.10
       MMCV Compiler: GCC 7.3
       MMCV CUDA Compiler: 10.1
       MMSegmentation: 0.15.0+
       ------------------------------------------------------------
       ```
    

    Error traceback

    • the traceback when I run python './tools/train.py' './configs/swin/upernet_swin_base_patch4_window7_512x512_160k_ade20k.py'.
    Traceback (most recent call last):
      File "/usr/local/lib/python3.7/dist-packages/mmcv/utils/registry.py", line 51, in build_from_cfg
        return obj_cls(**args)
    TypeError: __init__() got an unexpected keyword argument 'embed_dim'
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "/usr/local/lib/python3.7/dist-packages/mmcv/utils/registry.py", line 51, in build_from_cfg
        return obj_cls(**args)
      File "/usr/local/lib/python3.7/dist-packages/mmseg/models/segmentors/encoder_decoder.py", line 35, in __init__
        self.backbone = builder.build_backbone(backbone)
      File "/usr/local/lib/python3.7/dist-packages/mmseg/models/builder.py", line 17, in build_backbone
        return BACKBONES.build(cfg)
      File "/usr/local/lib/python3.7/dist-packages/mmcv/utils/registry.py", line 210, in build
        return self.build_func(*args, **kwargs, registry=self)
      File "/usr/local/lib/python3.7/dist-packages/mmcv/cnn/builder.py", line 26, in build_model_from_cfg
        return build_from_cfg(cfg, registry, default_args)
      File "/usr/local/lib/python3.7/dist-packages/mmcv/utils/registry.py", line 54, in build_from_cfg
        raise type(e)(f'{obj_cls.__name__}: {e}')
    TypeError: SwinTransformer: __init__() got an unexpected keyword argument 'embed_dim'
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "/content/drive/MyDrive/Colab Notebooks/Swin-Transformer-Semantic-Segmentation/tools/train.py", line 163, in <module>
        main()
      File "/content/drive/MyDrive/Colab Notebooks/Swin-Transformer-Semantic-Segmentation/tools/train.py", line 133, in main
        test_cfg=cfg.get('test_cfg'))
      File "/usr/local/lib/python3.7/dist-packages/mmseg/models/builder.py", line 46, in build_segmentor
        cfg, default_args=dict(train_cfg=train_cfg, test_cfg=test_cfg))
      File "/usr/local/lib/python3.7/dist-packages/mmcv/utils/registry.py", line 210, in build
        return self.build_func(*args, **kwargs, registry=self)
      File "/usr/local/lib/python3.7/dist-packages/mmcv/cnn/builder.py", line 26, in build_model_from_cfg
        return build_from_cfg(cfg, registry, default_args)
      File "/usr/local/lib/python3.7/dist-packages/mmcv/utils/registry.py", line 54, in build_from_cfg
        raise type(e)(f'{obj_cls.__name__}: {e}')
    TypeError: EncoderDecoder: SwinTransformer: __init__() got an unexpected keyword argument 'embed_dim'
    
    • the config
    2021-08-03 08:12:00,890 - mmseg - INFO - Distributed training: False
    2021-08-03 08:12:01,217 - mmseg - INFO - Config:
    norm_cfg = dict(type='SyncBN', requires_grad=True)
    model = dict(
        type='EncoderDecoder',
        pretrained=None,
        backbone=dict(
            type='SwinTransformer',
            embed_dim=128,
            depths=[2, 2, 18, 2],
            num_heads=[4, 8, 16, 32],
            window_size=7,
            mlp_ratio=4.0,
            qkv_bias=True,
            qk_scale=None,
            drop_rate=0.0,
            attn_drop_rate=0.0,
            drop_path_rate=0.3,
            ape=False,
            patch_norm=True,
            out_indices=(0, 1, 2, 3),
            use_checkpoint=False),
        decode_head=dict(
            type='UPerHead',
            in_channels=[128, 256, 512, 1024],
            in_index=[0, 1, 2, 3],
            pool_scales=(1, 2, 3, 6),
            channels=512,
            dropout_ratio=0.1,
            num_classes=150,
            norm_cfg=dict(type='SyncBN', requires_grad=True),
            align_corners=False,
            loss_decode=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
        auxiliary_head=dict(
            type='FCNHead',
            in_channels=512,
            in_index=2,
            channels=256,
            num_convs=1,
            concat_input=False,
            dropout_ratio=0.1,
            num_classes=150,
            norm_cfg=dict(type='SyncBN', requires_grad=True),
            align_corners=False,
            loss_decode=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=0.4)),
        train_cfg=dict(),
        test_cfg=dict(mode='whole'))
    dataset_type = 'ADE20KDataset'
    data_root = 'data/ade/ADEChallengeData2016'
    img_norm_cfg = dict(
        mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
    crop_size = (512, 512)
    train_pipeline = [
        dict(type='LoadImageFromFile'),
        dict(type='LoadAnnotations', reduce_zero_label=True),
        dict(type='Resize', img_scale=(2048, 512), ratio_range=(0.5, 2.0)),
        dict(type='RandomCrop', crop_size=(512, 512), cat_max_ratio=0.75),
        dict(type='RandomFlip', prob=0.5),
        dict(type='PhotoMetricDistortion'),
        dict(
            type='Normalize',
            mean=[123.675, 116.28, 103.53],
            std=[58.395, 57.12, 57.375],
            to_rgb=True),
        dict(type='Pad', size=(512, 512), pad_val=0, seg_pad_val=255),
        dict(type='DefaultFormatBundle'),
        dict(type='Collect', keys=['img', 'gt_semantic_seg'])
    ]
    test_pipeline = [
        dict(type='LoadImageFromFile'),
        dict(
            type='MultiScaleFlipAug',
            img_scale=(2048, 512),
            flip=False,
            transforms=[
                dict(type='Resize', keep_ratio=True),
                dict(type='RandomFlip'),
                dict(
                    type='Normalize',
                    mean=[123.675, 116.28, 103.53],
                    std=[58.395, 57.12, 57.375],
                    to_rgb=True),
                dict(type='ImageToTensor', keys=['img']),
                dict(type='Collect', keys=['img'])
            ])
    ]
    data = dict(
        samples_per_gpu=2,
        workers_per_gpu=4,
        train=dict(
            type='ADE20KDataset',
            data_root='data/ade/ADEChallengeData2016',
            img_dir='images/training',
            ann_dir='annotations/training',
            pipeline=[
                dict(type='LoadImageFromFile'),
                dict(type='LoadAnnotations', reduce_zero_label=True),
                dict(type='Resize', img_scale=(2048, 512), ratio_range=(0.5, 2.0)),
                dict(type='RandomCrop', crop_size=(512, 512), cat_max_ratio=0.75),
                dict(type='RandomFlip', prob=0.5),
                dict(type='PhotoMetricDistortion'),
                dict(
                    type='Normalize',
                    mean=[123.675, 116.28, 103.53],
                    std=[58.395, 57.12, 57.375],
                    to_rgb=True),
                dict(type='Pad', size=(512, 512), pad_val=0, seg_pad_val=255),
                dict(type='DefaultFormatBundle'),
                dict(type='Collect', keys=['img', 'gt_semantic_seg'])
            ]),
        val=dict(
            type='ADE20KDataset',
            data_root='data/ade/ADEChallengeData2016',
            img_dir='images/validation',
            ann_dir='annotations/validation',
            pipeline=[
                dict(type='LoadImageFromFile'),
                dict(
                    type='MultiScaleFlipAug',
                    img_scale=(2048, 512),
                    flip=False,
                    transforms=[
                        dict(type='Resize', keep_ratio=True),
                        dict(type='RandomFlip'),
                        dict(
                            type='Normalize',
                            mean=[123.675, 116.28, 103.53],
                            std=[58.395, 57.12, 57.375],
                            to_rgb=True),
                        dict(type='ImageToTensor', keys=['img']),
                        dict(type='Collect', keys=['img'])
                    ])
            ]),
        test=dict(
            type='ADE20KDataset',
            data_root='data/ade/ADEChallengeData2016',
            img_dir='images/validation',
            ann_dir='annotations/validation',
            pipeline=[
                dict(type='LoadImageFromFile'),
                dict(
                    type='MultiScaleFlipAug',
                    img_scale=(2048, 512),
                    flip=False,
                    transforms=[
                        dict(type='Resize', keep_ratio=True),
                        dict(type='RandomFlip'),
                        dict(
                            type='Normalize',
                            mean=[123.675, 116.28, 103.53],
                            std=[58.395, 57.12, 57.375],
                            to_rgb=True),
                        dict(type='ImageToTensor', keys=['img']),
                        dict(type='Collect', keys=['img'])
                    ])
            ]))
    log_config = dict(
        interval=50, hooks=[dict(type='TextLoggerHook', by_epoch=False)])
    dist_params = dict(backend='nccl')
    log_level = 'INFO'
    load_from = None
    resume_from = None
    workflow = [('train', 1)]
    cudnn_benchmark = True
    optimizer = dict(
        type='AdamW',
        lr=6e-05,
        betas=(0.9, 0.999),
        weight_decay=0.01,
        paramwise_cfg=dict(
            custom_keys=dict(
                absolute_pos_embed=dict(decay_mult=0.0),
                relative_position_bias_table=dict(decay_mult=0.0),
                norm=dict(decay_mult=0.0))))
    optimizer_config = dict()
    lr_config = dict(
        policy='poly',
        warmup='linear',
        warmup_iters=1500,
        warmup_ratio=1e-06,
        power=1.0,
        min_lr=0.0,
        by_epoch=False)
    runner = dict(type='IterBasedRunner', max_iters=160000)
    checkpoint_config = dict(by_epoch=False, interval=16000)
    evaluation = dict(interval=16000, metric='mIoU')
    work_dir = './work_dirs/upernet_swin_base_patch4_window7_512x512_160k_ade20k'
    gpu_ids = range(0, 1)
    
    opened by nocur 32
  •  [Feature] Support LoveDA dataset

    Old pr of LoveDA dataset is here: https://github.com/open-mmlab/mmsegmentation/pull/1006.

    Motivation

    Please describe the motivation of this PR and the goal you want to achieve through this PR.

    Modification

    Please briefly describe what modification is made in this PR.

    BC-breaking (Optional)

    Does the modification introduce changes that break the backward-compatibility of the downstream repos? If so, please describe how it breaks the compatibility and how the downstream projects should modify their code to keep compatibility with this PR.

    Use cases (Optional)

    If this PR introduces a new feature, it is better to list some use cases here, and update the documentation.

    Checklist

    1. Pre-commit or other linting tools are used to fix the potential lint issues.
    2. The modification is covered by complete unit tests. If not, please add more unit test to ensure the correctness.
    3. If the modification has potential influence on downstream projects, this PR should be tested with downstream projects, like MMDet or MMDet3D.
    4. The documentation has been modified accordingly, like docstring or example tutorials.
    opened by Junjue-Wang 25
  • How to make the unet?

    Hi, mmsegmentation is great and very useful. However, when I switched to UNet, a few things confused me, so I'd like to ask some questions. Thanks.

    1. With backbone type = 'Unet', what should the decode_head be? From the source code, the UNet backbone already seems to consist of an encoder and a decoder.
    2. If I want to change the UNet network, for example to use ResNet, is there a convenient way to do it? Please forgive my poor English. Thanks a lot.
    opened by lzcstar 24
  • [Feature] Add MultiImageMixDataset

    Modification

    • mmseg/datasets/builder.py: Add (if cfg['type'] == 'MultiImageMixDataset')

    • mmseg/datasets/dataset_wrappers.py: Add class MultiImageMixDataset

    • tests/test_data/test_dataset.py: Add unittests

    Use cases (Optional)

    train_pipeline = [
        dict(type='Mosaic'),
        dict(type='Resize', img_scale=(1024, 512), keep_ratio=True),
        dict(type='RandomFlip', prob=0.5),
        dict(type='Normalize', **img_norm_cfg),
        dict(type='DefaultFormatBundle'),
        dict(type='Collect', keys=['img', 'gt_semantic_seg']),
    ]
    
    train_dataset = dict(
        type='MultiImageMixDataset',
        dataset=dict(
            classes=classes,
            palette=palette,
            type=dataset_type,
            reduce_zero_label=False, 
            img_dir=data_root + "images/train",
            ann_dir=data_root + "annotations/train",
            pipeline=[
                dict(type='LoadImageFromFile'),
                dict(type='LoadAnnotations'),
            ]  
        ),
        pipeline=train_pipeline
    )
    

    use case in mmdet


    Original code: MultiImageMixDataset in mmdet Related: Issue#1045, Pull Request#1093
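
To illustrate why a dataset wrapper is needed for transforms such as Mosaic, the sketch below shows the general idea: a transform that mixes several images must ask the dataset for extra samples before it runs. This is a toy version under stated assumptions, not the code added by this PR. The class names ToyMixDataset and ToyMosaic are hypothetical, and the get_indexes / mix_results convention is assumed for illustration.

```python
import random


class ToyMixDataset:
    """Hypothetical wrapper that feeds extra samples to multi-image transforms."""

    def __init__(self, dataset, pipeline):
        self.dataset = dataset
        self.pipeline = pipeline

    def __len__(self):
        return len(self.dataset)

    def __getitem__(self, idx):
        results = dict(self.dataset[idx])
        for transform in self.pipeline:
            # Transforms that mix images expose get_indexes (assumed convention).
            if hasattr(transform, 'get_indexes'):
                indexes = transform.get_indexes(self.dataset)
                results['mix_results'] = [dict(self.dataset[i]) for i in indexes]
            results = transform(results)
            # Drop the temporary samples so later transforms see a clean dict.
            results.pop('mix_results', None)
        return results


class ToyMosaic:
    """Toy 'mosaic': collects the current image and three others in a list."""

    def get_indexes(self, dataset):
        return [random.randrange(len(dataset)) for _ in range(3)]

    def __call__(self, results):
        results['img'] = [results['img']] + [r['img'] for r in results['mix_results']]
        return results


base = [dict(img=f'img{i}') for i in range(5)]
mixed = ToyMixDataset(base, [ToyMosaic()])
sample = mixed[0]
print(len(sample['img']))  # 4
```

Single-image transforms without get_indexes pass through the same loop unchanged, which is why the inner dataset in the use case above only loads the image and its annotations.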

    opened by lkm2835 23
  • [Fix] Fix dist training infinite waiting issue

    Motivation

    If log_vars has a different length on different GPUs, the GPUs will wait indefinitely. This PR fixes that by raising an assertion error when GPUs have log_vars of different lengths.

    Related: #1034

    Modification

    Add a cross-GPU communication to determine whether the GPUs have the same log_var length. If not, raise an assertion error.
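
The check described above can be sketched without any GPUs by simulating the collective operation. In the sketch below, sum() stands in for a sum all-reduce across ranks, and each rank compares the reduced value with its own length times the world size. The function name check_log_var_lengths is hypothetical, and this is a simplified illustration, not the code in this PR.

```python
def check_log_var_lengths(lengths_per_rank):
    """Simulate the per-rank length check on a list of len(log_vars) values."""
    world_size = len(lengths_per_rank)
    # Stand-in for an all-reduce (sum) of the local lengths across all ranks.
    total = sum(lengths_per_rank)
    for rank, local_len in enumerate(lengths_per_rank):
        # If every rank has the same length, total == local_len * world_size.
        assert total == local_len * world_size, (
            f'rank {rank}: len(log_vars)={local_len}, but the lengths summed '
            f'over {world_size} ranks are {total}')


check_log_var_lengths([3, 3, 3, 3])  # passes silently

try:
    check_log_var_lengths([3, 3, 2, 3])
except AssertionError as err:
    print('mismatch detected:', err)
```

Failing fast with a message is preferable to the hang, because a later collective call with a different number of entries on each rank never completes.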

    BC-breaking (Optional)

    None.

    Use cases (Optional)

    None.

    Checklist

    • [x] Pre-commit or other linting tools are used to fix the potential lint issues.
    • [ ] The modification is covered by complete unit tests. If not, please add more unit test to ensure the correctness.
    • [ ] If the modification has potential influence on downstream projects, this PR should be tested with downstream projects, like MMDet or MMDet3D.
    • [ ] The documentation has been modified accordingly, like docstring or example tutorials.
    opened by fingertap 23
  • Add CRF module

    Motivation

    Since there is no post-processing step, I added a CRF to try to improve the model's accuracy.

    Modification

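    1. Added a CRF module.
    2. Added a pspcrf network model, which mainly applies the CRF to the output of PSPNet; it can also be regarded as a post-processing step.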

    BC-breaking (Optional)

    Does the modification introduce changes that break the backward-compatibility of the downstream repos? If so, please describe how it breaks the compatibility and how the downstream projects should modify their code to keep compatibility with this PR.

    Use cases (Optional)

    Checklist

    opened by 459737087 21
  • Distributed training hangs due to missing keys in `mmseg.segmentors.base.BaseSegmentor._parse_losses`

    When training on multiple GPUs, the code of my customized model gets stuck. When training on only one GPU, it works fine. Pressing Ctrl+C gives me the following stack trace:

    Traceback (most recent call last):
      File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
        "__main__", mod_spec)
      File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
        exec(code, run_globals)
      File "/usr/local/lib/python3.6/dist-packages/torch/distributed/launch.py", line 173, in <module>
        main()
      File "/usr/local/lib/python3.6/dist-packages/torch/distributed/launch.py", line 169, in main
        run(args)
      File "/usr/local/lib/python3.6/dist-packages/torch/distributed/run.py", line 624, in run
        )(*cmd_args)
      File "/usr/local/lib/python3.6/dist-packages/torch/distributed/launcher/api.py", line 116, in __call__
        return launch_agent(self._config, self._entrypoint, list(args))
      File "/usr/local/lib/python3.6/dist-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 348, in wrapper
        return f(*args, **kwargs)
      File "/usr/local/lib/python3.6/dist-packages/torch/distributed/launcher/api.py", line 238, in launch_agent
        result = agent.run()
      File "/usr/local/lib/python3.6/dist-packages/torch/distributed/elastic/metrics/api.py", line 125, in wrapper
        result = f(*args, **kwargs)
      File "/usr/local/lib/python3.6/dist-packages/torch/distributed/elastic/agent/server/api.py", line 700, in run
        result = self._invoke_run(role)
      File "/usr/local/lib/python3.6/dist-packages/torch/distributed/elastic/agent/server/api.py", line 828, in _invoke_run
        time.sleep(monitor_interval)
    KeyboardInterrupt
    

    I cannot find much useful information online. Any advice on how to debug further?

    Environment:

    ------------------------------------------------------------
    sys.platform: linux
    Python: 3.6.9 (default, Jan 26 2021, 15:33:00) [GCC 8.4.0]
    CUDA available: True
    GPU 0,1,2,3,4,5,6,7: GeForce GTX 1080 Ti
    CUDA_HOME: /usr/local/cuda
    NVCC: Cuda compilation tools, release 10.2, V10.2.89
    GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
    PyTorch: 1.9.0+cu102
    PyTorch compiling details: PyTorch built with:
      - GCC 7.3
      - C++ Version: 201402
      - Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
      - Intel(R) MKL-DNN v2.1.2 (Git Hash 98be7e8afa711dc9b66c8ff3504129cb82013cdb)
      - OpenMP 201511 (a.k.a. OpenMP 4.5)
      - NNPACK is enabled
      - CPU capability usage: AVX2
      - CUDA Runtime 10.2
      - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70
      - CuDNN 7.6.5
      - Magma 2.5.2
      - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=10.2, CUDNN_VERSION=7.6.5, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.9.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, 
    
    TorchVision: 0.10.0+cu102
    OpenCV: 4.5.3
    MMCV: 1.3.14
    MMCV Compiler: GCC 7.3
    MMCV CUDA Compiler: 10.2
    MMSegmentation: 0.18.0+ef68770
    ------------------------------------------------------------
    
    opened by fingertap 19
  • [TensorRT] Is batching possible?

    I set dynamic-export for pytorch2onnx and then I set --min-shape 10 3 224 224 --max-shape 10 3 224 224 for onnx2tensorrt. However, when I run a batch of 10 images through the tensorrt engine file generated, I get this error:

    [TensorRT] ERROR: Parameter check failed at: engine.cpp::enqueue::445, condition: batchSize > 0 && batchSize <= mEngine.getMaxBatchSize(). Note: Batch size was: 10, but engine max batch size was: 1
    
    
    opened by timothylimyl 19
  • [Feature] Support video demo

    Motivation

    Add a video demo that takes a local video file or a webcam as input and visualizes predictions either with imshow or by writing a local video file.

    Use cases (Optional)

    PS: Not sure if I should upload a demo video demo/demo.mp4.

    python demo/video_demo.py ${VIDEO_FILE} ${CONFIG_FILE} ${CHECKPOINT_FILE} [--device ${DEVICE_NAME}] [--palette ${PALETTE}] \
        [--show] [--show-wait-time ${SHOW_WAIT_TIME}] [--output-file ${OUTPUT_FILE}] [--output-fps ${OUTPUT_FPS}] \
        [--output-height ${OUTPUT_HEIGHT}] [--output-width ${OUTPUT_WIDTH}] [--opacity ${OPACITY}]
    

    Examples:

    python demo/video_demo.py demo/demo.mp4 configs/cgnet/cgnet_680x680_60k_cityscapes.py \
        checkpoints/cgnet_680x680_60k_cityscapes_20201101_110253-4c0b2f2d.pth \
        --device cuda:0 --palette cityscapes --show
    
    opened by irvingzhang0512 17
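  • How to display the evaluation metrics of a specific class in TensorBoard; how to save the model based on the metric of a specific class

    mmseg 1.x branch

    Question

    tensorboard

    As shown in the screenshot, TensorBoard only displays the averaged metrics. Is it possible to show the curves for individual classes? For example, if one of the classes is polyp, can the Dice curve for polyp be plotted?

    Model saving

    CheckpointHook can save the best model according to the save_best parameter. I am wondering whether the model can be saved based on the metric of a single class, for example the Dice value of the polyp class.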

    Backlog 1.x 
    opened by lolikonloli 2
  • Ask for help: I cannot reproduce the officially reported mIoU.

    I trained DeepLabV3+ on Pascal VOC 2012 with the default config deeplabv3plus_r101-d8_512x512_40k_voc12aug.py, but I only got mIoU: 0.6035 and I don't know why. I downloaded the checkpoint from https://github.com/open-mmlab/mmsegmentation/tree/master/configs/deeplabv3plus and tested it with the same config, and the result matched the reported one.

    Here is my config.

    norm_cfg = dict(type='SyncBN', requires_grad=True)
    model = dict(
        type='EncoderDecoder',
        pretrained='open-mmlab://resnet101_v1c',
        backbone=dict(
            type='ResNetV1c',
            depth=101,
            num_stages=4,
            out_indices=(0, 1, 2, 3),
            dilations=(1, 1, 2, 4),
            strides=(1, 2, 1, 1),
            norm_cfg=dict(type='SyncBN', requires_grad=True),
            norm_eval=False,
            style='pytorch',
            contract_dilation=True),
        decode_head=dict(
            type='DepthwiseSeparableASPPHead',
            in_channels=2048,
            in_index=3,
            channels=512,
            dilations=(1, 12, 24, 36),
            c1_in_channels=256,
            c1_channels=48,
            dropout_ratio=0.1,
            num_classes=21,
            norm_cfg=dict(type='SyncBN', requires_grad=True),
            align_corners=False,
            loss_decode=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
        auxiliary_head=dict(
            type='FCNHead',
            in_channels=1024,
            in_index=2,
            channels=256,
            num_convs=1,
            concat_input=False,
            dropout_ratio=0.1,
            num_classes=21,
            norm_cfg=dict(type='SyncBN', requires_grad=True),
            align_corners=False,
            loss_decode=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=0.4)),
        train_cfg=dict(),
        test_cfg=dict(mode='whole'))
    dataset_type = 'PascalVOCDataset'
    data_root = 'data/VOCdevkit/VOC2012'
    img_norm_cfg = dict(
        mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
    crop_size = (512, 512)
    train_pipeline = [
        dict(type='LoadImageFromFile'),
        dict(type='LoadAnnotations'),
        dict(type='Resize', img_scale=(2048, 512), ratio_range=(0.5, 2.0)),
        dict(type='RandomCrop', crop_size=(512, 512), cat_max_ratio=0.75),
        dict(type='RandomFlip', prob=0.5),
        dict(type='PhotoMetricDistortion'),
        dict(
            type='Normalize',
            mean=[123.675, 116.28, 103.53],
            std=[58.395, 57.12, 57.375],
            to_rgb=True),
        dict(type='Pad', size=(512, 512), pad_val=0, seg_pad_val=255),
        dict(type='DefaultFormatBundle'),
        dict(type='Collect', keys=['img', 'gt_semantic_seg'])
    ]
    test_pipeline = [
        dict(type='LoadImageFromFile'),
        dict(
            type='MultiScaleFlipAug',
            img_scale=(2048, 512),
            flip=False,
            transforms=[
                dict(type='Resize', keep_ratio=True),
                dict(type='RandomFlip'),
                dict(
                    type='Normalize',
                    mean=[123.675, 116.28, 103.53],
                    std=[58.395, 57.12, 57.375],
                    to_rgb=True),
                dict(type='ImageToTensor', keys=['img']),
                dict(type='Collect', keys=['img'])
            ])
    ]
    data = dict(
        samples_per_gpu=4,
        workers_per_gpu=4,
        train=dict(
            type='PascalVOCDataset',
            data_root='data/VOCdevkit/VOC2012',
            img_dir='JPEGImages',
            ann_dir=['SegmentationClass', 'SegmentationClassAug'],
            split=[
                'ImageSets/Segmentation/train.txt',
                'ImageSets/Segmentation/aug.txt'
            ],
            pipeline=[
                dict(type='LoadImageFromFile'),
                dict(type='LoadAnnotations'),
                dict(type='Resize', img_scale=(2048, 512), ratio_range=(0.5, 2.0)),
                dict(type='RandomCrop', crop_size=(512, 512), cat_max_ratio=0.75),
                dict(type='RandomFlip', prob=0.5),
                dict(type='PhotoMetricDistortion'),
                dict(
                    type='Normalize',
                    mean=[123.675, 116.28, 103.53],
                    std=[58.395, 57.12, 57.375],
                    to_rgb=True),
                dict(type='Pad', size=(512, 512), pad_val=0, seg_pad_val=255),
                dict(type='DefaultFormatBundle'),
                dict(type='Collect', keys=['img', 'gt_semantic_seg'])
            ]),
        val=dict(
            type='PascalVOCDataset',
            data_root='data/VOCdevkit/VOC2012',
            img_dir='JPEGImages',
            ann_dir='SegmentationClass',
            split='ImageSets/Segmentation/val.txt',
            pipeline=[
                dict(type='LoadImageFromFile'),
                dict(
                    type='MultiScaleFlipAug',
                    img_scale=(2048, 512),
                    flip=False,
                    transforms=[
                        dict(type='Resize', keep_ratio=True),
                        dict(type='RandomFlip'),
                        dict(
                            type='Normalize',
                            mean=[123.675, 116.28, 103.53],
                            std=[58.395, 57.12, 57.375],
                            to_rgb=True),
                        dict(type='ImageToTensor', keys=['img']),
                        dict(type='Collect', keys=['img'])
                    ])
            ]),
        test=dict(
            type='PascalVOCDataset',
            data_root='data/VOCdevkit/VOC2012',
            img_dir='JPEGImages',
            ann_dir='SegmentationClass',
            split='ImageSets/Segmentation/val.txt',
            pipeline=[
                dict(type='LoadImageFromFile'),
                dict(
                    type='MultiScaleFlipAug',
                    img_scale=(2048, 512),
                    flip=False,
                    transforms=[
                        dict(type='Resize', keep_ratio=True),
                        dict(type='RandomFlip'),
                        dict(
                            type='Normalize',
                            mean=[123.675, 116.28, 103.53],
                            std=[58.395, 57.12, 57.375],
                            to_rgb=True),
                        dict(type='ImageToTensor', keys=['img']),
                        dict(type='Collect', keys=['img'])
                    ])
            ]))
    log_config = dict(
        interval=50, hooks=[dict(type='TextLoggerHook', by_epoch=False)])
    dist_params = dict(backend='nccl')
    log_level = 'INFO'
    load_from = None
    resume_from = None
    workflow = [('train', 1)]
    cudnn_benchmark = True
    optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0005)
    optimizer_config = dict()
    lr_config = dict(policy='poly', power=0.9, min_lr=0.0001, by_epoch=False)
    runner = dict(type='IterBasedRunner', max_iters=40000)
    checkpoint_config = dict(by_epoch=False, interval=4000)
    evaluation = dict(interval=4000, metric='mIoU', pre_eval=True)
    work_dir = 'mzmseg/deeplabv3p_40k_vocaug'
    gpu_ids = [0]
    auto_resume = False

    And here is the log. 20230106_002856.log

    Help wanted.
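For readers following the config, the `lr_config` entry (`policy='poly'`, `power=0.9`, `min_lr=0.0001`) together with `lr=0.01` and `max_iters=40000` describes a polynomial learning-rate decay. The sketch below assumes the commonly used form of that schedule; the helper name `poly_lr` is hypothetical, and the training framework's own implementation may differ in details.

```python
def poly_lr(base_lr, cur_iter, max_iters, power=0.9, min_lr=1e-4):
    """Polynomial decay from base_lr towards min_lr over max_iters iterations.

    Assumed form: (base_lr - min_lr) * (1 - t / T) ** power + min_lr.
    """
    coeff = (1 - cur_iter / max_iters) ** power
    return (base_lr - min_lr) * coeff + min_lr


# Values taken from the config above: lr=0.01, max_iters=40000.
schedule = [poly_lr(0.01, it, 40000) for it in (0, 10000, 20000, 30000, 40000)]
for it, lr in zip((0, 10000, 20000, 30000, 40000), schedule):
    print(f"iter {it:>5}: lr = {lr:.6f}")
```

Under this assumed form the rate starts at the base value and ends at `min_lr` on the final iteration.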

    awaiting response Usage 
    opened by JimmyMa99 6
  • Getting 0 metric scores when inferencing on test data

    I have trained the transformer models for a custom dataset with 2 classes. But when I try testing this model on the test dataset with the same classes, it predicts almost all pixels as background. What can I do to get better results?
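To make the symptom concrete, here is a small dependency-free sketch of per-class IoU on made-up labels (the data and the helper name `per_class_iou` are hypothetical, not taken from the report). It shows how a model that predicts only background can still reach a reasonable pixel accuracy while the foreground IoU is exactly 0. One thing that may be worth checking in such a case, as an assumption rather than a diagnosis, is whether the annotation values really are the class indices 0 and 1.

```python
def per_class_iou(pred, target, num_classes):
    """Return the IoU of each class for two flat label sequences.

    A class that appears in neither sequence gets None instead of a score.
    """
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        ious.append(inter / union if union else None)
    return ious


# Toy ground truth: 6 background pixels (0) and 4 foreground pixels (1).
target = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
# A model that predicts background everywhere.
pred = [0] * len(target)

ious = per_class_iou(pred, target, num_classes=2)
accuracy = sum(p == t for p, t in zip(pred, target)) / len(target)
print("IoU per class:", ious)
print("pixel accuracy:", accuracy)
```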

    opened by ajinkya-ch 3
  • Incorrect documentation for tutorial in Random mosaic Please update

    I want to report incorrect documentation in the Random mosaic tutorial.

    The official documentation for Random mosaic is here: https://mmsegmentation.readthedocs.io/en/latest/tutorials/customize_datasets.html#multi-image-mix-dataset

    The document provides the following example of random mosaic usage:

    train_pipeline = [
        dict(type='LoadImageFromFile'),
        dict(type='LoadAnnotations'),
        dict(type='RandomMosaic', prob=1),
        dict(type='Resize', img_scale=(1024, 512), keep_ratio=True),
        dict(type='RandomFlip', prob=0.5),
        dict(type='Normalize', **img_norm_cfg),
        dict(type='DefaultFormatBundle'),
        dict(type='Collect', keys=['img', 'gt_semantic_seg']),
    ]

    train_dataset = dict(
        type='MultiImageMixDataset',
        dataset=dict(
            classes=classes,
            palette=palette,
            type=dataset_type,
            reduce_zero_label=False,
            img_dir=data_root + "images/train",
            ann_dir=data_root + "annotations/train",
            pipeline=[
                dict(type='LoadImageFromFile'),
                dict(type='LoadAnnotations'),
            ]
        ),
        pipeline=train_pipeline
    )

    However, this usage is not correct, because it calls dict(type='LoadImageFromFile'), dict(type='LoadAnnotations') twice, which completely messes up the augmentation: the returned mask contains an extra augmented copy in its top-left cell.

    To correct this, removing either one of the two dict(type='LoadImageFromFile'), dict(type='LoadAnnotations') calls should be enough.
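One way to read the suggested fix is sketched below, under the assumption that loading stays in the wrapped dataset's pipeline (so every image drawn for the mosaic is loaded there) and is dropped from the outer `train_pipeline`. The values of `img_norm_cfg`, `classes`, `palette`, `dataset_type` and `data_root` are hypothetical placeholders added only so the fragment runs on its own, and the result has not been checked against a specific MMSegmentation release.

```python
# Hypothetical placeholder values; replace them with those of the real dataset.
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
classes = ('background', 'foreground')
palette = [[0, 0, 0], [255, 255, 255]]
dataset_type = 'CustomDataset'
data_root = 'data/my_dataset/'

# Outer pipeline: starts at RandomMosaic, with no loading steps.
train_pipeline = [
    dict(type='RandomMosaic', prob=1),
    dict(type='Resize', img_scale=(1024, 512), keep_ratio=True),
    dict(type='RandomFlip', prob=0.5),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_semantic_seg']),
]

train_dataset = dict(
    type='MultiImageMixDataset',
    dataset=dict(
        classes=classes,
        palette=palette,
        type=dataset_type,
        reduce_zero_label=False,
        img_dir=data_root + "images/train",
        ann_dir=data_root + "annotations/train",
        # Loading happens exactly once, inside the wrapped dataset.
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(type='LoadAnnotations'),
        ]
    ),
    pipeline=train_pipeline
)

outer_types = [step['type'] for step in train_pipeline]
inner_types = [step['type'] for step in train_dataset['dataset']['pipeline']]
print("outer:", outer_types)
print("inner:", inner_types)
```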

    Similar errors show up in issue #1207 https://github.com/open-mmlab/mmsegmentation/issues/1207

    @MengzhangLI Can you guys investigate?

    opened by Cli98 0
  • Freeze Backbone while training, it executes self._freeze_stages() function on each training iteration

    In the class definition of each backbone file (e.g. swin.py, resnet.py, ...), there is a train function, defined as below:

    # resnet.py
    def train(self, mode=True):
        """Convert the model into training mode while keep normalization layer
        freezed."""
        super(ResNet, self).train(mode)
        self._freeze_stages()
        if mode and self.norm_eval:
            for m in self.modules():
                # trick: eval have effect on BatchNorm only
                if isinstance(m, _BatchNorm):
                    m.eval()

    # swin.py
    def train(self, mode=True):
        """Convert the model into training mode while keep layers freezed."""
        super(SwinTransformer, self).train(mode)
        self._freeze_stages()

    This function is executed at every training iteration. If the backbone is not frozen (self.frozen_stages < 0), it has no effect. However, if the backbone is frozen (self.frozen_stages >= 0), the freeze operation is performed every iteration, which is time-consuming. In my opinion, the freeze operation should only be performed once, after model initialization. I don't know whether there are other considerations behind this design, or whether it is just a bug.
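The call pattern described above can be reproduced with a dependency-free toy class. `ToyBackbone` and `run` are hypothetical stand-ins, not MMSegmentation code. The sketch only counts how often the freeze logic runs when a training loop switches the model to train mode before every iteration.

```python
class ToyBackbone:
    """Dependency-free stand-in for a backbone (hypothetical, for illustration)."""

    def __init__(self, frozen_stages=-1):
        self.frozen_stages = frozen_stages
        self.training = True
        self.freeze_calls = 0  # how many times the freeze logic actually ran

    def _freeze_stages(self):
        # Real backbones would put the frozen layers in eval mode and disable
        # their gradients here; the toy version only counts the work.
        if self.frozen_stages >= 0:
            self.freeze_calls += 1

    def train(self, mode=True):
        self.training = mode
        self._freeze_stages()
        return self


def run(backbone, num_iters):
    """Mimic a loop that calls train() before every iteration."""
    for _ in range(num_iters):
        backbone.train()
    return backbone.freeze_calls


print(run(ToyBackbone(frozen_stages=1), 100))   # 100
print(run(ToyBackbone(frozen_stages=-1), 100))  # 0
```

One possible reason for re-applying the freeze inside `train()` is that switching a PyTorch module to training mode also switches all of its submodules, so layers that were put in eval mode earlier would otherwise go back to training mode. Whether that is the maintainers' motivation is not stated in the issue.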

    opened by whiteinblue 0
Releases(v1.0.0rc3)
  • v1.0.0rc3(Dec 31, 2022)

    What's new

    Highlights

    • Support test time augmentation (#2184)
    • Add 'Projects/' folder and the first example project (#2412)

    Features

    • Add Biomedical 3D array random crop transform (#2378)

    Documentation

    • Add Chinese version of config tutorial (#2371)
    • Add Chinese version of train & test tutorial (#2355)
    • Add Chinese version of overview (#2397)
    • Add Chinese version of get_started (#2417)
    • Add datasets in Chinese (#2387)
    • Add dataflow document (#2403)
    • Add pspnet model structure graph (#2437)
    • Update some content of engine Chinese documentation (#2341)
    • Update TTA to migration documentation (#2335)

    Bug fix

    • Remove dependency mmdet when do not use MaskFormerHead and MMDET_Mask2FormerHead (#2448)

    Enhancement

    • Add torch1.13 checking in CI (#2402)
    • Fix pytorch version for merge stage test (#2449)

    New Contributors

    • @nijkah made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/2024
    • @matrixgame2018 made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/2148
    • @kitecats made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/2259
    • @nulam made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/2382
    Source code(tar.gz)
    Source code(zip)
  • v1.0.0rc2(Dec 6, 2022)

    What's new

    Highlights

    • Support MaskFormer (#2215)
    • Support Mask2Former (#2255)

    Features

    • Add ResizeShortestEdge transform (#2339)
    • Support padding in data pre-processor for model testing (#2290)
    • Fix the problem of post-processing not removing padding (#2367)

    Bug fix

    • Fix links in README (#2024)
    • Fix swin load state_dict (#2304)
    • Fix typo of BaseSegDataset docstring (#2322)
    • Fix the bug in the visualization step (#2326)
    • Fix ignore class id from -1 to 255 in BaseSegDataset (#2332)
    • Fix KNet IterativeDecodeHead bug (#2334)
    • Add input argument for datasets (#2379)
    • Fix typo in warning on binary classification (#2382)

    Enhancement

    • Fix ci for 1.x (#2011, #2019)
    • Fix lint and pre-commit hook (#2308)
    • Add data string in .gitignore file in dev-1.x branch (#2336)
    • Make scipy as a default dependency in runtime (#2362)
    • Delete mmcls in runtime.txt (#2368)

    Documentation

    • Update configuration documentation (#2048)
    • Update inference documentation (#2052)
    • Update the documentation for model training and testing (#2061)
    • Update get started documentation (#2148)
    • Update transforms documentation (#2088)
    • Add MMEval projects link in README (#2259)
    • Translate the visualization documentation (#2298)

    New Contributors

    • @nijkah made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/2024
    • @matrixgame2018 made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/2148
    • @kitecats made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/2259
    • @nulam made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/2382
  • v0.29.1(Nov 3, 2022)

    v0.29.1 (11/3/2022)

    New Features

    • Add model ensemble tools (#2218)

    Bug Fixes

    • Use SyncBN in MobileNetV2 (#2207)

    Documentation

    • Update FAQ doc about binary segmentation and ReduceZeroLabel (#2206)
    • Fix typos (#2249)
    • Fix model results (#2190, #2114)

    Contributors

    • @isLinXu made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/2219
    • @zhijiejia made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/2218
    • @lee-jinhee made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/2249
  • v1.0.0rc1(Nov 2, 2022)

    Changelog

    v1.0.0rc1 (11/2/2022)

    Highlights

    • Support PoolFormer (#2191)
    • Add Decathlon dataset (#2227)

    Features

    • Add BioMedical data loading (#2176)
    • Add LIP dataset (#2251)
    • Add GenerateEdge data transform (#2210)

    Bug fix

    • Fix segmenter-vit-s_fcn config (#2037)
    • Fix binary segmentation (#2101)
    • Fix MMSegmentation colab demo (#2089)
    • Fix ResizeToMultiple transform (#2185)
    • Use SyncBN in mobilenet_v2 (#2198)
    • Fix typo in installation (#2175)
    • Fix typo in visualization.md (#2116)

    Enhancement

    • Add mim extras_requires in setup.py (#2012)
    • Fix CI (#2029)
    • Remove ops module (#2063)
    • Add pyupgrade pre-commit hook (#2078)
    • Add out_file in add_datasample of SegLocalVisualizer to directly save image (#2090)
    • Upgrade pre commit hooks (#2154)
    • Ignore test timm in CI when torch<1.7 (#2158)
    • Update requirements (#2186)
    • Fix Windows platform CI (#2202)

    Documentation

    • Add Overview documentation (#2042)
    • Add Evaluation documentation (#2077)
    • Add Migration documentation (#2066)
    • Add Structures documentation (#2070)
    • Add Structures ZN documentation (#2129)
    • Add Engine ZN documentation (#2157)
    • Update Prepare datasets and Visualization doc (#2054)
    • Update Models documentation (#2160)
    • Update Add New Modules documentation (#2067)
    • Fix the installation commands in get_started.md (#2174)
    • Add MMYOLO to README.md (#2220)

    New Contributors

    • @ice-tong made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/2012
    • @Li-Qingyun made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/2220
  • v0.29.0(Oct 10, 2022)

    Changelog

    v0.29.0 (10/10/2022)

    New Features

    • Support PoolFormer (CVPR'2022) (#1537)

    Enhancement

    • Improve structure and readability for FCNHead (#2142)
    • Support IterableDataset in distributed training (#2151)
    • Upgrade .dev scripts (#2020)
    • Upgrade pre-commit hooks (#2155)

    Bug Fixes

    • Fix mmseg.api.inference inference_segmentor (#1849)
    • Fix bug about label_map in evaluation part (#2075)
    • Add missing dependencies to torchserve docker file (#2133)
    • Fix ddp unittest (#2060)

    Contributors

    • @jinwonkim93 made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/1849
    • @rlatjcj made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/2075
    • @ShirleyWangCVR made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/2151
    • @mangelroman made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/2133
  • v0.28.0(Sep 8, 2022)

    Changelog

    v0.28.0 (9/8/2022)

    New Features

    • Support Tversky Loss (#1896)
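For readers unfamiliar with it, the Tversky loss generalizes the Dice loss by weighting false positives and false negatives separately. The dependency-free sketch below illustrates the formula for a single binary mask. The function name, the parameter names, the default values and the convention that `alpha` weights false positives are assumptions for illustration, not the signature of the loss added in this release.

```python
def tversky_loss(pred, target, alpha=0.3, beta=0.7, smooth=1.0):
    """Tversky loss for flat sequences of probabilities and 0/1 labels.

    index = (TP + smooth) / (TP + alpha * FP + beta * FN + smooth)
    loss  = 1 - index
    """
    tp = sum(p * t for p, t in zip(pred, target))
    fp = sum(p * (1 - t) for p, t in zip(pred, target))
    fn = sum((1 - p) * t for p, t in zip(pred, target))
    index = (tp + smooth) / (tp + alpha * fp + beta * fn + smooth)
    return 1.0 - index


pred = [1, 1, 0, 0]
target = [1, 0, 1, 0]

# With alpha = beta = 0.5 and no smoothing, the loss reduces to the Dice loss.
dice_like = tversky_loss(pred, target, alpha=0.5, beta=0.5, smooth=0.0)
# A larger beta penalizes false negatives more strongly than false positives.
recall_weighted = tversky_loss(pred, target, alpha=0.3, beta=0.7, smooth=0.0)
print(dice_like, recall_weighted)
```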

    Bug Fixes

    Contributors

    • @suchot made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/1844
    • @TimoK93 made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/1992
  • v1.0.0rc0(Aug 31, 2022)

    We are excited to announce the release of MMSegmentation 1.0.0rc0. MMSeg 1.0.0rc0 is the first version of MMSegmentation 1.x, a part of the OpenMMLab 2.0 projects. Built upon the new training engine, MMSeg 1.x unifies the interfaces of dataset, models, evaluation, and visualization with faster training and testing speed.

    Highlights

    1. New engines. MMSeg 1.x is based on MMEngine, which provides a general and powerful runner that allows more flexible customizations and significantly simplifies the entrypoints of high-level interfaces.

    2. Unified interfaces. As a part of the OpenMMLab 2.0 projects, MMSeg 1.x unifies and refactors the interfaces and internal logic of training, testing, datasets, models, evaluation, and visualization. All the OpenMMLab 2.0 projects share the same design in those interfaces and logic to allow the emergence of multi-task/modality algorithms.

    3. Faster speed. We optimize the training and inference speed for common models.

    4. New features:

      • Support TverskyLoss function
    5. More documentation and tutorials. We add a bunch of documentation and tutorials to help users get started more smoothly. Read it here.

    Breaking Changes

    We briefly list the major breaking changes here. We will update the migration guide to provide complete details and migration instructions.

    Training and testing

    • MMSeg 1.x runs on PyTorch>=1.6. We have deprecated the support of PyTorch 1.5 to embrace the mixed precision training and other new features since PyTorch 1.6. Some models can still run on PyTorch 1.5, but the full functionality of MMSeg 1.x is not guaranteed.

    • MMSeg 1.x uses Runner in MMEngine rather than that in MMCV. The new Runner implements and unifies the building logic of dataset, model, evaluation, and visualizer. Therefore, MMSeg 1.x no longer maintains the building logic of those modules in mmseg.train.apis and tools/train.py. That code has been migrated into MMEngine. Please refer to the migration guide of Runner in MMEngine for more details.

    • The Runner in MMEngine also supports testing and validation. The testing scripts are also simplified and build the runner with logic similar to that of the training scripts.

    • The execution points of hooks in the new Runner have been enriched to allow more flexible customization. Please refer to the migration guide of Hook in MMEngine for more details.

    • Learning rate and momentum scheduling has been migrated from Hook to Parameter Scheduler in MMEngine. Please refer to the migration guide of Parameter Scheduler in MMEngine for more details.

    Configs

    Components

    • Dataset
    • Data Transforms
    • Model
    • Evaluation
    • Visualization

    Improvements

    • Support mixed precision training of all the models. However, some models may produce NaN results due to numerical issues. We will update the documentation and list their results (accuracy or failure) of mixed precision training.

    Bug Fixes

    • Fix several config file errors #1994

    New Features

    1. Support data structures and encapsulating seg_logits in data samples, which can be returned from models to support more common evaluation metrics.

    Ongoing changes

    1. Test-time augmentation: this feature, which is supported in MMSeg 0.x, is not implemented in this version due to limited time. We will support it in the following releases with a new and simplified design.

    2. Inference interfaces: unified inference interfaces will be supported in the future to ease the use of released models.

    3. Interfaces of useful tools that can be used in notebooks: more of the useful tools implemented in the tools directory will get Python interfaces so that they can be used in notebooks and in downstream libraries.

    4. Documentation: we will add more design docs, tutorials, and migration guidance so that the community can deep dive into our new design, participate in future development, and smoothly migrate downstream libraries to MMSeg 1.x.

  • v0.27.0(Jul 28, 2022)

    Changelog

    v0.27.0 (7/28/2022)

    Enhancement

    • Add Swin-L Transformer models (#1471)
    • Update ERFNet results (#1744)

    Bug Fixes

    Contributors

    • @DataSttructure made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/1802
    • @AkideLiu made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/1785
    • @mawanda-jun made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/1761
    • @Yan-Daojiang made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/1755
  • v0.26.0(Jul 1, 2022)

    Highlights

    • Update New SegFormer models on ADE20K (1705)
    • Dedicated MMSegWandbHook for MMSegmentation (1603)

    New Features

    • Update New SegFormer models on ADE20K (1705)
    • Dedicated MMSegWandbHook for MMSegmentation (1603)
    • Add UPerNet r18 results (1669)

    Enhancement

    • Keep dimension of cls_token_weight for easier ONNX deployment (1642)
    • Support inference with padding (1607)

    Bug Fixes

    Documentation

    • Fix mdformat version to support python3.6 and remove ruby installation (1672)

    New Contributors

    • @RunningLeon made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/1642
    • @zhouzaida made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/1655
    • @tkhe made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/1667
    • @rotorliu made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/1656
    • @EvelynWang-0423 made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/1679
    • @ZhaoYi1222 made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/1616
    • @Sanster made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/1704
    • @ayulockin made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/1603

    Full Changelog: https://github.com/open-mmlab/mmsegmentation/compare/v0.25.0...v0.26.0

  • v0.25.0(Jun 2, 2022)

    What's Changed

    Highlights

    • Support PyTorch backend on MLU (1515)

    Bug Fixes

    • Fix the error of BCE loss when batch size is 1 (1629)
    • Fix bug of resize function when align_corners is True (1592)
    • Fix Dockerfile to run demo script in docker container (1568)
    • Correct inference_demo.ipynb path (1576)
    • Fix the build_segmentor in colab demo (1551)
    • Fix md2yml script (1633, 1555)
    • Fix main line link in MAE README.md (1556)
    • Fix fastfcn crop_size in README.md (1597)
    • Pip upgrade when testing windows platform (1610)

    Improvements

    • Delete DS_Store file (1549)
    • Revise owners.yml (1621, 1534)

    Documentation

    • Rewrite the installation guidance (1630)
    • Format readme (1635)
    • Replace markdownlint with mdformat to avoid ruby installation (1591)
    • Add explanation and usage instructions for data configuration (1548)
    • Configure Myst-parser to parse anchor tag (1589)
    • Update QR code and link for QQ group (1598, 1574)

    Contributors

    • @atinfinity made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/1568
    • @DoubleChuang made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/1576
    • @alpha-baymax made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/1515
    • @274869388 made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/1629

    Full Changelog: https://github.com/open-mmlab/mmsegmentation/compare/v0.24.1...v0.25.0

  • v0.24.1(May 1, 2022)

  • v0.24.0(Apr 29, 2022)

    What's Changed

    Highlights

    • Support MAE: Masked Autoencoders Are Scalable Vision Learners
    • Support Resnet strikes back

    New Features

    • Support MAE: Masked Autoencoders Are Scalable Vision Learners (#1307, #1523)
    • Support Resnet strikes back (#1390)
    • Support extra dataloader settings in configs (#1435)

    Bug Fixes

    • Fix input previous results for the last cascade_decode_head (#1450)
    • Fix validation loss logging (#1494)
    • Fix the bug in binary_cross_entropy (#1527)
    • Support single channel prediction for Binary Cross Entropy Loss (#1454)
    • Fix potential bugs in accuracy.py (#1496)
    • Avoid converting label ids twice by label map during evaluation (#1417)
    • Fix bug about label_map (#1445)
    • Fix image save path bug in Windows (#1423)
    • Fix MMSegmentation Colab demo (#1501, #1452)
    • Migrate azure blob for beit checkpoints (#1503)
    • Fix bug in tools/analyse_logs.py caused by wrong plot_iter in some cases (#1428)

    Improvements

    • Merge BEiT and ConvNext's LR decay optimizer constructors (#1438)
    • Register optimizer constructor with mmseg (#1456)
    • Refactor transformer encode layer in ViT and BEiT backbone (#1481)
    • Add build_pos_embed and build_layers for BEiT (#1517)
    • Add with_cp to mit and vit (#1431)
    • Fix inconsistent dtype of seg_label in stdc decode (#1463)
    • Delete random seed for training in dist_train.sh (#1519)
    • Revise high workers_per_gpu in config file (#1506)
    • Add GPG keys and del mmcv version in Dockerfile (#1534)
    • Update checkpoint for model in deeplabv3plus (#1487)
    • Add DistSamplerSeedHook to set epoch number to dataloader when runner is EpochBasedRunner (#1449)
    • Provide URLs of Swin Transformer pretrained models (#1389)
    • Update Dockerfiles in the docker directory and get_started.md to the latest stable versions of Python, PyTorch and MMCV (#1446)

    Documentation

    • Add a clearer statement of CPU training/inference (#1518)

    New Contributors

    • @jiangyitong made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/1431
    • @kahkeng made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/1447
    • @Nourollah made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/1446
    • @androbaza made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/1452
    • @Yzichen made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/1445
    • @whu-pzhang made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/1423
    • @panfeng-hover made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/1417
    • @Johnson-Wang made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/1496
    • @jere357 made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/1460
    • @mfernezir made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/1494
    • @donglixp made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/1503
    • @YuanLiuuuuuu made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/1307
    • @Dawn-bin made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/1527

    Full Changelog: https://github.com/open-mmlab/mmsegmentation/compare/v0.23.0...v0.24.0

  • v0.23.0(Apr 1, 2022)

    What's Changed

    Highlights

    • Support BEiT: BERT Pre-Training of Image Transformers
    • Support K-Net: Towards Unified Image Segmentation
    • Add avg_non_ignore of CELoss to support average loss over non-ignored elements
    • Support dataset initialization with file client

    New Features

    • Support BEiT: BERT Pre-Training of Image Transformers (#1404)
    • Support K-Net: Towards Unified Image Segmentation (#1289)
    • Support dataset initialization with file client (#1402)
    • Add class name function for STARE datasets (#1376)
    • Support different seeds on different ranks when distributed training (#1362)
    • Add nlc2nchw2nlc and nchw2nlc2nchw to simplify tensor with different dimension operation (#1249)

    Improvements

    • Synchronize random seed for distributed sampler (#1411)
    • Add script and documentation for multi-machine distributed training (#1383)

    Bug Fixes

    • Add avg_non_ignore of CELoss to support average loss over non-ignored elements (#1409)
    • Fix some wrong URLs of models or logs in ./configs (#1336)
    • Add title and color theme arguments to plot function in tools/confusion_matrix.py (#1401)
    • Fix outdated link in Colab demo (#1392)
    • Fix typos (#1424, #1405, #1371, #1366, #1363)
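The avg_non_ignore entry above concerns how a per-element loss is averaged when some elements carry an ignore label. The dependency-free sketch below illustrates the difference between dividing by all elements and dividing only by the non-ignored ones. The function name and signature are hypothetical, and the ignore value 255 is taken from the configs and notes elsewhere on this page; this is not the library's implementation.

```python
import math


def cross_entropy(probs, labels, ignore_index=255, avg_non_ignore=False):
    """Mean cross-entropy over a flat list of elements.

    probs:  one list of class probabilities per element
    labels: one class id per element; ignore_index marks elements to skip
    """
    # Ignored elements contribute no loss in either mode.
    losses = [-math.log(p[y]) for p, y in zip(probs, labels) if y != ignore_index]
    # The two modes differ only in the denominator of the average.
    denom = len(losses) if avg_non_ignore else len(labels)
    return sum(losses) / denom if denom else 0.0


probs = [[0.5, 0.5]] * 4
labels = [0, 1, 255, 255]  # half of the elements are ignored

mean_over_all = cross_entropy(probs, labels, avg_non_ignore=False)
mean_over_valid = cross_entropy(probs, labels, avg_non_ignore=True)
print(mean_over_all, mean_over_valid)
```

With half of the elements ignored, the average over all elements is half of the average over the valid ones, so the loss scale depends on how much of each image is ignored unless the non-ignored average is used.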

    Documentation

    • Add FAQ document (#1420)
    • Fix the config name style description in official docs(#1414)

    New Contributors

    • @kinglintianxia made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/1371
    • @CCODING04 made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/1376
    • @mob5566 made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/1401
    • @xiongnemo made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/1392
    • @Xiangxu-0103 made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/1405
  • v0.22.1(Mar 9, 2022)

  • v0.22.0(Mar 4, 2022)

    Highlights

    • Support ConvNeXt: A ConvNet for the 2020s. Please use the latest MMClassification (0.21.0) to try it out.
    • Support iSAID aerial Dataset.
    • Officially Support inference on Windows OS.

    New Features

    • Support ConvNeXt: A ConvNet for the 2020s. (#1216)
    • Support iSAID aerial Dataset. (#1115)
    • Generating and plotting confusion matrix. (#1301)

    Improvements

    • Refactor 4 decoder heads (ASPP, FCN, PSP, UPer): Split forward function into _forward_feature and cls_seg. (#1299)
    • Add min_size arg in Resize to keep the shape after resize bigger than slide window. (#1318)
    • Revise pre-commit-hooks. (#1315)
    • Add win-ci. (#1296)

    Bug Fixes

    • Fix mlp_ratio type in Swin Transformer. (#1274)
    • Fix path errors in ./demo . (#1269)
    • Fix bug in conversion of potsdam. (#1279)
    • Make accuracy take into account ignore_index. (#1259)
    • Add PyTorch HardSwish assertion in unit test. (#1294)
    • Fix wrong palette value in vaihingen. (#1292)
    • Fix the bug that SETR cannot load pretrain. (#1293)
    • Correct the In Collection field in the metafile of each config. (#1239)
    • Upload completed STDC models. (#1332)
    • Fix DNLHead exports onnx inference difference type Cast error. (#1161)

    Contributors

    • @JiaYanhao made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/1269
    • @andife made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/1281
    • @SBCV made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/1279
    • @HJoonKwon made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/1259
    • @Tsingularity made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/1290
    • @Waterman0524 made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/1115
    • @MeowZheng made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/1315
    • @linfangjian01 made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/1318
  • v0.21.1(Feb 9, 2022)

    Bug Fixes

    • Fix repeating log by setup_multi_processes. (#1267)
    • Fix typos in docs. (#1263)
    • Upgrade isort in pre-commit hook. (#1270)

    Improvements

    • Use MMCV load_state_dict function in ViT/Swin. (#1272)
    • Add exception for PointRend to support CPU-only usage. (#1271)

    New Contributors

    • @RangeKing made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/1263
  • v0.21.0(Jan 29, 2022)

    Highlights

    • Officially support CPU training and inference. Please use the latest MMCV (1.4.4) to try it out.
    • Support Segmenter: Transformer for Semantic Segmentation (ICCV'2021).
    • Support ISPRS Potsdam and Vaihingen Dataset.
    • Add Mosaic transform and MultiImageMixDataset class in dataset_wrappers.

    New Features

    • Support Segmenter: Transformer for Semantic Segmentation (ICCV'2021) (#955)
    • Support ISPRS Potsdam and Vaihingen Dataset (#1097, #1171)
    • Add SegFormer's benchmark on cityscapes (#1155)
    • Add auto resume (#1172)
    • Add Mosaic transform and MultiImageMixDataset class in dataset_wrappers (#1093, #1105)
    • Add log collector (#1175)

    Improvements

    • New-style CPU training and inference (#1251)
    • Add UNet benchmark with multiple losses supervision (#1143)

    Bug Fixes

    • Fix the model statistics in doc for readthedoc (#1153)
    • Set random seed for palette if not given (#1152)
    • Add COCOStuffDataset in class_names.py (#1222)
    • Fix bug in non-distributed multi-gpu training/testing (#1247)
    • Delete unnecessary lines of STDCHead (#1231)

    New Contributors

    • @jbwang1997 made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/1152
    • @BeaverCC made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/1206
    • @Echo-minn made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/1214
    • @rstrudel made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/955
  • v0.20.2(Dec 15, 2021)

    What's Changed

    • [Fix] Revise --option to --options in https://github.com/open-mmlab/mmsegmentation/pull/1140.

    This version is published to avoid the BC-breaking problem caused by v0.20.1.

    Contributors: @RockeyCoss

  • v0.20.1(Dec 14, 2021)

  • v0.20.0(Dec 10, 2021)

    Highlights

    • Support Twins (#989)
    • Support a real-time segmentation model STDC (#995)
    • Support ERFNet, a segmentation model widely used in lane detection (#960)
    • Support LoveDA, a remote sensing land-cover dataset (#1028)
    • Support focal loss (#1024)

    New Features

    • Support Twins (#989)
    • Support a real-time segmentation model STDC (#995)
    • Support ERFNet, a segmentation model widely used in lane detection (#960)
    • Add SETR cityscapes benchmark (#1087)
    • Add BiSeNetV1 COCO-Stuff 164k benchmark (#1019)
    • Support focal loss (#1024)
    • Add Cutout transform (#1022)
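
    The focal loss listed above can be illustrated with a minimal pure-Python sketch for a single binary prediction. This is not the MMSegmentation implementation from #1024: the function name binary_focal_loss, the eps clamp, and the default gamma and alpha values are assumptions chosen for illustration.

```python
import math


def binary_focal_loss(prob, target, gamma=2.0, alpha=0.25, eps=1e-12):
    """Focal loss for one binary prediction (illustrative sketch only).

    prob   -- predicted probability of the positive class, in [0, 1]
    target -- ground-truth label, 0 or 1
    gamma  -- focusing parameter; larger values down-weight easy examples
    alpha  -- weight given to the positive class (1 - alpha for negatives)
    """
    # Probability the model assigned to the true class.
    p_t = prob if target == 1 else 1.0 - prob
    # Clamp away from zero so the logarithm stays finite.
    p_t = max(p_t, eps)
    # Class-balancing weight for the true class.
    alpha_t = alpha if target == 1 else 1.0 - alpha
    # Cross-entropy scaled by (1 - p_t) ** gamma.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
```

    With gamma set to 0 the modulating factor disappears and the expression reduces to alpha-weighted cross-entropy, which is a convenient sanity check.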

    Improvements

    • Set a random seed when the user does not set a seed (#1039)
    • Add CircleCI setup (#1086)
    • Skip CI on ignoring given paths (#1078)
    • Add abstract and image for every paper (#1060)
    • Create a symbolic link on Windows (#1090)
    • Support video demo using trained model (#1014)
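
    The seed fallback mentioned in the list above (#1039) follows a common pattern, sketched here in plain Python. The helper name resolve_seed and the 31-bit range are hypothetical; the actual change may differ, for example in how the seed is shared across distributed processes.

```python
import random


def resolve_seed(seed=None):
    """Return the user's seed, or draw a random one when none is given."""
    if seed is None:
        # No seed supplied: pick one so the run can still be reproduced
        # later from the value that gets logged.
        seed = random.randint(0, 2**31 - 1)
    # Seed Python's RNG; a real training script would also seed
    # NumPy and PyTorch here.
    random.seed(seed)
    return seed
```

    Logging the returned value is what makes the run reproducible after the fact.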

    Bug Fixes

    • Fix incorrectly loading init_cfg or pretrained models of several transformer models (#999, #1069, #1102)
    • Fix EfficientMultiheadAttention in SegFormer (#1003)
    • Remove fp16 folder in configs (#1031)
    • Fix several typos in .yml file (Dice Metric #1041, ADE20K dataset #1120, Training Memory (GB) #1083)
    • Fix test error when using --show-dir (#1091)
    • Fix dist training infinite waiting issue (#1035)
    • Change the upper version of mmcv to 1.5.0 (#1096)
    • Fix symlink failure on Windows (#1038)
    • Cancel previous runs that are not completed (#1118)
    • Unify readthedocs links in docs (#1119)

    Contributors

    • @Junjue-Wang made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/1028
    • @ddebby made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/1066
    • @del-zhenwu made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/1078
    • @KangBK0120 made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/1106
    • @zergzzlun made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/1091
    • @fingertap made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/1035
    • @irvingzhang0512 made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/1014
    • @littleSunlxy made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/989
    • @lkm2835
    • @RockeyCoss
    • @MengzhangLI
    • @Junjun2016
    • @xiexinch
    • @xvjiarui
  • v0.19.0(Nov 2, 2021)

    Highlights

    • Support TIMMBackbone wrapper (#998)
    • Support custom hook (#428)
    • Add codespell pre-commit hook (#920)
    • Add FastFCN benchmark on ADE20K (#972)

    New Features

    • Support TIMMBackbone wrapper (#998)
    • Support custom hook (#428)
    • Add FastFCN benchmark on ADE20K (#972)
    • Add codespell pre-commit hook and fix typos (#920)

    Improvements

    • Make inputs & channels smaller in unittests (#1004)
    • Change self.loss_decode back to dict in Single Loss situation (#1002)

    Bug Fixes

    • Fix typo in usage example (#1003)
    • Add contiguous after permutation in ViT (#992)
    • Fix the invalid link (#985)
    • Fix bug in CI with python 3.9 (#994)
    • Fix bug when loading class names from file in custom dataset (#923)

    Contributors

    • @ShoupingShan made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/923
    • @RockeyCoss made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/954
    • @HarborYuan made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/992
    • @lkm2835 made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/1003
    • @gszh made their first contribution in https://github.com/open-mmlab/mmsegmentation/pull/428
    • @xvjiarui
    • @VVsssssk
    • @MengzhangLI
    • @Junjun2016
  • v0.18.0(Oct 7, 2021)

    Highlights

    • Support three real-time segmentation models (ICNet #884, BiSeNetV1 #851, and BiSeNetV2 #804)
    • Support one efficient segmentation model (FastFCN #885)
    • Support one efficient non-local/self-attention based segmentation model (ISANet #70)
    • Support COCO-Stuff 10k and 164k datasets (#625)
    • Support evaluating concatenated datasets separately (#833)
    • Support loading GT for evaluation from multi-file backend (#867)

    New Features

    • Support three real-time segmentation models (ICNet #884, BiSeNetV1 #851, and BiSeNetV2 #804)
    • Support one efficient segmentation model (FastFCN #885)
    • Support one efficient non-local/self-attention based segmentation model (ISANet #70)
    • Support COCO-Stuff 10k and 164k datasets (#625)
    • Support evaluating concatenated datasets separately (#833)

    Improvements

    • Support loading GT for evaluation from multi-file backend (#867)
    • Automatically convert SyncBN to BN when training on DP (#772)
    • Refactor Swin-Transformer (#800)

    Bug Fixes

    • Update mmcv installation in dockerfile (#860)
    • Fix number of iteration bug when resuming checkpoint in distributed train (#866)
    • Fix parsing parse in val_step (#906)
  • v0.17.0(Sep 1, 2021)

    Highlights

    • Support SegFormer
    • Support DPT
    • Support Dark Zurich and Nighttime Driving datasets
    • Support progressive evaluation

    New Features

    • Support SegFormer (#599)
    • Support DPT (#605)
    • Support Dark Zurich and Nighttime Driving datasets (#815)
    • Support progressive evaluation (#709)

    Improvements

    • Add multiscale_output interface and unittests for HRNet (#830)
    • Support inheriting the Cityscapes dataset (#750)
    • Fix some typos in README.md (#824)
    • Delete convert function and add instruction to ViT/Swin README.md (#791)
    • Add vit/swin/mit convert weight scripts (#783)
    • Add copyright files (#796)

    Bug Fixes

    • Fix invalid checkpoint link in inference_demo.ipynb (#814)
    • Ensure that items in dataset have the same order across multi machine (#780)
    • Fix the log error (#766)
  • v0.16.0(Aug 4, 2021)

    Highlights

    • Support PyTorch 1.9
    • Support SegFormer backbone MiT
    • Support md2yml pre-commit hook
    • Support frozen stage for HRNet

    New Features

    • Support SegFormer backbone MiT (#594)
    • Support md2yml pre-commit hook (#732)
    • Support mim (#717)
    • Add mmseg2torchserve tool (#552)

    Improvements

    • Support hrnet frozen stage (#743)
    • Add template of reimplementation questions (#741)
    • Output pdf and epub formats for readthedocs (#742)
    • Refine the docstring of ResNet (#723)
    • Replace interpolate with resize (#731)
    • Update resource limit (#700)
    • Update config.md (#678)

    Bug Fixes

    • Fix ATTENTION registry (#729)
    • Fix analyze log script (#716)
    • Fix doc api display (#725)
    • Fix patch_embed and pos_embed mismatch error (#685)
    • Fix efficient test for multi-node (#707)
    • Fix init_cfg in resnet backbone (#697)
    • Fix efficient test bug (#702)
    • Fix url error in config docs (#680)
    • Fix mmcv installation (#676)
    • Fix torch version (#670)

    Contributors

    @sshuair @xiexinch @Junjun2016 @mmeendez8 @xvjiarui @sennnnn @puhsu @BIGWangYuDong @keke1u @daavoo

  • v0.15.0(Jul 4, 2021)

    Highlights

    • Support ViT, SETR, and Swin-Transformer
    • Add Chinese documentation
    • Unified parameter initialization

    Bug Fixes

    • Fix typo and links (#608)
    • Fix Dockerfile (#607)
    • Fix ViT init (#609)
    • Fix mmcv version compatibility table (#658)
    • Fix model links of DMNet and UNet (#660)

    New Features

    • Support loading DeiT weights (#538)
    • Support SETR (#531, #635)
    • Add config and models for ViT backbone with UperHead (#520, #635)
    • Support Swin-Transformer (#511)
    • Add higher accuracy FastSCNN (#606)
    • Add Chinese documentation (#666)

    Improvements

    • Unified parameter initialization (#567)
    • Separate CUDA and CPU in github action CI (#602)
    • Support persistent dataloader worker (#646)
    • Update meta file fields (#661, #664)
  • v0.14.0(Jun 3, 2021)

    Highlights

    • Support ONNX to TensorRT
    • Support MIM

    Bug Fixes

    • Fix ONNX to TensorRT verify (#547)
    • Fix save best for EvalHook (#575)

    New Features

    • Support loading DeiT weights (#538)
    • Support ONNX to TensorRT (#542)
    • Support output results for ADE20k (#544)
    • Support MIM (#549)

    Improvements

    • Add option for ViT output shape (#530)
    • Infer batch size using len(result) (#532)
    • Add compatibility table between MMSeg and MMCV (#558)
  • v0.13.0(May 5, 2021)

    Highlights

    • Support Pascal Context Class-59 dataset.
    • Support Visual Transformer Backbone.
    • Support mFscore metric.

    Bug Fixes

    • Fixed Colaboratory tutorial (#451)
    • Fixed mIoU calculation range (#471)
    • Fixed sem_fpn, unet README.md (#492)
    • Fixed num_classes in FCN for Pascal Context 60-class dataset (#488)
    • Fixed FP16 inference (#497)

    New Features

    • Support dynamic export and visualization in pytorch2onnx (#463)
    • Support export to torchscript (#469, #499)
    • Support Pascal Context Class-59 dataset (#459)
    • Support Visual Transformer backbone (#465)
    • Support UpSample Neck (#512)
    • Support mFscore metric (#509)
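
    As a rough illustration of the mFscore metric listed above, the sketch below computes a per-class F-score from flat lists of predicted and ground-truth labels and averages it over classes. The function name mean_fscore, the beta parameter, and the rule of skipping classes absent from both inputs are assumptions; the implementation in #509 may handle these details differently.

```python
def mean_fscore(pred, label, num_classes, beta=1.0):
    """Mean per-class F-score over flat label sequences (sketch only)."""
    tp = [0] * num_classes          # pixels where prediction == label == c
    pred_area = [0] * num_classes   # pixels predicted as class c
    label_area = [0] * num_classes  # pixels labelled as class c
    for p, l in zip(pred, label):
        pred_area[p] += 1
        label_area[l] += 1
        if p == l:
            tp[p] += 1

    scores = []
    for c in range(num_classes):
        if pred_area[c] + label_area[c] == 0:
            continue  # class absent from both inputs: leave it out
        precision = tp[c] / pred_area[c] if pred_area[c] else 0.0
        recall = tp[c] / label_area[c] if label_area[c] else 0.0
        if precision + recall == 0.0:
            scores.append(0.0)
            continue
        b2 = beta * beta
        scores.append((1 + b2) * precision * recall / (b2 * precision + recall))
    return sum(scores) / len(scores) if scores else 0.0
```

    With beta equal to 1 this is the mean of the per-class harmonic means of precision and recall.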

    Improvements

    • Add more CI for PyTorch (#460)
    • Add print model graph args for tools/print_config.py (#451)
    • Add cfg links in modelzoo README.md (#468)
    • Add BaseSegmentor import to segmentors/__init__.py (#495)
    • Add MMOCR, MMGeneration links (#501, #506)
    • Add Chinese QR code (#506)
    • Use MMCV MODEL_REGISTRY (#515)
    • Add ONNX testing tools (#498)
    • Replace data_dict calling 'img' key to support MMDet3D (#514)
    • Support reading class_weight from file in loss function (#513)
    • Make tags as comment (#505)
    • Use MMCV EvalHook (#438)
  • v0.12.0(Apr 4, 2021)

    Highlights

    • Support FCN-Dilate 6 model.
    • Support Dice Loss.

    Bug Fixes

    • Fixed PhotoMetricDistortion Doc (#388)
    • Fixed install scripts (#399)
    • Fixed Dice Loss multi-class (#417)

    New Features

    • Support Dice Loss (#396)
    • Add plot logs tool (#426)
    • Add opacity option to show_result (#425)
    • Speed up mIoU metric (#430)
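
    For readers unfamiliar with the Dice Loss listed above, here is a minimal sketch of a soft Dice loss for a single binary mask, written in plain Python. The function name soft_dice_loss and the smooth term are assumptions; published variants differ (some square the terms in the denominator), and this is not the code added in #396.

```python
def soft_dice_loss(probs, targets, smooth=1.0):
    """Soft Dice loss for one binary mask (illustrative sketch only).

    probs   -- predicted foreground probabilities, one per pixel
    targets -- ground-truth values (0 or 1), one per pixel
    smooth  -- constant that avoids division by zero on empty masks
    """
    # Overlap between the prediction and the ground truth.
    intersection = sum(p * t for p, t in zip(probs, targets))
    # Total predicted mass plus total ground-truth mass.
    total = sum(probs) + sum(targets)
    dice = (2.0 * intersection + smooth) / (total + smooth)
    # A perfect overlap gives dice == 1, so the loss is 0.
    return 1.0 - dice
```

    Because the score is a ratio of overlap to total area, it is less sensitive to class imbalance than a per-pixel loss, which is the usual reason for choosing it.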

    Improvements

    • Refactor unittest file structure (#440)
    • Fix typos in the repo (#449)
    • Include class-level metrics in the log (#445)
  • v0.11.0(Feb 2, 2021)

    Highlights

    • Support memory efficient test, add more UNet models.

    Bug Fixes

    • Fixed TTA resize scale (#334)
    • Fixed CI for pip 20.3 (#307)
    • Fixed ADE20k test (#359)

    New Features

    • Support memory efficient test (#330)
    • Add more UNet benchmarks (#324)
    • Support Lovasz Loss (#351)

    Improvements

    • Move train_cfg/test_cfg inside model (#341)
  • v0.10.0(Jan 2, 2021)

    Highlights

    • Support MobileNetV3, DMNet, APCNet. Add models of ResNet18V1b, ResNet18V1c, ResNet50V1b, ResNet101V1b.

    Bug Fixes

    • Fixed CPU TTA (#276)
    • Fixed CI for pip 20.3 (#307)

    New Features

    • Add ResNet18V1b, ResNet18V1c, ResNet50V1b models (#316)
    • Support MobileNetV3 (#268)
    • Add 4 retinal vessel segmentation benchmarks (#315)
    • Support DMNet (#313)
    • Support APCNet (#299)

    Improvements

    • Refactor Documentation page (#311)
    • Support resize data augmentation according to original image size (#291)