An Official Repo of CVPR '20 "MSeg: A Composite Dataset for Multi-Domain Segmentation"

Overview

Linux CI

Creative Commons License

This is the code for the paper:

MSeg: A Composite Dataset for Multi-domain Semantic Segmentation (CVPR 2020, Official Repo) [CVPR PDF] [Journal PDF]
John Lambert*, Zhuang Liu*, Ozan Sener, James Hays, Vladlen Koltun
Presented at CVPR 2020. Link to MSeg Video (3min)

NEWS:

  • [Dec. 2021]: An updated journal-length version of our work is now available on ArXiv here.

This repo is the first of 4 repos that introduce our work. It provides utilities to download the MSeg dataset (which is nontrivial), and prepare the data on disk in a unified taxonomy.

Three additional repos are also provided:

  • mseg-semantic: provides HRNet-W48 Training (sufficient to train a winning entry on the WildDash benchmark)
  • mseg-panoptic: provides Panoptic-FPN and Mask-RCNN training, based on Detectron2 (will be introduced in January 2021)
  • mseg-mturk: utilities to perform large-scale Mechanical Turk re-labeling

Install the MSeg module:

  • mseg can be installed as a python package using

      pip install -e /path_to_root_directory_of_the_repo/
    

Make sure that you can run import mseg in python, and you are good to go!

Download MSeg

The MSeg Taxonomy

We provide comprehensive class definitions and examples here. We provide here a master spreadsheet mapping all training datasets to the MSeg Taxonomy, and the MSeg Taxonomy to test datasets. Please consult taxonomy_FAQ.md to learn what each of the dataset taxonomy names means.

Citing MSeg

If you find this code useful for your research, please cite:

@InProceedings{MSeg_2020_CVPR,
author = {Lambert, John and Liu, Zhuang and Sener, Ozan and Hays, James and Koltun, Vladlen},
title = {{MSeg}: A Composite Dataset for Multi-domain Semantic Segmentation},
booktitle = {Computer Vision and Pattern Recognition (CVPR)},
year = {2020}
}

Repo Structure

  • download_scripts: code and instructions to download the entire MSeg dataset
  • mseg: Python module, including
    • dataset_apis
    • dataset_lists: ordered classnames for each dataset, and corresponding relative rgb/label file paths
    • label_preparation: code for remapping to semseg format, and for relabeling masks in place
    • relabeled_data: MSeg data, annotated by Mechanical Turk workers, and verified by co-authors
    • taxonomy: on-the-fly mapping to a unified taxonomy during training, and linear mapping to evaluation taxonomies
    • utils: library functions for mask and image manipulation, filesystem, tsv/csv reading, and multiprocessing
  • tests: unit tests on all code

Data License

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.

Frequently Asked Questions (FAQ)

Q: Do the weights include the model structure or it's just the weights? If the latter, which model do these weights refer to? Under the models directory, there are several model implementations.

A: The pre-trained models follow the HRNet-W48 architecture. The model structure is defined in the code here. The saved weights provide a dictionary between keys (unique IDs for each weight identifying the corresponding layer/layer type) and values (the floating point weights).

Q: How is testing performed on the test datasets? In the paper you talk about "zero-shot transfer" -- how this is performed? Are the test dataset labels also mapped or included in the unified taxonomy? If you remapped the test dataset labels to the unified taxonomy, are the reported results the performances on the unified label space, or on each test dataset's original label space? How did you you obtain results on the WildDash dataset - which is evaluated by the server - when the MSeg taxonomy may be different from the WildDash dataset.

A: Regarding "zero-shot transfer", please refer to section "Using the MSeg taxonomy on a held-out dataset" on page 6 of our paper. This section describes how we hand-specify mappings from the unified taxonomy to each test dataset's taxonomy as a linear mapping (implemented here in mseg-api). All results are in the test dataset's original label space (i.e. if WildDash expects class indices in the range [0,18] per our names_list, our testing script uses the TaxonomyConverter transform_predictions_test() functionality to produce indices in that range, remapping probabilities.

Q: Why don't indices in MSeg_master.tsv match the training indices in individual datasets? For example, for the road class: In idd-39, road has index 0, but in idd-39-relabeled, road has index 19. It is index 7 in cityscapes-34. The cityscapes-19-relabeled index road is 11. As far as I can tell, ultimately the 'MSeg_Master.tsv' file provides the final mapping to the MSeg label space. But here, the road class seems to have an index of 98, which is neither 19 nor 11.

A: Indeed, unified taxonomy class index 98 represents "road". But we use the TaxonomyConverter to accomplish the mapping on the fly from idd-39-relabeled to the unified/universal taxonomy (we use the terms "unified" and "universal" interchangeably). This is done by adding a transform in the training loop that calls TaxonomyConverter.transform_label() on the fly. You can see how that transform is implemented here in mseg-semantic.

Q: When testing, but there are test classes that are not in the unified taxonomy (e.g. Parking, railtrack, bridge etc. in WildDash), how do you produce predictions for that class? I understand you map the predictions with a binary matrix. But what do you do when there's no one-to-one correspondence?

A: WildDash v1 uses the 19-class taxonomy for evaluation, just like Cityscapes. So we use the following script to remap the 34-class taxonomy to 19-class taxonomy for WildDash for testing inference and submission. You can see how Cityscapes evaluates just 19 of the 34 classes here in the evaluation script and in the taxonomy definition. However, bridge and rail track are actually included in our unified taxonomy, as you’ll see in MSeg_master.tsv.

Q: How are datasets images read in for training/inference? Should I use the dataset_apis from mseg-api?

A: The dataset_apis from mseg-api are not for training or inference. They are purely for generating the MSeg dataset labels on disk. We read in the datasets using mseg_semantic/utils/dataset.py and then remap them to the universal space on the fly.

NeROIC: Neural Object Capture and Rendering from Online Image Collections

NeROIC: Neural Object Capture and Rendering from Online Image Collections This repository is for the source code for the paper NeROIC: Neural Object C

Snap Research 647 Dec 27, 2022
Colab notebook for openai/glide-text2im.

GLIDE text2im on Colab This repository provides a Colab notebook to produce images conditioned on text prompts with GLIDE [1]. Usage Run text2im.ipynb

Wok 19 Oct 19, 2022
Do Neural Networks for Segmentation Understand Insideness?

This is part of the code to reproduce the results of the paper Do Neural Networks for Segmentation Understand Insideness? [pdf] by K. Villalobos (*),

biolins 0 Mar 20, 2021
Semi-supervised Implicit Scene Completion from Sparse LiDAR

Semi-supervised Implicit Scene Completion from Sparse LiDAR Paper Created by Pengfei Li, Yongliang Shi, Tianyu Liu, Hao Zhao, Guyue Zhou and YA-QIN ZH

114 Nov 30, 2022
Code for HodgeNet: Learning Spectral Geometry on Triangle Meshes, in SIGGRAPH 2021.

HodgeNet | Webpage | Paper | Video HodgeNet: Learning Spectral Geometry on Triangle Meshes Dmitriy Smirnov, Justin Solomon SIGGRAPH 2021 Set-up To ins

Dima Smirnov 61 Nov 27, 2022
Time-stretch audio clips quickly with PyTorch (CUDA supported)! Additional utilities for searching efficient transformations are included.

Time-stretch audio clips quickly with PyTorch (CUDA supported)! Additional utilities for searching efficient transformations are included.

Kento Nishi 22 Jul 07, 2022
Official MegEngine implementation of CREStereo(CVPR 2022 Oral).

[CVPR 2022] Practical Stereo Matching via Cascaded Recurrent Network with Adaptive Correlation This repository contains MegEngine implementation of ou

MEGVII Research 309 Dec 30, 2022
OoD Minimum Anomaly Score GAN - Code for the Paper 'OMASGAN: Out-of-Distribution Minimum Anomaly Score GAN for Sample Generation on the Boundary'

OMASGAN: Out-of-Distribution Minimum Anomaly Score GAN for Sample Generation on the Boundary Out-of-Distribution Minimum Anomaly Score GAN (OMASGAN) C

- 8 Sep 27, 2022
Neon: an add-on for Lightbulb making it easier to handle component interactions

Neon Neon is an add-on for Lightbulb making it easier to handle component interactions. Installation pip install git+https://github.com/neonjonn/light

Neon Jonn 9 Apr 29, 2022
Help you understand Manual and w/ Clutch point while driving.

简体中文 forza_auto_gear forza_auto_gear is a tool for Forza Horizon 5. It will help us understand the best gear shift point using Manual or w/ Clutch in

15 Oct 08, 2022
This is a repository for a Semantic Segmentation inference API using the Gluoncv CV toolkit

BMW Semantic Segmentation GPU/CPU Inference API This is a repository for a Semantic Segmentation inference API using the Gluoncv CV toolkit. The train

BMW TechOffice MUNICH 56 Nov 24, 2022
DLFlow is a deep learning framework.

DLFlow是一套深度学习pipeline,它结合了Spark的大规模特征处理能力和Tensorflow模型构建能力。利用DLFlow可以快速处理原始特征、训练模型并进行大规模分布式预测,十分适合离线环境下的生产任务。利用DLFlow,用户只需专注于模型开发,而无需关心原始特征处理、pipeline构建、生产部署等工作。

DiDi 152 Oct 27, 2022
This is the code for Deformable Neural Radiance Fields, a.k.a. Nerfies.

Deformable Neural Radiance Fields This is the code for Deformable Neural Radiance Fields, a.k.a. Nerfies. Project Page Paper Video This codebase conta

Google 1k Jan 09, 2023
GPU-accelerated Image Processing library using OpenCL

pyclesperanto pyclesperanto is a python package for clEsperanto - a multi-language framework for GPU-accelerated image processing. clEsperanto uses Op

17 Dec 25, 2022
Create time-series datacubes for supervised machine learning with ICEYE SAR images.

ICEcube is a Python library intended to help organize SAR images and annotations for supervised machine learning applications. The library generates m

ICEYE Ltd 65 Jan 03, 2023
Blender scripts for computing geodesic distance

GeoDoodle Geodesic distance computation for Blender meshes Table of Contents Overivew Usage Implementation Overview This addon provides an operator fo

20 Jun 08, 2022
Gif-caption - A straightforward GIF Captioner written in Python

Broksy's GIF Captioner Have you ever wanted to easily caption a GIF without havi

3 Apr 09, 2022
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis

HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis Jungil Kong, Jaehyeon Kim, Jaekyoung Bae In our paper, we p

Rishikesh (ऋषिकेश) 31 Dec 08, 2022
smc.covid is an R package related to the paper A sequential Monte Carlo approach to estimate a time varying reproduction number in infectious disease models: the COVID-19 case by Storvik et al

smc.covid smc.covid is an R package related to the paper A sequential Monte Carlo approach to estimate a time varying reproduction number in infectiou

0 Oct 15, 2021
scalingscattering

Scaling The Scattering Transform : Deep Hybrid Networks This repository contains the experiments found in the paper: https://arxiv.org/abs/1703.08961

Edouard Oyallon 78 Dec 21, 2022