The open-source and free to use Python package miseval was developed to establish a standardized medical image segmentation evaluation procedure

Overview

miseval: a metric library for Medical Image Segmentation EVALuation

shield_python shield_build shield_pypi_version shield_pypi_downloads shield_license

The open-source and free to use Python package miseval was developed to establish a standardized medical image segmentation evaluation procedure. We hope that our this will help improve evaluation quality, reproducibility, and comparability in future studies in the field of medical image segmentation.

Guideline on Evaluation Metrics for Medical Image Segmentation

  1. Use DSC as main metric for validation and performance interpretation.
  2. Use AHD for interpretation on point position sensitivity (contour) if needed.
  3. Avoid any interpretations based on high pixel accuracy scores.
  4. Provide next to DSC also IoU, Sensitivity, and Specificity for method comparability.
  5. Provide sample visualizations, comparing the annotated and predicted segmentation, for visual evaluation as well as to avoid statistical bias.
  6. Avoid cherry-picking high-scoring samples.
  7. Provide histograms or box plots showing the scoring distribution across the dataset.
  8. For multi-class problems, provide metric computations for each class individually.
  9. Avoid confirmation bias through macro-averaging classes which is pushing scores via background class inclusion.
  10. Provide access to evaluation scripts and results with journal data services or third-party services like GitHub and Zenodo for easier reproducibility.

Implemented Metrics

Metric Index in miseval Function in miseval
Dice Similarity Index "DSC", "Dice", "DiceSimilarityCoefficient" miseval.calc_DSC()
Intersection-Over-Union "IoU", "Jaccard", "IntersectionOverUnion" miseval.calc_IoU()
Sensitivity "SENS", "Sensitivity", "Recall", "TPR", "TruePositiveRate" miseval.calc_Sensitivity()
Specificity "SPEC", "Specificity", "TNR", "TrueNegativeRate" miseval.calc_Specificity()
Precision "PREC", "Precision" miseval.calc_Precision()
Accuracy "ACC", "Accuracy", "RI", "RandIndex" miseval.calc_Accuracy()
Balanced Accuracy "BACC", "BalancedAccuracy" miseval.calc_BalancedAccuracy()
Adjusted Rand Index "ARI", "AdjustedRandIndex" miseval.calc_AdjustedRandIndex()
AUC "AUC", "AUC_trapezoid" miseval.calc_AUC()
Cohen's Kappa "KAP", "Kappa", "CohensKappa" miseval.calc_Kappa()
Hausdorff Distance "HD", "HausdorffDistance" miseval.calc_SimpleHausdorffDistance()
Average Hausdorff Distance "AHD", "AverageHausdorffDistance" miseval.calc_AverageHausdorffDistance()
Volumetric Similarity "VS", "VolumetricSimilarity" miseval.calc_VolumetricSimilarity()
True Positive "TP", "TruePositive" miseval.calc_TruePositive()
False Positive "FP", "FalsePositive" miseval.calc_FalsePositive()
True Negative "TN", "TrueNegative" miseval.calc_TrueNegative()
False Negative "FN", "FalseNegative" miseval.calc_FalseNegative()

How to Use

Example

# load libraries
import numpy as np
from miseval import evaluate

# Get some ground truth / annotated segmentations
np.random.seed(1)
real_bi = np.random.randint(2, size=(64,64))  # binary (2 classes)
real_mc = np.random.randint(5, size=(64,64))  # multi-class (5 classes)
# Get some predicted segmentations
np.random.seed(2)
pred_bi = np.random.randint(2, size=(64,64))  # binary (2 classes)
pred_mc = np.random.randint(5, size=(64,64))  # multi-class (5 classes)

# Run binary evaluation
dice = evaluate(real_bi, pred_bi, metric="DSC")    
  # returns single np.float64 e.g. 0.75

# Run multi-class evaluation
dice_list = evaluate(real_mc, pred_mc, metric="DSC", multi_class=True,
                     n_classes=5)   
  # returns array of np.float64 e.g. [0.9, 0.2, 0.6, 0.0, 0.4]
  # for each class, one score

Core function: Evaluate()

Every metric in miseval can be called via our core function evaluate().

The miseval eavluate function can be run with different metrics as backbone.
You can pass the following options to the metric parameter:

  • String naming one of the metric labels, for example "DSC"
  • Directly passing a metric function, for example calc_DSC_Sets (from dice.py)
  • Passing a custom metric function

List of metrics : See miseval/__init__.py under section "Access Functions to Metric Functions"

The classes in a segmentation mask must be ongoing starting from 0 (integers from 0 to n_classes-1).

A segmentation mask is allowed to have either no channel axis or just 1 (e.g. 512x512x1), which contains the annotation.

Binary mode. n_classes (Integer): Number of classes. By default 2 -> Binary Output: score (Float) or scores (List of Float) The multi_class parameter defines the output of this function. If n_classes > 2, multi_class is automatically True. If multi_class == False & n_classes == 2, only a single score (float) is returned. If multi_class == True, multiple scores as a list are returned (for each class one score). """ def evaluate(truth, pred, metric, multi_class=False, n_classes=2)">
"""
Arguments:
    truth (NumPy Matrix):            Ground Truth segmentation mask.
    pred (NumPy Matrix):             Prediction segmentation mask.
    metric (String or Function):     Metric function. Either a function directly or encoded as String from miseval or a custom function.
    multi_class (Boolean):           Boolean parameter, if segmentation is a binary or multi-class problem. By default False -> Binary mode.
    n_classes (Integer):             Number of classes. By default 2 -> Binary

Output:
    score (Float) or scores (List of Float)

    The multi_class parameter defines the output of this function.
    If n_classes > 2, multi_class is automatically True.
    If multi_class == False & n_classes == 2, only a single score (float) is returned.
    If multi_class == True, multiple scores as a list are returned (for each class one score).
"""
def evaluate(truth, pred, metric, multi_class=False, n_classes=2)

Installation

  • Install miseval from PyPI (recommended):
pip install miseval
  • Alternatively: install miseval from the GitHub source:

First, clone miseval using git:

git clone https://github.com/frankkramer-lab/miseval

Then, go into the miseval folder and run the install command:

cd miseval
python setup.py install

Author

Dominik Müller
Email: [email protected]
IT-Infrastructure for Translational Medical Research
University Augsburg
Bavaria, Germany

How to cite / More information

Dominik Müller, Dennis Hartmann, Philip Meyer, Florian Auer, Iñaki Soto-Rey, Frank Kramer. (2022)
MISeval: a Metric Library for Medical Image Segmentation Evaluation.
arXiv e-print: https://arxiv.org/abs/2201.09395

@inproceedings{misevalMUELLER2022,
  title={MISeval: a Metric Library for Medical Image Segmentation Evaluation},
  author={Dominik Müller, Dennis Hartmann, Philip Meyer, Florian Auer, Iñaki Soto-Rey, Frank Kramer},
  year={2022}
  eprint={2201.09395},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}

Thank you for citing our work.

License

This project is licensed under the GNU GENERAL PUBLIC LICENSE Version 3.
See the LICENSE.md file for license rights and limitations.

PyTorch implementation of the REMIND method from our ECCV-2020 paper "REMIND Your Neural Network to Prevent Catastrophic Forgetting"

REMIND Your Neural Network to Prevent Catastrophic Forgetting This is a PyTorch implementation of the REMIND algorithm from our ECCV-2020 paper. An ar

Tyler Hayes 72 Nov 27, 2022
Open source code for Paper "A Co-Interactive Transformer for Joint Slot Filling and Intent Detection"

A Co-Interactive Transformer for Joint Slot Filling and Intent Detection This repository contains the PyTorch implementation of the paper: A Co-Intera

67 Dec 05, 2022
Implementation of the paper "Language-agnostic representation learning of source code from structure and context".

Code Transformer This is an official PyTorch implementation of the CodeTransformer model proposed in: D. Zügner, T. Kirschstein, M. Catasta, J. Leskov

Daniel Zügner 131 Dec 13, 2022
(3DV 2021 Oral) Filtering by Cluster Consistency for Large-Scale Multi-Image Matching

Scalable Cluster-Consistency Statistics for Robust Multi-Object Matching (3DV 2021 Oral Presentation) Filtering by Cluster Consistency (FCC) is a very

Yunpeng Shi 11 Sep 28, 2022
Sequential GCN for Active Learning

Sequential GCN for Active Learning Please cite if using the code: Link to paper. Requirements: python 3.6+ torch 1.0+ pip libraries: tqdm, sklearn, sc

45 Dec 26, 2022
Simple Pixelbot for Diablo 2 Resurrected written in python and opencv.

Simple Pixelbot for Diablo 2 Resurrected written in python and opencv. Obviously only use it in offline mode as it is against the TOS of Blizzard to use it in online mode!

468 Jan 03, 2023
MIM: MIM Installs OpenMMLab Packages

MIM provides a unified API for launching and installing OpenMMLab projects and their extensions, and managing the OpenMMLab model zoo.

OpenMMLab 254 Jan 04, 2023
ShinRL: A Library for Evaluating RL Algorithms from Theoretical and Practical Perspectives

Status: Under development (expect bug fixes and huge updates) ShinRL: A Library for Evaluating RL Algorithms from Theoretical and Practical Perspectiv

37 Dec 28, 2022
My personal Home Assistant configuration.

About This is my personal Home Assistant configuration. My guiding princile is to have full local control of all my devices. I intend everything to ru

Chris Turra 13 Jun 07, 2022
Source code for Zalo AI 2021 submission

zalo_ltr_2021 Source code for Zalo AI 2021 submission Solution: Pipeline We use the pipepline in the picture below: Our pipeline is combination of BM2

128 Dec 27, 2022
The 2nd place solution of 2021 google landmark retrieval on kaggle.

Leaderboard, taxonomy, and curated list of few-shot object detection papers.

229 Dec 13, 2022
Code for "Long-tailed Distribution Adaptation"

Long-tailed Distribution Adaptation (Accepted in ACM MM2021) This project is built upon BBN. Installation pip install -r requirements.txt Usage Traini

Zhiliang Peng 10 May 18, 2022
PyTorchVideo is a deeplearning library with a focus on video understanding work

PyTorchVideo is a deeplearning library with a focus on video understanding work. PytorchVideo provides resusable, modular and efficient components needed to accelerate the video understanding researc

Facebook Research 2.7k Jan 07, 2023
ImageBART: Bidirectional Context with Multinomial Diffusion for Autoregressive Image Synthesis

ImageBART NeurIPS 2021 Patrick Esser*, Robin Rombach*, Andreas Blattmann*, Björn Ommer * equal contribution arXiv | BibTeX | Poster Requirements A sui

CompVis Heidelberg 110 Jan 01, 2023
GraPE is a Rust/Python library for high-performance Graph Processing and Embedding.

GraPE GraPE (Graph Processing and Embedding) is a fast graph processing and embedding library, designed to scale with big graphs and to run on both of

AnacletoLab 194 Dec 29, 2022
A curated list of resources for Image and Video Deblurring

A curated list of resources for Image and Video Deblurring

Subeesh Vasu 1.7k Jan 01, 2023
This repository is an open-source implementation of the ICRA 2021 paper: Locus: LiDAR-based Place Recognition using Spatiotemporal Higher-Order Pooling.

Locus This repository is an open-source implementation of the ICRA 2021 paper: Locus: LiDAR-based Place Recognition using Spatiotemporal Higher-Order

Robotics and Autonomous Systems Group 96 Dec 15, 2022
Optimized code based on M2 for faster image captioning training

Transformer Captioning This repository contains the code for Transformer-based image captioning. Based on meshed-memory-transformer, we further optimi

lyricpoem 16 Dec 16, 2022
Outlier Exposure with Confidence Control for Out-of-Distribution Detection

OOD-detection-using-OECC This repository contains the essential code for the paper Outlier Exposure with Confidence Control for Out-of-Distribution De

Nazim Shaikh 64 Nov 02, 2022
Code & Experiments for "LILA: Language-Informed Latent Actions" to be presented at the Conference on Robot Learning (CoRL) 2021.

LILA LILA: Language-Informed Latent Actions Code and Experiments for Language-Informed Latent Actions (LILA), for using natural language to guide assi

Sidd Karamcheti 11 Nov 25, 2022