CSAW-M: An Ordinal Classification Dataset for Benchmarking Mammographic Masking of Cancer

Last update: Oct 11, 2022

Overview

CSAW-M

This repository contains code for CSAW-M: An Ordinal Classification Dataset for Benchmarking Mammographic Masking of Cancer. Source code for training models to estimate the mammographic masking level, along with the checkpoints, is available here.
The repo containing the annotation tool developed to annotate CSAW-M can be found here. The dataset can be found here.

Training and evaluation

To train a model, refer to scripts/train.sh, which contains prepared commands and arguments for training. To encourage reproducibility, we also provide the cross-validation splits used in the project (see the dataset website to access them). scripts/cross_val.sh provides example commands for running cross-validation.
To evaluate a trained model, refer to scripts/eval.sh, which contains example commands and arguments for evaluation.
Checkpoints can be downloaded from here.

Important arguments defined in the main module

--train and --evaluate: select training or evaluation of a model, respectively.
--model_name: specifies the model name, which is then used for saving and loading checkpoints.
--loss_type: defines the loss used to train the model. It is either one_hot, which trains the model in a multi-class setup with the usual cross-entropy loss, or multi_hot, which trains the model in a multi-label setup using multi-hot encoding (defined for ordinal labels). Please refer to the paper for more details.
--img_size: specifies the image size to train the model with.
Almost all the params in params.yml can be overridden using the corresponding arguments. Please refer to main.py to see the corresponding args.
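To make the difference between the two --loss_type options concrete, here is a minimal sketch that encodes one ordinal label both ways. It assumes levels numbered from 1 and a cumulative multi-hot scheme in which level k sets the first k-1 positions to 1. The function names, the number of levels passed in, and the decoding threshold are all illustrative assumptions and are not taken from the repository; the paper defines the encoding actually used.

```python
from typing import List, Sequence


def _check_level(level: int, num_levels: int) -> None:
    """Reject labels outside the assumed 1..num_levels range."""
    if not 1 <= level <= num_levels:
        raise ValueError(f"level must be in 1..{num_levels}, got {level}")


def one_hot(level: int, num_levels: int) -> List[int]:
    """Multi-class target: a single 1 at the position of the level."""
    _check_level(level, num_levels)
    return [1 if i == level - 1 else 0 for i in range(num_levels)]


def multi_hot(level: int, num_levels: int) -> List[int]:
    """Cumulative ordinal target (assumed scheme).

    Position i (1-based) answers "is the level greater than i?", so the
    target has num_levels - 1 entries and neighbouring levels differ in
    exactly one position.
    """
    _check_level(level, num_levels)
    return [1 if level > i else 0 for i in range(1, num_levels)]


def decode_multi_hot(probs: Sequence[float], threshold: float = 0.5) -> int:
    """Map per-position probabilities back to a level.

    Counts the leading positions above the threshold and adds 1.
    """
    level = 1
    for p in probs:
        if p <= threshold:
            break
        level += 1
    return level


if __name__ == "__main__":
    print(one_hot(3, 8))    # [0, 0, 1, 0, 0, 0, 0, 0]
    print(multi_hot(3, 8))  # [1, 1, 0, 0, 0, 0, 0]
```

Under this assumed scheme, a one_hot target would be paired with a cross-entropy loss over all levels, while a multi_hot target would be paired with an independent binary loss per position, so that predictions far from the true level are penalised in more positions than predictions close to it.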

Other notes

It is assumed that main.py is called from inside the src directory.
Note that at the beginning of the main script, after the arguments are read and checked, the params defined in params.yml are read and updated according to the args, after which set_globals (defined in main.py) is called. This sets the global params needed to run the program (GPU device, loggers, etc.). Every new high-level module (like main.py) that accepts running arguments and calls other modules should call this function, as the other modules assume that these global params are set.
By default, there are no suggested validation csv files, but in cross-validation (using --cv) the train/validation splits for each fold are extracted from the cv_files paths specified in params.yml.
In src/experiments.py you can find the call to the function that preprocesses the raw images. For some images we have defined a special set of parameters to ensure that text is successfully removed from the images during preprocessing. We have documented every step of the preprocessing function to make it more understandable. Feel free to modify it if you want to produce your own preprocessed images!
The Dockerfile and packages used in this project can be found in the docker folder.
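The way command-line arguments take precedence over params.yml can be sketched as follows. This is a hedged illustration, not the code in main.py: build_parser and override_params are hypothetical names, the params dictionary stands in for the contents of params.yml with placeholder values, and passing --img_size as two integers is an assumption about its format.

```python
import argparse
from typing import Any, Dict


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser exposing the arguments described in this README."""
    parser = argparse.ArgumentParser(description="Sketch of a main-module CLI")
    parser.add_argument("--train", action="store_true")
    parser.add_argument("--evaluate", action="store_true")
    parser.add_argument("--model_name", type=str)
    parser.add_argument("--loss_type", choices=["one_hot", "multi_hot"])
    # Assumption: the image size is given as two integers (height, width).
    parser.add_argument("--img_size", type=int, nargs=2, metavar=("H", "W"))
    return parser


def override_params(params: Dict[str, Any], args: argparse.Namespace) -> Dict[str, Any]:
    """Return a copy of params in which explicitly given arguments win.

    Only keys that already exist in params are overridden, and arguments
    left unset (None) keep the value from the params file.
    """
    updated = dict(params)
    for key, value in vars(args).items():
        if key in updated and value is not None:
            updated[key] = value
    return updated


if __name__ == "__main__":
    # Placeholder values standing in for what would be read from params.yml.
    params = {"model_name": "baseline", "loss_type": "one_hot", "img_size": [256, 256]}
    args = build_parser().parse_args(
        ["--train", "--model_name", "my_model", "--loss_type", "multi_hot"]
    )
    print(override_params(params, args))
```

In the repository, this update step is followed by the call to set_globals; that step is left out of the sketch because its internals are not described here.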

Citation

If you use this work, please cite our paper:

@article{sorkhei2021csaw,
  title={CSAW-M: An Ordinal Classification Dataset for Benchmarking Mammographic Masking of Cancer},
  author={Sorkhei, Moein and Liu, Yue and Azizpour, Hossein and Azavedo, Edward and Dembrower, Karin and Ntoula, Dimitra and Zouzos, Athanasios and Strand, Fredrik and Smith, Kevin},
  year={2021}
}

Questions or suggestions?

Please feel free to contact us in case you have any questions or suggestions!

Owner

Yue Liu
