U-Net Implementation: Convolutional Networks for Biomedical Image Segmentation" using the Carvana Image Masking Dataset in PyTorch

Last update: Jan 06, 2022

Overview

U-Net Implementation

By Christopher Ley

This is my interpretation and implementation of the famous paper "U-Net: Convolutional Networks for Biomedical Image Segmentation" using the Carvana Image Masking Dataset in PyTorch

This data set is a Binary Segmentation exercise of ~400 test images of cars from various angles such as those shown here:

Initial implementation for Binary Segmentation

The implementation performs almost as the winners of the competition (Dice: 0.9926 vs 0.99733) after only 5 epoch and we would expect the results to be as good as the winners using this architecture with more training and a little tweaking of the training hyper-parameters.

Here are the scores for training over 5 epochs by running:

(DeepLearning): python3 train.py

Training Results

0%|          | 0/540 [00:00<?, ?it/s]Accuracy: 103298971/467927040 = 22.08%
Dice score: 0.36127230525016785
100%|██████████| 540/540 [05:59<00:00,  1.50it/s, loss=0.0949]
==> Saving Checkpoint to: ./checkpoints/checkpoint_2022-01-06_12:39_epoch_0.pth.tar
Accuracy: 460498379/467927040 = 98.41%
Dice score: 0.9652246236801147
100%|██████████| 540/540 [05:59<00:00,  1.50it/s, loss=0.0469]
==> Saving Checkpoint to: ./checkpoints/checkpoint_2022-01-06_12:48_epoch_1.pth.tar
Accuracy: 461809183/467927040 = 98.69%
Dice score: 0.9711439609527588
100%|██████████| 540/540 [05:56<00:00,  1.51it/s, loss=0.0283]
==> Saving Checkpoint to: ./checkpoints/checkpoint_2022-01-06_12:56_epoch_2.pth.tar
Accuracy: 465675737/467927040 = 99.52%
Dice score: 0.9891990423202515
100%|██████████| 540/540 [06:00<00:00,  1.50it/s, loss=0.0194]
==> Saving Checkpoint to: ./checkpoints/checkpoint_2022-01-06_13:04_epoch_3.pth.tar
Accuracy: 465397979/467927040 = 99.46%
Dice score: 0.9878408908843994
100%|██████████| 540/540 [06:00<00:00,  1.50it/s, loss=0.0142]
==> Saving Checkpoint to: ./checkpoints/checkpoint_2022-01-06_13:12_epoch_4.pth.tar
Accuracy: 466399501/467927040 = 99.67%
Dice score: 0.9926225543022156

And an example of the output vs the ground truth of the validation set, I removed whole makes for the validation set, all 16 angles, the network had never seen this particular make from any angle.

Ground Truth

Prediction

Although limited in scope (binary segmentation for only cars), this architecture performs well with multiclass segmentation, I extended this to apply segmentation to the NYUv2 which is a multiclass objective, with little modification to the above code.

I will clean this up and upload the results and modifications soon!

U-Net Implementation: Convolutional Networks for Biomedical Image Segmentation" using the Carvana Image Masking Dataset in PyTorch

Related tags

Overview

U-Net Implementation

By Christopher Ley

Initial implementation for Binary Segmentation

Training Results

Ground Truth

Prediction

Owner

Christopher Ley

Code for database and frontend of webpage for Neural Fields in Visual Computing and Beyond.

Denoising Normalizing Flow

Simple Tensorflow implementation of "Adaptive Convolutions for Structure-Aware Style Transfer" (CVPR 2021)

Asynchronous Advantage Actor-Critic in PyTorch

The implementation of "Optimizing Shoulder to Shoulder: A Coordinated Sub-Band Fusion Model for Real-Time Full-Band Speech Enhancement"

Source Code and data for my paper titled Linguistic Knowledge in Data Augmentation for Natural Language Processing: An Example on Chinese Question Matching

The Wearables Development Toolkit - a development environment for activity recognition applications with sensor signals

PyTorch Live is an easy to use library of tools for creating on-device ML demos on Android and iOS.

make ASCII Art by Deep Learning

Plugin for Gaffer providing direct acess to asset from PolyHaven.com. Only HDRIs at the moment, Cycles and Arnold supported

This repository collects 100 papers related to negative sampling methods.

Official PyTorch code of Holistic 3D Scene Understanding from a Single Image with Implicit Representation (CVPR 2021)

Official implementation of the ICLR 2021 paper

Python utility to generate filesystem content for Obsidian.

Improving Deep Network Debuggability via Sparse Decision Layers

unet-family: Ultimate version

Full body anonymization - Realistic Full-Body Anonymization with Surface-Guided GANs

Yoloxkeypointsegment - An anchor-free version of YOLO, with a simpler design but better performance

Emulation and Feedback Fuzzing of Firmware with Memory Sanitization

CVAT is free, online, interactive video and image annotation tool for computer vision