Image Segmentation with U-Net Algorithm on Carvana Dataset using AWS Sagemaker

Overview

Image Segmentation with U-Net Algorithm on Carvana Dataset using AWS Sagemaker

This is a full project of image segmentation using the model built with U-Net Algorithm on Carvana competition Dataset from Kaggle using Sagemaker as Udacity's ML Nanodegree Capstone Project.

Image Segmentation with U-Net Algorithm

Use AWS Sagemaker to train the model built with U-Net algorithm/architecture that can perform image segmentation on Carvana Dataset from Kaggle Competition.

Project Set Up and Installation

Enter AWS through the gateway and create a Sagemaker notebook instance of your choice, ml.t2.medium is a sweet spot for this project as we will not use the GPU in the notebook and will use the Sagemaker Container to train the model. Wait for the instance to launch and then create a jupyter notebook with conda_pytorch_latest_p36 kernel, this comes preinstalled with the needed modules related to pytorch we will use along the project. Set up your sagemaker roles and regions.

Dataset

We use the Carvana Dataset from Kaggle Competition to use as data for the model training job. To get the Dataset. Register or Login to your Kaggle account, create new api in the user setting and get the api key and put it in the root of your sagemaker environment root location. After that !kaggle competitions download carvana-image-masking-challenge -f train.zip and !kaggle competitions download carvana-image-masking-challenge -f train_masks.zip will download the necessary files to your notebook environment. We will then unzip the data, upload it to S3 bucket with !aws s3 sync command.

Script Files used

  1. hpo.py for hyperparameter tuning jobs where we train the model for multiple time with different hyperparameters and search for the best combination based on loss metrics.
  2. training.py for the final training of the model with the best parameters getting from the previous tuning jobs, and put debug and profiler hooks for debugging purpose and get the tensors emits during training.
  3. inference.py for using the trained model as inference and pre-processing and serializing the data before it passes to the model for segmentaion. Now this can be used locally and user friendly
  4. Note at this time, the sagemaker endpoint has an error and can't make prediction, so I have managed to create a new instance in sagemaker(ml.g4dn.xlarge to utilize the GPU) and used endpoint_local.ipynb notebook to get the inference result.
  5. requirements.txt is use to install the dependencies in the training container, these include Albumentations, higher version of torch dependencies to utilize in the training script.

Hyperparameter Tuning

I used U-Net Algorithm to create an image segmentation model. The hyperparameter searchspaces are learning-rate, number of epochs and batchsize. Note The batch size over 128(inclusive) can't be used as the GPU memory may run out during the training. Deploy a hyperparameter tuning job on sagemaker and wait for the combination of hyperparameters turn out with best metric.

hyperparameter tuning job

We pick the hyperparameters from the best training job to train the final model.

best job's hyperparameters

Debugging and Profiling

The Debugger Hook is set to record the Loss Criterion of the process in both training and validation/testing. The Plot of the Dice Coefficient is shown below.

Dice Coefficient

we can see that the validation plot is high and this means that our model had entered a state of overtraining. We can reduce this by adding dropout or L1 L2 regularization, or added more different training data, or can early stop the model before it overfit. by adding the metric definition, I could also managed to get the average accuracy and loss dat during the validation phase in AWS Cloudwatch(a powerful too to monitor your metrics of any kind). Metrics

Results

Result is pretty good, as I was using ml.g4dn.xlarge to utilize the GPU of the instance, both the hpo jobs and training job did't take too much time.

Inferenceing your data

Sagemaker Endpoint got an 500 status code error so I tried using another sagemaker instance with GPU(ml.g4dn.xlarge) and running the endpoint_local.ipynb will get you the desired output of your choice. Result

Thank You So Much For Your Time! Please don't hesitate to contribute.

Ref: Github repo of neirinzaralwin

Owner
Htin Aung Lu
I am a Machine Learning enginner. I like to work on various machine learning projects. I have more experience on @AWS @Sagemaker platform than other.
Htin Aung Lu
A Pytree Module system for Deep Learning in JAX

Treex A Pytree-based Module system for Deep Learning in JAX Intuitive: Modules are simple Python objects that respect Object-Oriented semantics and sh

Cristian Garcia 216 Dec 20, 2022
[ICML 2021] A fast algorithm for fitting robust decision trees.

GROOT: Growing Robust Trees Growing Robust Trees (GROOT) is an algorithm that fits binary classification decision trees such that they are robust agai

Cyber Analytics Lab 17 Nov 21, 2022
E2e music remastering system - End-to-end Music Remastering System Using Self-supervised and Adversarial Training

End-to-end Music Remastering System This repository includes source code and pre

Junghyun (Tony) Koo 37 Dec 15, 2022
[CVPRW 2022] Attentions Help CNNs See Better: Attention-based Hybrid Image Quality Assessment Network

Attention Helps CNN See Better: Hybrid Image Quality Assessment Network [CVPRW 2022] Code for Hybrid Image Quality Assessment Network [paper] [code] T

IIGROUP 49 Dec 11, 2022
Implementation of the paper titled "Using Sampling to Estimate and Improve Performance of Automated Scoring Systems with Guarantees"

Using Sampling to Estimate and Improve Performance of Automated Scoring Systems with Guarantees Implementation of the paper titled "Using Sampling to

MIDAS, IIIT Delhi 2 Aug 29, 2022
Re-implementation of 'Grokking: Generalization beyond overfitting on small algorithmic datasets'

Re-implementation of the paper 'Grokking: Generalization beyond overfitting on small algorithmic datasets' Paper Original paper can be found here Data

Tom Lieberum 38 Aug 09, 2022
Neural Reprojection Error: Merging Feature Learning and Camera Pose Estimation

Neural Reprojection Error: Merging Feature Learning and Camera Pose Estimation This is the official repository for our paper Neural Reprojection Error

Hugo Germain 78 Dec 01, 2022
Collection of generative models in Pytorch version.

pytorch-generative-model-collections Original : [Tensorflow version] Pytorch implementation of various GANs. This repository was re-implemented with r

Hyeonwoo Kang 2.4k Dec 31, 2022
Research - dataset and code for 2016 paper Learning a Driving Simulator

the people's comma the paper Learning a Driving Simulator the comma.ai driving dataset 7 and a quarter hours of largely highway driving. Enough to tra

comma.ai 4.1k Jan 02, 2023
[ICLR 2021, Spotlight] Large Scale Image Completion via Co-Modulated Generative Adversarial Networks

Large Scale Image Completion via Co-Modulated Generative Adversarial Networks, ICLR 2021 (Spotlight) Demo | Paper [NEW!] Time to play with our interac

Shengyu Zhao 373 Jan 02, 2023
Utilities to bridge Canvas-generated course rosters with GitLab's API.

gitlab-canvas-utils A collection of scripts originally written for CSE 13S. Oversees everything from GitLab course group creation, student repository

Eugene Chou 5 Jun 08, 2022
UNION: An Unreferenced Metric for Evaluating Open-ended Story Generation

UNION Automatic Evaluation Metric described in the paper UNION: An UNreferenced MetrIc for Evaluating Open-eNded Story Generation (EMNLP 2020). Please

50 Dec 30, 2022
This project uses Template Matching technique for object detecting by detection of template image over base image.

Object Detection Project Using OpenCV This project uses Template Matching technique for object detecting by detection the template image over base ima

Pratham Bhatnagar 7 May 29, 2022
Phonetic PosteriorGram (PPG)-Based Voice Conversion (VC)

ppg-vc Phonetic PosteriorGram (PPG)-Based Voice Conversion (VC) This repo implements different kinds of PPG-based VC models. Pretrained models. More m

Liu Songxiang 227 Dec 28, 2022
Experiments for distributed optimization algorithms

Network-Distributed Algorithm Experiments -- This repository contains a set of optimization algorithms and objective functions, and all code needed to

Boyue Li 40 Dec 04, 2022
This Jupyter notebook shows one way to implement a simple first-order low-pass filter on sampled data in discrete time.

How to Implement a First-Order Low-Pass Filter in Discrete Time We often teach or learn about filters in continuous time, but then need to implement t

Joshua Marshall 4 Aug 24, 2022
An implementation for the loss function proposed in Decoupled Contrastive Loss paper.

Decoupled-Contrastive-Learning This repository is an implementation for the loss function proposed in Decoupled Contrastive Loss paper. Requirements P

Ramin Nakhli 71 Dec 04, 2022
LaneDet is an open source lane detection toolbox based on PyTorch that aims to pull together a wide variety of state-of-the-art lane detection models

LaneDet is an open source lane detection toolbox based on PyTorch that aims to pull together a wide variety of state-of-the-art lane detection models. Developers can reproduce these SOTA methods and

TuZheng 405 Jan 04, 2023
Emotional conditioned music generation using transformer-based model.

This is the official repository of EMOPIA: A Multi-Modal Pop Piano Dataset For Emotion Recognition and Emotion-based Music Generation. The paper has b

hung anna 96 Nov 09, 2022