TensorFlow implementation of ENet, trained on the Cityscapes dataset.

Overview

segmentation

TensorFlow implementation of ENet (https://arxiv.org/pdf/1606.02147.pdf) based on the official Torch implementation (https://github.com/e-lab/ENet-training) and the Keras implementation by PavlosMelissinos (https://github.com/PavlosMelissinos/enet-keras), trained on the Cityscapes dataset (https://www.cityscapes-dataset.com/).

  • YouTube video of results (https://youtu.be/HbPhvct5kvs).

  • The results in the video can obviously be improved, but because of limited computing resources (personally funded Azure VM) I did not perform any further hyperparameter tuning.


You might get the error "No gradient defined for operation 'MaxPoolWithArgmax_1' (op type: MaxPoolWithArgmax)". To fix this, I had to add the following code to the file /usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/nn_grad.py:

@ops.RegisterGradient("MaxPoolWithArgmax")
def _MaxPoolGradWithArgmax(op, grad, unused_argmax_grad):
  # Pass the gradient back only to the inputs selected by the argmax output.
  return gen_nn_ops._max_pool_grad_with_argmax(
      op.inputs[0], grad, op.outputs[1], op.get_attr("ksize"),
      op.get_attr("strides"), padding=op.get_attr("padding"))
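
To make it clearer what the registered gradient computes, here is a minimal pure-Python sketch of the same idea on a 1-D list. It is an illustration only, not TensorFlow's kernel: the function names, the 1-D layout and the example values are assumptions made for this sketch. The forward pass records which input position won each pooling window, and the backward pass sends each upstream gradient value to exactly that position.

```python
def max_pool_with_argmax_1d(values, ksize, stride):
    """Max-pool a 1-D list and also return the input index of each maximum."""
    pooled, argmax = [], []
    for start in range(0, len(values) - ksize + 1, stride):
        window = values[start:start + ksize]
        # Offset of the largest element inside the window (first one on ties).
        best = max(range(ksize), key=lambda i: window[i])
        pooled.append(window[best])
        argmax.append(start + best)
    return pooled, argmax


def max_pool_grad_with_argmax_1d(input_len, grad, argmax):
    """Scatter each upstream gradient value to the input position that won the max."""
    input_grad = [0.0] * input_len
    for g, idx in zip(grad, argmax):
        input_grad[idx] += g
    return input_grad


# Hypothetical example values, chosen only to show the routing.
values = [1.0, 3.0, 2.0, 5.0, 4.0, 0.0]
pooled, argmax = max_pool_with_argmax_1d(values, ksize=2, stride=2)
input_grad = max_pool_grad_with_argmax_1d(len(values), [0.1, 0.2, 0.3], argmax)
print(pooled, argmax, input_grad)
```

All inputs that did not win a window receive a zero gradient, which is why the backward op needs the argmax output of the forward op.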

Documentation:

preprocess_data.py:

  • ASSUMES: that all Cityscapes training (validation) image directories have been placed in data_dir/cityscapes/leftImg8bit/train (data_dir/cityscapes/leftImg8bit/val) and that all corresponding ground truth directories have been placed in data_dir/cityscapes/gtFine/train (data_dir/cityscapes/gtFine/val).
  • DOES: script for performing all necessary preprocessing of images and labels.
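
Before running preprocess_data.py it can help to verify the directory layout it assumes. The following is a minimal sketch, not part of the repo: the helper name missing_cityscapes_dirs is hypothetical, and data_dir stands for whatever root directory you use.

```python
import os


def missing_cityscapes_dirs(data_dir):
    """Return the expected Cityscapes directories that do not exist under data_dir."""
    expected = [
        os.path.join(data_dir, "cityscapes", "leftImg8bit", "train"),
        os.path.join(data_dir, "cityscapes", "leftImg8bit", "val"),
        os.path.join(data_dir, "cityscapes", "gtFine", "train"),
        os.path.join(data_dir, "cityscapes", "gtFine", "val"),
    ]
    return [path for path in expected if not os.path.isdir(path)]


if __name__ == "__main__":
    # "data" is a placeholder for your own data_dir.
    for path in missing_cityscapes_dirs("data"):
        print("missing: " + path)
```

An empty result means all four directories named in the ASSUMES item are present.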

model.py:

  • ASSUMES: that preprocess_data.py has already been run.
  • DOES: contains the ENet_model class.

utilities.py:

  • ASSUMES: -
  • DOES: contains a number of functions used in different parts of the project.

train.py:

  • ASSUMES: that preprocess_data.py has already been run.
  • DOES: script for training the model.

run_on_sequence.py:

  • ASSUMES: that preprocess_data.py has already been run.
  • DOES: runs a model checkpoint (set in line 56) on all frames in a Cityscapes demo sequence directory (set in line 30) and creates a video of the result.

Training details:

  • In the paper, the authors suggest first pretraining the encoder to categorize downsampled regions of the input images. I did, however, train the entire network from scratch.

  • Batch size: 4.

  • For all other hyperparameters I used the same values as in the paper.

  • Training loss: [plot of the training loss]

  • Validation loss: [plot of the validation loss]

  • The results in the video above were obtained with the model at epoch 23, for which a checkpoint is included in segmentation/training_logs/best_model in the repo.
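
One common way to pick a checkpoint such as the one in best_model is to take the epoch with the lowest validation loss. The sketch below assumes that criterion, which is not stated above. The function name best_epoch is hypothetical and the loss values are made up for illustration; they are not results from this project.

```python
def best_epoch(val_losses):
    """Return the 1-based epoch with the lowest validation loss (first one on ties)."""
    if not val_losses:
        raise ValueError("val_losses must not be empty")
    best_index = min(range(len(val_losses)), key=lambda i: val_losses[i])
    return best_index + 1


# Hypothetical per-epoch validation losses, made up for illustration only.
example_losses = [0.92, 0.61, 0.48, 0.55]
print(best_epoch(example_losses))  # prints 3
```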


Training on Microsoft Azure:

To train the model, I used an NC6 virtual machine on Microsoft Azure. Below I have listed what I needed to do in order to get started, and some things I found useful. For reference, my username was 'fregu856'. The following script, start_docker_image.sh, starts the TensorFlow GPU Docker image:

#!/bin/bash

# DEFAULT VALUES
GPUIDS="0"
NAME="fregu856_GPU"


NV_GPU="$GPUIDS" nvidia-docker run -it --rm \
        -p 5584:5584 \
        --name "$NAME""$GPUIDS" \
        -v /home/fregu856:/root/ \
        tensorflow/tensorflow:latest-gpu bash
  • /root/ will now be mapped to /home/fregu856 (i.e., $ cd -- takes you to the regular home folder).

  • To start the image:

    • $ sudo sh start_docker_image.sh
  • To commit changes to the image:

    • Open a new terminal window.
    • $ sudo docker commit fregu856_GPU0 tensorflow/tensorflow:latest-gpu
  • To stop the image when it’s running:

    • $ sudo docker stop fregu856_GPU0
  • To exit the image without killing running code:

    • Ctrl-P, then Ctrl-Q (hold Ctrl and press P, then Q)
  • To get back into a running image:

    • $ sudo docker attach fregu856_GPU0
  • To open more than one terminal window at the same time:

    • $ sudo docker exec -it fregu856_GPU0 bash
  • To install the needed software inside the docker image:

    • $ apt-get update
    • $ apt-get install nano
    • $ apt-get install sudo
    • $ apt-get install wget
    • $ sudo apt-get install libopencv-dev python-opencv
    • Commit changes to the image (otherwise, the installed packages will be removed at exit!)
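
The commands above refer to the container as fregu856_GPU0 because the start script joins NAME and GPUIDS to form the container name. The sketch below rebuilds the same command line in Python to make that explicit. It only constructs the arguments and does not run Docker; the function name build_docker_command and its parameters are hypothetical, while the default values are copied from the script.

```python
def build_docker_command(gpu_ids="0", name="fregu856_GPU",
                         home="/home/fregu856",
                         image="tensorflow/tensorflow:latest-gpu",
                         port=5584):
    """Return (env, argv) mirroring the start script; nothing is executed."""
    # NV_GPU selects which GPUs nvidia-docker exposes to the container.
    env = {"NV_GPU": gpu_ids}
    argv = [
        "nvidia-docker", "run", "-it", "--rm",
        "-p", "{0}:{0}".format(port),
        # Container name is NAME followed by GPUIDS, e.g. fregu856_GPU0.
        "--name", name + gpu_ids,
        # Map the host home folder to /root/ inside the container.
        "-v", "{}:/root/".format(home),
        image, "bash",
    ]
    return env, argv


env, argv = build_docker_command()
print(env)
print(" ".join(argv))
```

With a different GPUIDS value the container name changes accordingly, so the docker commit, stop, attach and exec commands need the matching name.
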
Owner
Fredrik Gustafsson
PhD student whose research focuses on probabilistic deep learning for automotive computer vision applications.