PyTorch implementation of ENet

Overview

PyTorch-ENet

PyTorch (v1.1.0) implementation of ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation, ported from the lua-torch implementation ENet-training created by the authors.

This implementation has been tested on the CamVid and Cityscapes datasets. Currently, a pre-trained version of the model trained in CamVid and Cityscapes is available here.

Dataset Classes 1 Input resolution Batch size Epochs Mean IoU (%) GPU memory (GiB) Training time (hours)2
CamVid 11 480x360 10 300 52.13 4.2 1
Cityscapes 19 1024x512 4 300 59.54 5.4 20

1 When referring to the number of classes, the void/unlabeled class is always excluded.
2 These are just for reference. Implementation, datasets, and hardware changes can lead to very different results. Reference hardware: Nvidia GTX 1070 and an AMD Ryzen 5 3600 3.6GHz. You can also train for 100 epochs or so and get similar mean IoU (± 2%).
3 Test set.
4 Validation set.

Installation

Local pip

  1. Python 3 and pip
  2. Set up a virtual environment (optional, but recommended)
  3. Install dependencies using pip: pip install -r requirements.txt

Docker image

  1. Build the image: docker build -t enet .
  2. Run: docker run -it --gpus all --ipc host enet

Usage

Run main.py, the main script file used for training and/or testing the model. The following options are supported:

python main.py [-h] [--mode {train,test,full}] [--resume]
               [--batch-size BATCH_SIZE] [--epochs EPOCHS]
               [--learning-rate LEARNING_RATE] [--lr-decay LR_DECAY]
               [--lr-decay-epochs LR_DECAY_EPOCHS]
               [--weight-decay WEIGHT_DECAY] [--dataset {camvid,cityscapes}]
               [--dataset-dir DATASET_DIR] [--height HEIGHT] [--width WIDTH]
               [--weighing {enet,mfb,none}] [--with-unlabeled]
               [--workers WORKERS] [--print-step] [--imshow-batch]
               [--device DEVICE] [--name NAME] [--save-dir SAVE_DIR]

For help on the optional arguments run: python main.py -h

Examples: Training

python main.py -m train --save-dir save/folder/ --name model_name --dataset name --dataset-dir path/root_directory/

Examples: Resuming training

python main.py -m train --resume True --save-dir save/folder/ --name model_name --dataset name --dataset-dir path/root_directory/

Examples: Testing

python main.py -m test --save-dir save/folder/ --name model_name --dataset name --dataset-dir path/root_directory/

Project structure

Folders

  • data: Contains instructions on how to download the datasets and the code that handles data loading.
  • metric: Evaluation-related metrics.
  • models: ENet model definition.
  • save: By default, main.py will save models in this folder. The pre-trained models can also be found here.

Files

  • args.py: Contains all command-line options.
  • main.py: Main script file used for training and/or testing the model.
  • test.py: Defines the Test class which is responsible for testing the model.
  • train.py: Defines the Train class which is responsible for training the model.
  • transforms.py: Defines image transformations to convert an RGB image encoding classes to a torch.LongTensor and vice versa.
Owner
David Silva
:eyes:🚗
David Silva
ICLR 2021, Fair Mixup: Fairness via Interpolation

Fair Mixup: Fairness via Interpolation Training classifiers under fairness constraints such as group fairness, regularizes the disparities of predicti

Ching-Yao Chuang 49 Nov 22, 2022
Ludwig is a toolbox that allows to train and evaluate deep learning models without the need to write code.

Translated in 🇰🇷 Korean/ Ludwig is a toolbox that allows users to train and test deep learning models without the need to write code. It is built on

Ludwig 8.7k Jan 05, 2023
Real-Time Semantic Segmentation in Mobile device

Real-Time Semantic Segmentation in Mobile device This project is an example project of semantic segmentation for mobile real-time app. The architectur

708 Jan 01, 2023
YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone

YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone In our recent paper we propose the YourTTS model. YourTTS bri

Edresson Casanova 390 Dec 29, 2022
Repository features UNet inspired architecture used for segmenting lungs on chest X-Ray images

Lung Segmentation (2D) Repository features UNet inspired architecture used for segmenting lungs on chest X-Ray images. Demo See the application of the

163 Sep 21, 2022
Lava-DL, but with PyTorch-Lightning flavour

Deep learning project seed Use this seed to start new deep learning / ML projects. Built in setup.py Built in requirements Examples with MNIST Badges

Sami BARCHID 4 Oct 31, 2022
Neural style in TensorFlow! 🎨

neural-style An implementation of neural style in TensorFlow. This implementation is a lot simpler than a lot of the other ones out there, thanks to T

Anish Athalye 5.5k Dec 29, 2022
Ensemble Visual-Inertial Odometry (EnVIO)

Ensemble Visual-Inertial Odometry (EnVIO) Authors : Jae Hyung Jung, Yeongkwon Choe, and Chan Gook Park 1. Overview This is a ROS package of Ensemble V

Jae Hyung Jung 95 Jan 03, 2023
HHP-Net: A light Heteroscedastic neural network for Head Pose estimation with uncertainty

HHP-Net: A light Heteroscedastic neural network for Head Pose estimation with uncertainty Giorgio Cantarini, Francesca Odone, Nicoletta Noceti, Federi

18 Aug 02, 2022
Traffic4D: Single View Reconstruction of Repetitious Activity Using Longitudinal Self-Supervision

Traffic4D: Single View Reconstruction of Repetitious Activity Using Longitudinal Self-Supervision Project | PDF | Poster Fangyu Li, N. Dinesh Reddy, X

25 Dec 21, 2022
The ARCA23K baseline system

ARCA23K Baseline System This is the source code for the baseline system associated with the ARCA23K dataset. Details about ARCA23K and the baseline sy

4 Jul 02, 2022
Lightweight Python library for adding real-time object tracking to any detector.

Norfair is a customizable lightweight Python library for real-time 2D object tracking. Using Norfair, you can add tracking capabilities to any detecto

Tryolabs 1.7k Jan 05, 2023
4D Human Body Capture from Egocentric Video via 3D Scene Grounding

4D Human Body Capture from Egocentric Video via 3D Scene Grounding [Project] [Paper] Installation: Our method requires the same dependencies as SMPLif

Miao Liu 37 Nov 08, 2022
[CVPR 2021] Region-aware Adaptive Instance Normalization for Image Harmonization

RainNet — Official Pytorch Implementation Region-aware Adaptive Instance Normalization for Image Harmonization Jun Ling, Han Xue, Li Song*, Rong Xie,

130 Dec 11, 2022
This repository contains a PyTorch implementation of the paper Learning to Assimilate in Chaotic Dynamical Systems.

Amortized Assimilation This repository contains a PyTorch implementation of the paper Learning to Assimilate in Chaotic Dynamical Systems. Abstract: T

4 Aug 16, 2022
Unsupervised Representation Learning by Invariance Propagation

Unsupervised Learning by Invariance Propagation This repository is the official implementation of Unsupervised Learning by Invariance Propagation. Pre

FengWang 15 Jul 06, 2022
Auto-Lama combines object detection and image inpainting to automate object removals

Auto-Lama Auto-Lama combines object detection and image inpainting to automate object removals. It is build on top of DE:TR from Facebook Research and

44 Dec 09, 2022
Code for the paper "A Study of Face Obfuscation in ImageNet"

A Study of Face Obfuscation in ImageNet Code for the paper: A Study of Face Obfuscation in ImageNet Kaiyu Yang, Jacqueline Yau, Li Fei-Fei, Jia Deng,

35 Oct 04, 2022
Image transformations designed for Scene Text Recognition (STR) data augmentation. Published at ICCV 2021 Workshop on Interactive Labeling and Data Augmentation for Vision.

Data Augmentation for Scene Text Recognition (ICCV 2021 Workshop) (Pronounced as "strog") Paper Arxiv Why it matters? Scene Text Recognition (STR) req

Rowel Atienza 152 Dec 28, 2022
Tensorflow implementation of "Learning Deep Features for Discriminative Localization"

Weakly_detector Tensorflow implementation of "Learning Deep Features for Discriminative Localization" B. Zhou, A. Khosla, A. Lapedriza, A. Oliva, and

Taeksoo Kim 363 Jun 29, 2022