Image Segmentation and Object Detection in Pytorch

Overview

Image Segmentation and Object Detection in Pytorch

Pytorch-Segmentation-Detection is a library for image segmentation and object detection with reported results achieved on common image segmentation/object detection datasets, pretrained models and scripts to reproduce them.

Segmentation

PASCAL VOC 2012

Implemented models were tested on Restricted PASCAL VOC 2012 Validation dataset (RV-VOC12) or Full PASCAL VOC 2012 Validation dataset (VOC-2012) and trained on the PASCAL VOC 2012 Training data and additional Berkeley segmentation data for PASCAL VOC 12.

You can find all the scripts that were used for training and evaluation here.

This code has been used to train networks with this performance:

Model Test data Mean IOU Mean pix. accuracy Pixel accuracy Inference time (512x512 px. image) Model Download Link Related paper
Resnet-18-8s RV-VOC12 59.0 in prog. in prog. 28 ms. Dropbox DeepLab
Resnet-34-8s RV-VOC12 68.0 in prog. in prog. 50 ms. Dropbox DeepLab
Resnet-50-16s VOC12 66.5 in prog. in prog. in prog. in prog. DeepLab
Resnet-50-8s VOC12 67.0 in prog. in prog. in prog. in prog. DeepLab
Resnet-50-8s-deep-sup VOC12 67.1 in prog. in prog. in prog. in prog. DeepLab
Resnet-101-16s VOC12 68.6 in prog. in prog. in prog. in prog. DeepLab
PSP-Resnet-18-8s VOC12 68.3 n/a n/a n/a in prog. PSPnet
PSP-Resnet-50-8s VOC12 73.6 n/a n/a n/a in prog. PSPnet

Some qualitative results:

Alt text

Endovis 2017

Implemented models were trained on Endovis 2017 segmentation dataset and the sequence number 3 was used for validation and was not included in training dataset.

The code to acquire the training and validating the model is also provided in the library.

Additional Qualitative results can be found on this youtube playlist.

Binary Segmentation

Model Test data Mean IOU Mean pix. accuracy Pixel accuracy Inference time (512x512 px. image) Model Download Link
Resnet-9-8s Seq # 3 * 96.1 in prog. in prog. 13.3 ms. Dropbox
Resnet-18-8s Seq # 3 96.0 in prog. in prog. 28 ms. Dropbox
Resnet-34-8s Seq # 3 in prog. in prog. in prog. 50 ms. in prog.

Resnet-9-8s network was tested on the 0.5 reduced resoulution (512 x 640).

Qualitative results (on validation sequence):

Alt text

Multi-class Segmentation

Model Test data Mean IOU Mean pix. accuracy Pixel accuracy Inference time (512x512 px. image) Model Download Link
Resnet-18-8s Seq # 3 81.0 in prog. in prog. 28 ms. Dropbox
Resnet-34-8s Seq # 3 in prog. in prog. in prog. 50 ms. in prog

Qualitative results (on validation sequence):

Alt text

Cityscapes

The dataset contains video sequences recorded in street scenes from 50 different cities, with high quality pixel-level annotations of 5 000 frames. The annotations contain 19 classes which represent cars, road, traffic signs and so on.

Model Test data Mean IOU Mean pix. accuracy Pixel accuracy Inference time (512x512 px. image) Model Download Link
Resnet-18-32s Validation set 61.0 in prog. in prog. in prog. in prog.
Resnet-18-8s Validation set 60.0 in prog. in prog. 28 ms. Dropbox
Resnet-34-8s Validation set 69.1 in prog. in prog. 50 ms. Dropbox
Resnet-50-16s-PSP Validation set 71.2 in prog. in prog. in prog. in prog.

Qualitative results (on validation sequence):

Whole sequence can be viewed here.

Alt text

Installation

This code requires:

  1. Pytorch.

  2. Some libraries which can be acquired by installing Anaconda package.

    Or you can install scikit-image, matplotlib, numpy using pip.

  3. Clone the library:

git clone --recursive https://github.com/warmspringwinds/pytorch-segmentation-detection

And use this code snippet before you start to use the library:

import sys
# update with your path
# All the jupyter notebooks in the repository already have this
sys.path.append("/your/path/pytorch-segmentation-detection/")
sys.path.insert(0, '/your/path/pytorch-segmentation-detection/vision/')

Here we use our pytorch/vision fork, which might be merged and futher merged in a future. We have added it as a submodule to our repository.

  1. Download segmentation or detection models that you want to use manually (links can be found below).

About

If you used the code for your research, please, cite the paper:

@article{pakhomov2017deep,
  title={Deep Residual Learning for Instrument Segmentation in Robotic Surgery},
  author={Pakhomov, Daniil and Premachandran, Vittal and Allan, Max and Azizian, Mahdi and Navab, Nassir},
  journal={arXiv preprint arXiv:1703.08580},
  year={2017}
}

During implementation, some preliminary experiments and notes were reported:

Owner
Daniil Pakhomov
Phd student at JHU. Research interests: Image Classification, Image Segmentation, Face Detection and Face Recognition mostly based on Deep Learning.
Daniil Pakhomov
Hypersim: A Photorealistic Synthetic Dataset for Holistic Indoor Scene Understanding

The Hypersim Dataset For many fundamental scene understanding tasks, it is difficult or impossible to obtain per-pixel ground truth labels from real i

Apple 1.3k Jan 04, 2023
DynaTune: Dynamic Tensor Program Optimization in Deep Neural Network Compilation

DynaTune: Dynamic Tensor Program Optimization in Deep Neural Network Compilation This repository is the implementation of DynaTune paper. This folder

4 Nov 02, 2022
Similarity-based Gray-box Adversarial Attack Against Deep Face Recognition

Similarity-based Gray-box Adversarial Attack Against Deep Face Recognition Introduction Run attack: SGADV.py Objective function: foolbox/attacks/gradi

1 Jul 18, 2022
Implementation of the Point Transformer layer, in Pytorch

Point Transformer - Pytorch Implementation of the Point Transformer self-attention layer, in Pytorch. The simple circuit above seemed to have allowed

Phil Wang 501 Jan 03, 2023
Stable Neural ODE with Lyapunov-Stable Equilibrium Points for Defending Against Adversarial Attacks

Stable Neural ODE with Lyapunov-Stable Equilibrium Points for Defending Against Adversarial Attacks Stable Neural ODE with Lyapunov-Stable Equilibrium

Kang Qiyu 8 Dec 12, 2022
The first machine learning framework that encourages learning ML concepts instead of memorizing class functions.

SeaLion is designed to teach today's aspiring ml-engineers the popular machine learning concepts of today in a way that gives both intuition and ways of application. We do this through concise algori

Anish 324 Dec 27, 2022
Official implementation of deep-multi-trajectory-based single object tracking (IEEE T-CSVT 2021).

DeepMTA_PyTorch Officical PyTorch Implementation of "Dynamic Attention-guided Multi-TrajectoryAnalysis for Single Object Tracking", Xiao Wang, Zhe Che

Xiao Wang(王逍) 7 Dec 03, 2022
Pixel-Perfect Structure-from-Motion with Featuremetric Refinement (ICCV 2021, Oral)

Pixel-Perfect Structure-from-Motion (ICCV 2021 Oral) We introduce a framework that improves the accuracy of Structure-from-Motion by refining keypoint

Computer Vision and Geometry Lab 831 Dec 29, 2022
Experimental code for paper: Generative Adversarial Networks as Variational Training of Energy Based Models

Experimental code for paper: Generative Adversarial Networks as Variational Training of Energy Based Models, under review at ICLR 2017 requirements: T

Shuangfei Zhai 18 Mar 05, 2022
A spherical CNN for weather forecasting

DeepSphere-Weather - Deep Learning on the sphere for weather/climate applications. The code in this repository provides a scalable and flexible framew

DeepSphere 47 Dec 25, 2022
AI Summer's complete catalog of articles

Learn Deep Learning with AI Summer A collection of all articles (almost 100) written for the AI Summer blog organized by topic. Deep Learning Theory M

AI Summer 95 Dec 29, 2022
This is a repository for a Semantic Segmentation inference API using the Gluoncv CV toolkit

BMW Semantic Segmentation GPU/CPU Inference API This is a repository for a Semantic Segmentation inference API using the Gluoncv CV toolkit. The train

BMW TechOffice MUNICH 56 Nov 24, 2022
Library to enable Bayesian active learning in your research or labeling work.

Bayesian Active Learning (BaaL) BaaL is an active learning library developed at ElementAI. This repository contains techniques and reusable components

ElementAI 687 Dec 25, 2022
Safe Model-Based Reinforcement Learning using Robust Control Barrier Functions

README Repository containing the code for the paper "Safe Model-Based Reinforcement Learning using Robust Control Barrier Functions". Specifically, an

Yousef Emam 13 Nov 24, 2022
Azua - build AI algorithms to aid efficient decision-making with minimum data requirements.

Project Azua 0. Overview Many modern AI algorithms are known to be data-hungry, whereas human decision-making is much more efficient. The human can re

Microsoft 197 Jan 06, 2023
Embracing Single Stride 3D Object Detector with Sparse Transformer

SST: Single-stride Sparse Transformer This is the official implementation of paper: Embracing Single Stride 3D Object Detector with Sparse Transformer

TuSimple 385 Dec 28, 2022
PyTorch META-DATASET (Few-shot classification benchmark)

PyTorch META-DATASET (Few-shot classification benchmark) This repo contains a PyTorch implementation of meta-dataset and a unified implementation of s

Malik Boudiaf 39 Oct 31, 2022
The tl;dr on a few notable transformer/language model papers + other papers (alignment, memorization, etc).

The tl;dr on a few notable transformer/language model papers + other papers (alignment, memorization, etc).

Will Thompson 166 Jan 04, 2023
THIS IS THE **OLD** PYMC PROJECT. PLEASE USE PYMC3 INSTEAD:

Introduction Version: 2.3.8 Authors: Chris Fonnesbeck Anand Patil David Huard John Salvatier Web site: https://github.com/pymc-devs/pymc Documentation

PyMC 7.2k Jan 07, 2023
TUPÃ was developed to analyze electric field properties in molecular simulations

TUPÃ: Electric field analyses for molecular simulations What is TUPÃ? TUPÃ (pronounced as tu-pan) is a python algorithm that employs MDAnalysis engine

Marcelo D. Polêto 10 Jul 17, 2022