Keras-tensorflow implementation of Fully Convolutional Networks for Semantic Segmentation(Unfinished)

Overview

Keras-FCN

Fully convolutional networks and semantic segmentation with Keras.

Biker Image

Biker Ground Truth

Biker as classified by AtrousFCN_Resnet50_16s

Models

Models are found in models.py, and include ResNet and DenseNet based models. AtrousFCN_Resnet50_16s is the current best performer, with pixel mean Intersection over Union mIoU 0.661076, and pixel accuracy around 0.9 on the augmented Pascal VOC2012 dataset detailed below.

Install

Useful setup scripts for Ubuntu 14.04 and 16.04 can be found in the robotics_setup repository. First use that to install CUDA, TensorFlow,

mkdir -p ~/src

cd ~/src
# install dependencies
pip install pillow keras sacred

# fork of keras-contrib necessary for DenseNet based models
git clone [email protected]:ahundt/keras-contrib.git -b densenet-atrous
cd keras-contrib
sudo python setup.py install


# Install python coco tools
cd ~/src
git clone https://github.com/pdollar/coco.git
cd coco
sudo python setup.py install

cd ~/src
git clone https://github.com/aurora95/Keras-FCN.git

Datasets

Datasets can be downloaded and configured in an automated fashion via the ahundt-keras branch on a fork of the tf_image_segmentation repository.

For simplicity, the instructions below assume all repositories are in ~/src/, and datasets are downloaded to ~/.keras/ by default.

cd ~/src
git clone [email protected]:ahundt/tf-image-segmentation.git -b Keras-FCN

Pascal VOC + Berkeley Data Augmentation

Pascal VOC 2012 augmented with Berkeley Semantic Contours is the primary dataset used for training Keras-FCN. Note that the default configuration maximizes the size of the dataset, and will not in a form that can be submitted to the pascal VOC2012 segmentation results leader board, details are below.

# Automated Pascal VOC Setup (recommended)
export PYTHONPATH=$PYTHONPATH:~/src/tf-image-segmentation
cd path/to/tf-image-segmentation/tf_image_segmentation/recipes/pascal_voc/
python data_pascal_voc.py pascal_voc_setup

This downloads and configures image/annotation filenames pairs train/val splits from combined Pascal VOC with train and validation split respectively that has image full filename/ annotation full filename pairs in each of the that were derived from PASCAL and PASCAL Berkeley Augmented dataset.

The datasets can be downloaded manually as follows:

# Manual Pascal VOC Download (not required)

    # original PASCAL VOC 2012
    wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar # 2 GB
    # berkeley augmented Pascal VOC
    wget http://www.eecs.berkeley.edu/Research/Projects/CS/vision/grouping/semantic_contours/benchmark.tgz # 1.3 GB

The setup utility has three type of train/val splits(credit matconvnet-fcn):

Let BT, BV, PT, PV, and PX be the Berkeley training and validation
sets and PASCAL segmentation challenge training, validation, and
test sets. Let T, V, X the final trainig, validation, and test
sets.
Mode 1::
      V = PV (same validation set as PASCAL)
Mode 2:: (default))
      V = PV \ BT (PASCAL val set that is not a Berkeley training
      image)
Mode 3::
      V = PV \ (BV + BT)
In all cases:
      S = PT + PV + BT + BV
      X = PX  (the test set is uncahgend)
      T = (S \ V) \ X (the rest is training material)

MS COCO

MS COCO support is very experimental, contributions would be highly appreciated.

Note that there any pixel can have multiple classes, for example a pixel which is point on a cup on a table will be classified as both cup and table, but sometimes the z-ordering is wrong in the dataset. This means saving the classes as an image will result in very poor performance.

export PYTHONPATH=$PYTHONPATH:~/src/tf-image-segmentation
cd ~/src/tf-image-segmentation/tf_image_segmentation/recipes/mscoco

# Initial download is 13 GB
# Extracted 91 class segmentation encoding
# npy matrix files may require up to 1TB

python data_coco.py coco_setup
python data_coco.py coco_to_pascal_voc_imageset_txt
python data_coco.py coco_image_segmentation_stats

# Train on coco
cd ~/src/Keras-FCN
python train_coco.py

Training and testing

The default configuration trains and evaluates AtrousFCN_Resnet50_16s on pascal voc 2012 with berkeley data augmentation.

cd ~/src/Keras-FCN
cd utils

# Generate pretrained weights
python transfer_FCN.py

cd ~/src/Keras-FCN

# Run training
python train.py

# Evaluate the performance of the network
python evaluate.py

Model weights will be in ~/src/Keras-FCN/Models, along with saved image segmentation results from the validation dataset.

Key files

  • model.py
    • contains model definitions, you can use existing models or you can define your own one.
  • train.py
    • The training script. Most parameters are set in the main function, and data augmentation parameters are where SegDataGenerator is initialized, you may change them according to your needs.
  • inference.py
    • Used for infering segmentation results. It can be directly run and it's also called in evaluate.py
  • evaluate.py
    • Used for evaluating perforance. It will save all segmentation results as images and calculate IOU. Outputs are not perfectly formatted so you may need to look into the code to see the meaning.

Most parameters of train.py, inference.py, and evaluate.py are set in the main function.

Owner
Computer Vision/Quant Investment
🎁 3,000,000+ Unsplash images made available for research and machine learning

The Unsplash Dataset The Unsplash Dataset is made up of over 250,000+ contributing global photographers and data sourced from hundreds of millions of

Unsplash 2k Jan 03, 2023
Genshin-assets - 👧 Public documentation & static assets for Genshin Impact data.

genshin-assets This repo provides easy access to the Genshin Impact assets, primarily for use on static sites. Sources Genshin Optimizer - An Artifact

Zerite Development 5 Nov 22, 2022
[NeurIPS'21 Spotlight] PyTorch code for our paper "Aligned Structured Sparsity Learning for Efficient Image Super-Resolution"

ASSL This repository is for a new network pruning method (Aligned Structured Sparsity Learning, ASSL) for efficient single image super-resolution (SR)

Huan Wang 47 Nov 28, 2022
一个目标检测的通用框架(不需要cuda编译),支持Yolo全系列(v2~v5)、EfficientDet、RetinaNet、Cascade-RCNN等SOTA网络。

一个目标检测的通用框架(不需要cuda编译),支持Yolo全系列(v2~v5)、EfficientDet、RetinaNet、Cascade-RCNN等SOTA网络。

Haoyu Xu 203 Jan 03, 2023
Text-to-Music Retrieval using Pre-defined/Data-driven Emotion Embeddings

Text2Music Emotion Embedding Text-to-Music Retrieval using Pre-defined/Data-driven Emotion Embeddings Reference Emotion Embedding Spaces for Matching

Minz Won 50 Dec 05, 2022
📖 Deep Attentional Guided Image Filtering

📖 Deep Attentional Guided Image Filtering [Paper] Zhiwei Zhong, Xianming Liu, Junjun Jiang, Debin Zhao ,Xiangyang Ji Harbin Institute of Technology,

9 Dec 23, 2022
My coursework for Machine Learning (2021 Spring) at National Taiwan University (NTU)

Machine Learning 2021 Machine Learning (NTU EE 5184, Spring 2021) Instructor: Hung-yi Lee Course Website : (https://speech.ee.ntu.edu.tw/~hylee/ml/202

100 Dec 26, 2022
WarpRNNT loss ported in Numba CPU/CUDA for Pytorch

RNNT loss in Pytorch - Numba JIT compiled (warprnnt_numba) Warp RNN Transducer Loss for ASR in Pytorch, ported from HawkAaron/warp-transducer and a re

Somshubra Majumdar 15 Oct 22, 2022
A spherical CNN for weather forecasting

DeepSphere-Weather - Deep Learning on the sphere for weather/climate applications. The code in this repository provides a scalable and flexible framew

DeepSphere 47 Dec 25, 2022
Code release for NeX: Real-time View Synthesis with Neural Basis Expansion

NeX: Real-time View Synthesis with Neural Basis Expansion Project Page | Video | Paper | COLAB | Shiny Dataset We present NeX, a new approach to novel

536 Dec 20, 2022
Code for the paper "Graph Attention Tracking". (CVPR2021)

SiamGAT 1. Environment setup This code has been tested on Ubuntu 16.04, Python 3.5, Pytorch 1.2.0, CUDA 9.0. Please install related libraries before r

122 Dec 24, 2022
[arXiv] What-If Motion Prediction for Autonomous Driving ❓🚗💨

WIMP - What If Motion Predictor Reference PyTorch Implementation for What If Motion Prediction [PDF] [Dynamic Visualizations] Setup Requirements The W

William Qi 96 Dec 29, 2022
Official Repo for ICCV2021 Paper: Learning to Regress Bodies from Images using Differentiable Semantic Rendering

[ICCV2021] Learning to Regress Bodies from Images using Differentiable Semantic Rendering Getting Started DSR has been implemented and tested on Ubunt

Sai Kumar Dwivedi 83 Nov 27, 2022
A multi-mode modulator for multi-domain few-shot classification (ICCV)

A multi-mode modulator for multi-domain few-shot classification (ICCV)

Yanbin Liu 8 Apr 28, 2022
Fully convolutional deep neural network to remove transparent overlays from images

Fully convolutional deep neural network to remove transparent overlays from images

Marc Belmont 1.1k Jan 06, 2023
This is the implementation of the paper "Self-supervised Outdoor Scene Relighting"

Self-supervised Outdoor Scene Relighting This is the implementation of the paper "Self-supervised Outdoor Scene Relighting". The model is implemented

Ye Yu 24 Dec 17, 2022
Implementing yolov4 target detection and tracking based on nao robot

Implementing yolov4 target detection and tracking based on nao robot

6 Apr 19, 2022
EgoNN: Egocentric Neural Network for Point Cloud Based 6DoF Relocalization at the City Scale

EgonNN: Egocentric Neural Network for Point Cloud Based 6DoF Relocalization at the City Scale Paper: EgoNN: Egocentric Neural Network for Point Cloud

19 Sep 20, 2022
A Sign Language detection project using Mediapipe landmark detection and Tensorflow LSTM's

sign-language-detection A Sign Language detection project using Mediapipe landmark detection and Tensorflow LSTM. The project is built for a vocabular

Hashim 4 Feb 06, 2022
This repository is an implementation of our NeurIPS 2021 paper (Stylized Dialogue Generation with Multi-Pass Dual Learning) in PyTorch.

MPDL---TODO This repository is an implementation of our NeurIPS 2021 paper (Stylized Dialogue Generation with Multi-Pass Dual Learning) in PyTorch. Ci

CodebaseLi 3 Nov 27, 2022