Keras/TensorFlow implementation of Fully Convolutional Networks for Semantic Segmentation (unfinished)

Overview

Keras-FCN

Fully convolutional networks and semantic segmentation with Keras.

Example figure: a biker image, its ground truth annotation, and the segmentation produced by AtrousFCN_Resnet50_16s.

Models

Models are defined in models.py and include ResNet- and DenseNet-based architectures. AtrousFCN_Resnet50_16s is the current best performer, with a mean Intersection over Union (mIoU) of 0.661076 and pixel accuracy around 0.9 on the augmented Pascal VOC2012 dataset detailed below.
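A quick way to see which architectures models.py exposes is to inspect the module directly. This is a minimal sketch and assumes it is run from the repository root so that models.py is importable:

# List the functions exposed by models.py (model builders such as
# AtrousFCN_Resnet50_16s should appear among them).
import inspect
import models

model_fns = [name for name, obj in inspect.getmembers(models, inspect.isfunction)]
print(model_fns)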

Install

Useful setup scripts for Ubuntu 14.04 and 16.04 can be found in the robotics_setup repository. First use that to install CUDA and TensorFlow:

mkdir -p ~/src

cd ~/src
# install dependencies
pip install pillow keras sacred

# fork of keras-contrib necessary for DenseNet based models
git clone git@github.com:ahundt/keras-contrib.git -b densenet-atrous
cd keras-contrib
sudo python setup.py install


# Install python coco tools
cd ~/src
git clone https://github.com/pdollar/coco.git
cd coco
sudo python setup.py install

cd ~/src
git clone https://github.com/aurora95/Keras-FCN.git
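After the install steps, a short import check confirms the main dependencies are visible to Python. This is a minimal sketch that only touches the packages installed above:

# Sanity check: import the core dependencies installed above.
import PIL
import keras
import tensorflow as tf
from pycocotools.coco import COCO

print('TensorFlow', tf.__version__, '/ Keras', keras.__version__)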

Datasets

Datasets can be downloaded and configured in an automated fashion via the ahundt-keras branch on a fork of the tf_image_segmentation repository.

For simplicity, the instructions below assume all repositories are in ~/src/, and datasets are downloaded to ~/.keras/ by default.

cd ~/src
git clone git@github.com:ahundt/tf-image-segmentation.git -b Keras-FCN

Pascal VOC + Berkeley Data Augmentation

Pascal VOC 2012 augmented with Berkeley Semantic Contours is the primary dataset used for training Keras-FCN. Note that the default configuration maximizes the size of the dataset and will not produce results in a form that can be submitted to the Pascal VOC2012 segmentation leaderboard; details are below.

# Automated Pascal VOC Setup (recommended)
export PYTHONPATH=$PYTHONPATH:~/src/tf-image-segmentation
cd path/to/tf-image-segmentation/tf_image_segmentation/recipes/pascal_voc/
python data_pascal_voc.py pascal_voc_setup

This downloads the data and generates train/val split files for the combined dataset; each split file lists image filename / annotation filename pairs derived from the original PASCAL VOC and the PASCAL Berkeley Augmented dataset.
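The generated split files can be inspected directly. This is a minimal sketch, assuming each line holds an image filename and its annotation filename; the exact path and format come from the setup script, so adjust if yours differ:

# Inspect one of the generated train/val split files.
split_file = 'path/to/train.txt'  # hypothetical location produced by pascal_voc_setup

with open(split_file) as f:
    pairs = [line.split() for line in f if line.strip()]

print(len(pairs), 'image/annotation pairs')
print(pairs[0] if pairs else 'empty split file')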

The datasets can be downloaded manually as follows:

# Manual Pascal VOC Download (not required)

    # original PASCAL VOC 2012
    wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar # 2 GB
    # berkeley augmented Pascal VOC
    wget http://www.eecs.berkeley.edu/Research/Projects/CS/vision/grouping/semantic_contours/benchmark.tgz # 1.3 GB

The setup utility has three types of train/val splits (credit: matconvnet-fcn):

Let BT, BV, PT, PV, and PX be the Berkeley training and validation
sets and the PASCAL segmentation challenge training, validation, and
test sets. Let T, V, and X be the final training, validation, and test
sets.
Mode 1::
      V = PV (same validation set as PASCAL)
Mode 2:: (default)
      V = PV \ BT (PASCAL val set that is not a Berkeley training
      image)
Mode 3::
      V = PV \ (BV + BT)
In all cases:
      S = PT + PV + BT + BV
      X = PX  (the test set is unchanged)
      T = (S \ V) \ X (the rest is training material)
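The same split rules can be written out with Python set operations, which may make the modes easier to follow. This is illustrative only; the setup script already applies them for you:

# Illustrative set arithmetic for the three split modes above.
# BT, BV, PT, PV, PX would be sets of image identifiers.
BT, BV, PT, PV, PX = set(), set(), set(), set(), set()  # placeholders

S = PT | PV | BT | BV       # all segmentation images
X = PX                      # test set is unchanged

V_mode1 = PV                # Mode 1: same validation set as PASCAL
V_mode2 = PV - BT           # Mode 2 (default): drop Berkeley training images
V_mode3 = PV - (BV | BT)    # Mode 3

V = V_mode2
T = (S - V) - X             # the rest is training material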

MS COCO

MS COCO support is very experimental, contributions would be highly appreciated.

Note that any pixel can have multiple classes; for example, a pixel that is part of a cup sitting on a table will be classified as both cup and table, and sometimes the z-ordering in the dataset is wrong. This means saving the classes as a single label image will result in very poor performance.
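The overlap can be checked with the standard pycocotools API. This is a minimal sketch; the annotation file path is an assumption, so point it at your extracted COCO annotations:

# Count pixels assigned to more than one class in a single COCO image.
import numpy as np
from pycocotools.coco import COCO

coco = COCO('annotations/instances_train2014.json')  # path is an assumption

img_id = coco.getImgIds()[0]
anns = coco.loadAnns(coco.getAnnIds(imgIds=img_id))

masks = [coco.annToMask(a) for a in anns]
if masks:
    overlap = np.sum(masks, axis=0) > 1
    print('pixels with multiple classes:', int(overlap.sum()))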

export PYTHONPATH=$PYTHONPATH:~/src/tf-image-segmentation
cd ~/src/tf-image-segmentation/tf_image_segmentation/recipes/mscoco

# Initial download is 13 GB
# Extracted 91 class segmentation encoding
# npy matrix files may require up to 1TB

python data_coco.py coco_setup
python data_coco.py coco_to_pascal_voc_imageset_txt
python data_coco.py coco_image_segmentation_stats

# Train on coco
cd ~/src/Keras-FCN
python train_coco.py

Training and testing

The default configuration trains and evaluates AtrousFCN_Resnet50_16s on Pascal VOC 2012 with Berkeley data augmentation.

cd ~/src/Keras-FCN
cd utils

# Generate pretrained weights
python transfer_FCN.py

cd ~/src/Keras-FCN

# Run training
python train.py

# Evaluate the performance of the network
python evaluate.py

Model weights will be in ~/src/Keras-FCN/Models, along with saved image segmentation results from the validation dataset.
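A saved checkpoint can be reloaded for custom inference with the standard Keras API. This is a minimal sketch in which the constructor arguments and the checkpoint filename are assumptions; check train.py and the Models directory for the actual values:

# Reload trained weights into a freshly built model (names are assumptions).
import os
from models import AtrousFCN_Resnet50_16s

model = AtrousFCN_Resnet50_16s(classes=21)  # argument list may differ; see models.py
weights_path = os.path.expanduser('~/src/Keras-FCN/Models/AtrousFCN_Resnet50_16s/checkpoint_weights.hdf5')
model.load_weights(weights_path, by_name=True)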

Key files

  • models.py
    • Contains model definitions; you can use the existing models or define your own.
  • train.py
    • The training script. Most parameters are set in the main function, and the data augmentation parameters are set where SegDataGenerator is initialized; change them according to your needs.
  • inference.py
    • Used for inferring segmentation results. It can be run directly and is also called by evaluate.py.
  • evaluate.py
    • Used for evaluating performance. It saves all segmentation results as images and calculates IoU. Outputs are not perfectly formatted, so you may need to look into the code to see their meaning.

Most parameters of train.py, inference.py, and evaluate.py are set in the main function.
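For example, data augmentation is usually adjusted by editing the keyword arguments passed to SegDataGenerator inside train.py. The sketch below is written in the style of Keras' ImageDataGenerator, and the import path and parameter names are assumptions to be checked against the actual constructor:

# Illustrative only: verify these names against SegDataGenerator itself.
from utils.SegDataGenerator import SegDataGenerator

train_datagen = SegDataGenerator(
    zoom_range=[0.5, 2.0],      # random scale jitter
    rotation_range=0.,          # no rotation
    horizontal_flip=True,       # random left-right flips
)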
