Clockwork Convnets for Video Semantic Segmentation

This is the reference implementation of arXiv:1608.03609:

Clockwork Convnets for Video Semantic Segmentation
Evan Shelhamer*, Kate Rakelly*, Judy Hoffman*, Trevor Darrell
arXiv:1608.03609

This project reproduces results from the paper and demonstrates how to execute staged fully convolutional networks (FCNs) on video in Caffe by controlling the net through the Python interface. These experiments are therefore a proof-of-concept implementation of clockwork, and further development is needed to achieve peak efficiency (such as pre-fetching video data layers, threshold GPU layers, and a native Caffe library edition of the staged forward pass for pipelining).
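As a rough illustration of what a staged forward pass means, here is a minimal sketch in plain NumPy rather than Caffe. It assumes the net is split into stages that each emit features for the next stage plus a score map, that the score maps are fused by summation, and that each stage is recomputed only on its own fixed clock while its cached score persists otherwise. The names `ClockworkPipeline` and `toy_stage` are hypothetical and do not come from this repository; the actual experiments drive real Caffe nets from the notebooks.

```python
import numpy as np


class ClockworkPipeline:
    """Toy staged pipeline: stage i is recomputed only when t % rates[i] == 0.

    Each stage is a callable mapping its input to (features, score).
    The output for a frame is the sum of the latest score of every stage.
    """

    def __init__(self, stages, rates):
        assert len(stages) == len(rates)
        self.stages = stages
        self.rates = rates
        self.scores = [None] * len(stages)  # cached score per stage
        self.updates = [0] * len(stages)    # how often each stage actually ran

    def forward(self, frame, t):
        x = frame
        for i, (stage, rate) in enumerate(zip(self.stages, self.rates)):
            if self.scores[i] is not None and t % rate != 0:
                # This stage is off its clock: it and all deeper stages
                # keep their cached scores for this frame.
                break
            x, self.scores[i] = stage(x)
            self.updates[i] += 1
        return sum(self.scores)


def toy_stage(x):
    """Stand-in for a block of layers; a real stage would be part of an FCN."""
    features = x + 1.0
    return features, features


if __name__ == "__main__":
    # Shallow stage every frame, middle stage every 2nd, deep stage every 4th.
    pipe = ClockworkPipeline([toy_stage] * 3, rates=[1, 2, 4])
    for t in range(8):
        pipe.forward(np.full((2, 2), float(t)), t)
    print("stage updates over 8 frames:", pipe.updates)
```

The update counters show where the saving comes from: the deeper, more expensive stages run on only a fraction of the frames, while the shallow stage keeps the output responsive to every new frame.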

For quick reference, see these display-only versions of the experiments:

Cityscapes Clockwork
YouTube Frame Differencing
YouTube Clockwork
YouTube Pipelining
Synthetic PASCAL VOC Video
Dataset Walkthroughs for YouTube, NYUDv2, and Cityscapes

Contents

notebooks: interactive code and documentation that carry out the experiments (in jupyter/ipython format).
nets: the net specifications of the various FCNs in this work, and the pre-trained weights (see installation instructions).
caffe: the Caffe framework, included as a git submodule pointing to a compatible version.
datasets: input-output for PASCAL VOC, NYUDv2, YouTube-Objects, and Cityscapes
lib: helpers for executing networks, scoring metrics, and plotting

License

This project is licensed for open non-commercial distribution under the UC Regents license; see LICENSE. Its dependencies, such as Caffe, are subject to their own respective licenses.

Requirements & Installation

Caffe, Python, and Jupyter are necessary for all of the experiments. Any installation or general Caffe inquiries should be directed to the caffe-users mailing list.

Install Caffe. See the installation guide and try Caffe through Docker (recommended). Make sure to configure pycaffe, the Caffe Python interface, too.
Install Python, and then install our required packages listed in requirements.txt. For instance, for x in $(cat requirements.txt); do pip install $x; done should do.
Install Jupyter, the interface for viewing, executing, and altering the notebooks.
Configure your PYTHONPATH as indicated by the included .envrc so that this project dir and pycaffe are included.
Download the model weights for this project and place them in nets.
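Once the steps above are done, a quick check that the Python path is configured can save some confusion later. The sketch below is only a suggestion and assumes Python 3.4 or newer. The helper `missing_modules` is hypothetical, and treating `lib` and `datasets` as importable names is an assumption based on the directory listing in Contents; only `caffe` (pycaffe) is named explicitly by the steps above.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found on the current path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # "caffe" is pycaffe; "lib" and "datasets" are assumed to be importable
    # once the project dir is on PYTHONPATH.
    missing = missing_modules(["caffe", "lib", "datasets"])
    if missing:
        print("Not importable, check PYTHONPATH:", ", ".join(missing))
    else:
        print("pycaffe and the project dir are importable.")
```

The check only looks modules up without importing them, so it will not catch a pycaffe build that is present but broken.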

Now you can explore the notebooks by firing up Jupyter.
