A fast model to compute optical flow between two input images.

Related tags

Deep LearningDCVNet
Overview

DCVNet: Dilated Cost Volumes for Fast Optical Flow

This repository contains our implementation of the paper:

@InProceedings{jiang2021dcvnet,
  title={DCVNet: Dilated Cost Volumes for Fast Optical Flow},
  author={Jiang, Huaizu and Learned-Miller, Erik},
  booktitle={arXiv},
  year={2021}
}

Need a fast optical flow model? Try DCVNet

  • Fast. On a mid-end GTX 1080ti GPU, DCVNet runs in real time at 71 fps (frames-per-second) to process images with sizes of 1024 × 436.
  • Compact and accurate. DCVNet has 4.94M parameters and consumes 1.68GB GPU memory during inference. It achieves comparable accuracy to state-of-the-art approaches on the MPI Sintel benchmark.

In the figure above, for each model, the circle radius indicates the number of parameters (larger radius means more parameters). The center of a circle corresponds to a model’s EPE (end-point-error).

Requirements

This code has been tested with Python 3.7, PyTorch 1.6.0, and CUDA 9.2. We suggest to use a conda environment.

conda create -n dcvnet
conda activate dcvnet
conda install pytorch=1.6.0 torchvision=0.7.0 cudatoolkit=10.1 matplotlib tensorboardX scipy opencv -c pytorch
pip install yacs

We use an open-source implementation https://github.com/ClementPinard/Pytorch-Correlation-extension to compute dilated cost volumes. Follow the instructions there to install this module.

Demos

Pretrained models can be downloaded by running

./scripts/download_models.sh

or downloaded from Google drive.

You can demo a pre-trained model on a sequence of frames

python demo.py --weights-path pretrained_models/sceneflow_dcvnet.pth --path demo-frames

Required data

The following datasets are required to train and evaluate DCVNet.

We borrow the data loaders used in RAFT. By default, dcvnet/data/raft/datasets.py will search for the datasets in these locations. You can create symbolic links to wherever the datasets were downloaded in the datasets folder

|-- datasets
    |-- Driving
        |-- frames_cleanpass
        |-- optical_flow
    |-- FlyingThings3D_subset
        |-- train
            |-- flow
            |-- image_clean
        |-- val
            |-- flow
            |-- image_clean
    |-- Monkaa
        |-- frames_cleanpass
        |-- optical_flow
    |-- MPI_Sintel
        |-- test
        |-- training
    |-- KITTI2012
        |-- testing
        |-- training
    |-- KITTI2015
        |-- testing
        |-- training
    |-- HD1K
        |-- hd1k_flow_gt
        |-- hd1k_input

Evaluation

You can evaluate a pre-trained model using tools/evaluate_optical_flow.py

python evaluate_optical_flow.py --weights_path models/dcvnet-sceneflow.pth --dataset sintel

You can optionally add the --amp switch to do inference in mixed precision to reduce GPU memory usage.

Training

We used 8 GTX 1080ti GPUs for training. Training logs will be written to the output folder, which can be visualized using tensorboard.

# train on the synthetic scene flow dataset
python tools/train_optical_flow.py --config-file configs/sceneflow_dcvnet.yaml 

# fine-tune it on the MPI-Sintel dataset
# 4 GPUs are sufficient, but here we use 8 GPUs for fast training
python tools/train_optical_flow.py --config-file configs/sintel_dcvnet.yaml --pretrain-weights output/SceneFlow/sceneflow_dcvnet/default/train_epoch_50.pth

# fine-tune it on the KITTI 2012 and 2015 dataset
# we only use 6 GPUs (3 GPUs are sufficient) since the batch size is 6
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5 python tools/train_optical_flow.py --config-file configs/kitti12+15_dcvnet.yaml --pretrain-weights output/Sintel+SceneFlow/sintel_dcvnet/default/train_epoch_5.pth

Note on the inference speed

In the main branch, the computation of the dilated cost volumes can be further optimized without using the for loop. Checkout the efficient branch for details. If you are interested in testing the inference speed, we suggest to switch to the efficient branch.

git checkout efficient
CUDA_VISIBLE_DEVICES=0 python tools/evaluate_optical_flow.py --dry-run

We haven't fixed this problem because our pre-trained models are based on the implementation in the main branch, which are not compatible with the resizing in the efficient branch. We need to re-train all our models. It will be fixed soon.

To-do

  • Fix the problem of efficient cost volume computation.
  • Train the model on the AutoFlow dataset.

Acknowledgment

Our implementation is built on top of RAFT, Pytorch-Correlation-extension, yacs, Detectron2, and semseg. We thank the authors for releasing and maintaining the code.

Owner
Huaizu Jiang
Assistant Professor at Northeastern University.
Huaizu Jiang
Designing a Minimal Retrieve-and-Read System for Open-Domain Question Answering (NAACL 2021)

Designing a Minimal Retrieve-and-Read System for Open-Domain Question Answering Abstract In open-domain question answering (QA), retrieve-and-read mec

Clova AI Research 34 Apr 13, 2022
A library for building and serving multi-node distributed faiss indices.

About Distributed faiss index service. A lightweight library that lets you work with FAISS indexes which don't fit into a single server memory. It fol

Meta Research 170 Dec 30, 2022
Optimize Trading Strategies Using Freqtrade

Optimize trading strategy using Freqtrade Short demo on building, testing and optimizing a trading strategy using Freqtrade. The DevBootstrap YouTube

DevBootstrap 139 Jan 01, 2023
A Closer Look at Structured Pruning for Neural Network Compression

A Closer Look at Structured Pruning for Neural Network Compression Code used to reproduce experiments in https://arxiv.org/abs/1810.04622. To prune, w

Bayesian and Neural Systems Group 140 Dec 05, 2022
Computer Vision application in the web

Computer Vision application in the web Preview Usage Clone this repo git clone https://github.com/amineHY/WebApp-Computer-Vision-streamlit.git cd Web

Amine Hadj-Youcef. PhD 35 Dec 06, 2022
In this work, we will implement some basic but important algorithm of machine learning step by step.

WoRkS continued English 中文 Français Probability Density Estimation-Non-Parametric Methods(概率密度估计-非参数方法) 1. Kernel / k-Nearest Neighborhood Density Est

liziyu0104 1 Dec 30, 2021
Ensemble Learning Priors Driven Deep Unfolding for Scalable Snapshot Compressive Imaging [PyTorch]

Ensemble Learning Priors Driven Deep Unfolding for Scalable Snapshot Compressive Imaging [PyTorch] Abstract Snapshot compressive imaging (SCI) can rec

integirty 6 Nov 01, 2022
【Arxiv】Exploring Separable Attention for Multi-Contrast MR Image Super-Resolution

SANet Exploring Separable Attention for Multi-Contrast MR Image Super-Resolution Dependencies numpy==1.18.5 scikit_image==0.16.2 torchvision==0.8.1 to

36 Jan 05, 2023
ICCV2021, Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet

Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet, ICCV 2021 Update: 2021/03/11: update our new results. Now our T2T-ViT-14 w

YITUTech 1k Dec 31, 2022
A Flow-based Generative Network for Speech Synthesis

WaveGlow: a Flow-based Generative Network for Speech Synthesis Ryan Prenger, Rafael Valle, and Bryan Catanzaro In our recent paper, we propose WaveGlo

NVIDIA Corporation 2k Dec 26, 2022
Emblaze - Interactive Embedding Comparison

Emblaze - Interactive Embedding Comparison Emblaze is a Jupyter notebook widget for visually comparing embeddings using animated scatter plots. It bun

CMU Data Interaction Group 77 Nov 24, 2022
Official Pytorch implementation of "CLIPstyler:Image Style Transfer with a Single Text Condition"

CLIPstyler Official Pytorch implementation of "CLIPstyler:Image Style Transfer with a Single Text Condition" Environment Pytorch 1.7.1, Python 3.6 $ c

203 Dec 30, 2022
A python script to lookup Passport Index Dataset

visa-cli A python script to lookup Passport Index Dataset Installation pip install visa-cli Usage usage: visa-cli [-h] [-d DESTINATION_COUNTRY] [-f]

rand-net 16 Oct 18, 2022
A Temporal Extension Library for PyTorch Geometric

Documentation | External Resources | Datasets PyTorch Geometric Temporal is a temporal (dynamic) extension library for PyTorch Geometric. The library

Benedek Rozemberczki 1.9k Jan 07, 2023
⚡️Optimizing einsum functions in NumPy, Tensorflow, Dask, and more with contraction order optimization.

Optimized Einsum Optimized Einsum: A tensor contraction order optimizer Optimized einsum can significantly reduce the overall execution time of einsum

Daniel Smith 653 Dec 30, 2022
This implements the learning and inference/proposal algorithm described in "Learning to Propose Objects, Krähenbühl and Koltun"

Learning to propose objects This implements the learning and inference/proposal algorithm described in "Learning to Propose Objects, Krähenbühl and Ko

Philipp Krähenbühl 90 Sep 10, 2021
PyTorch implementation of the REMIND method from our ECCV-2020 paper "REMIND Your Neural Network to Prevent Catastrophic Forgetting"

REMIND Your Neural Network to Prevent Catastrophic Forgetting This is a PyTorch implementation of the REMIND algorithm from our ECCV-2020 paper. An ar

Tyler Hayes 72 Nov 27, 2022
Message Passing on Cell Complexes

CW Networks This repository contains the code used for the papers Weisfeiler and Lehman Go Cellular: CW Networks (Under review) and Weisfeiler and Leh

Twitter Research 108 Jan 05, 2023
An open source AutoML toolkit for automate machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.

NNI Doc | 简体中文 NNI (Neural Network Intelligence) is a lightweight but powerful toolkit to help users automate Feature Engineering, Neural Architecture

Microsoft 12.4k Dec 31, 2022
Using fully convolutional networks for semantic segmentation with caffe for the cityscapes dataset

Using fully convolutional networks for semantic segmentation (Shelhamer et al.) with caffe for the cityscapes dataset How to get started Download the

Simon Guist 27 Jun 06, 2022