Volumetric Correspondence Networks for Optical Flow, NeurIPS 2019.

Overview

VCN: Volumetric correspondence networks for optical flow

[project website]

Requirements

Pre-trained models

To test on any two images

Running visualize.ipynb gives you the following flow visualizations with color and vectors. Note: the sintel model "./weights/sintel-ft-trainval/finetune_67999.tar" is trained on multiple datasets and generalizes better than the KITTI model.

KITTI

This correspondens to the entry on the leaderboard (Fl-all=6.30%).

Evaluate on KITTI-15 benchmark

To run + visualize on KITTI-15 test set,

modelname=kitti-ft-trainval
i=149999
CUDA_VISIBLE_DEVICES=0 python submission.py --dataset 2015test --datapath dataset/kitti_scene/testing/   --outdir ./weights/$modelname/ --loadmodel ./weights/$modelname/finetune_$i.tar  --maxdisp 512 --fac 2
python eval_tmp.py --path ./weights/$modelname/ --vis yes --dataset 2015test
Evaluate on KITTI-val

To see the details of the train-val split, please scroll down to "note on train-val" and run dataloader/kitti15list_val.py, dataloader/kitti15list_train.py, dataloader/sitnellist_train.py, and dataloader/sintellist_val.py.

To evaluate on the 40 validation images of KITTI-15 (0,5,...195), (also assuming the data is at /ssd/kitti_scene)

modelname=kitti-ft-trainval
i=149999
CUDA_VISIBLE_DEVICES=0 python submission.py --dataset 2015 --datapath /ssd/kitti_scene/training/   --outdir ./weights/$modelname/ --loadmodel ./weights/$modelname/finetune_$i.tar  --maxdisp 512 --fac 2
python eval_tmp.py --path ./weights/$modelname/ --vis no --dataset 2015

To evaluate + visualize on KITTI-15 validation set,

python eval_tmp.py --path ./weights/$modelname/ --vis yes --dataset 2015

Evaluation error on 40 validation images : Fl-err = 3.9, EPE = 1.144

Sintel

This correspondens to the entry on the leaderboard (EPE-all-final = 4.404, EPE-all-clean = 2.808).

Evaluate on Sintel-val

To evaluate on Sintel validation set,

modelname=sintel-ft-trainval
i=67999
CUDA_VISIBLE_DEVICES=0 python submission.py --dataset sintel --datapath /ssd/rob_flow/training/   --outdir ./weights/$modelname/ --loadmodel ./weights/$modelname/finetune_$i.tar  --maxdisp 448 --fac 1.4
python eval_tmp.py --path ./weights/$modelname/ --vis no --dataset sintel

Evaluation error on sintel validation images: Fl-err = 7.9, EPE = 2.351

Train the model

We follow the same stage-wise training procedure as prior work: Chairs->Things->KITTI or Chairs->Things->Sintel, but uses much lesser iterations. If you plan to train the model and reproduce the numbers, please check out our supplementary material for the differences in hyper-parameters with FlowNet2 and PWCNet.

Pretrain on flying chairs and flying things

Make sure you have downloaded flying chairs and flying things subset, and placed them under the same folder, say /ssd/.

To first train on flying chairs for 140k iterations with a batchsize of 8, run (assuming you have two gpus)

CUDA_VISIBLE_DEVICES=0,1 python main.py --maxdisp 256 --fac 1 --database /ssd/ --logname chairs-0 --savemodel /data/ptmodel/  --epochs 1000 --stage chairs --ngpus 2

Then we want to fine-tune on flying things for 80k iterations with a batchsize of 8, resume from your pre-trained model or use our pretrained model

CUDA_VISIBLE_DEVICES=0,1 python main.py --maxdisp 256 --fac 1 --database /ssd/ --logname things-0 --savemodel /data/ptmodel/  --epochs 1000 --stage things --ngpus 2 --loadmodel ./weights/charis/finetune_141999.tar --retrain false

Note that to resume the number of iterations, put the iteration to start from in iter_counts-(your suffix).txt. In this example, I'll put 141999 in iter_counts-0.txt. Be aware that the program reads/writes to iter_counts-(suffix).txt at training time, so you may want to use different suffix when multiple training programs are running at the same time.

Finetune on KITTI / Sintel

Please first download the kitti 2012/2015 flow dataset if you want to fine-tune on kitti. Download rob_devkit if you want to fine-tune on sintel.

To fine-tune on KITTI with a batchsize of 16, run

CUDA_VISIBLE_DEVICES=0,1,2,3 python main.py --maxdisp 512 --fac 2 --database /ssd/ --logname kitti-trainval-0 --savemodel /data/ptmodel/  --epochs 1000 --stage 2015trainval --ngpus 4 --loadmodel ./weights/things/finetune_211999.tar --retrain true

To fine-tune on Sintel with a batchsize of 16, run

CUDA_VISIBLE_DEVICES=0,1,2,3 python main.py --maxdisp 448 --fac 1.4 --database /ssd/ --logname sintel-trainval-0 --savemodel /data/ptmodel/  --epochs 1000 --stage sinteltrainval --ngpus 4 --loadmodel ./weights/things/finetune_239999.tar --retrain true

Note on train-val

  • To tune hyper-parameters, we use a train-val split for kitti and sintel, which is not covered by the above procedure.
  • For kitti we use every 5th image in the training set (0,5,10,...195) for validation, and the rest for training; while for Sintel, we manually select several sequences for validation.
  • If you plan to use our split, put "--stage 2015train" or "--stage sinteltrain" for training.
  • The numbers in Tab.3 of the paper is on the whole train-val set (all the data with ground-truth).
  • You might find run.sh helpful to run evaluation on KITTI/Sintel.

Measure FLOPS

python flops.py

gives

PWCNet: flops(G)/params(M):90.8/9.37

VCN: flops(G)/params(M):96.5/6.23

Note on inference time

The current implementation runs at 180ms/pair on KITTI-sized images at inference time. A rough breakdown of running time is: feature extraction - 4.9%, feature correlation - 8.7%, separable 4D convolutions - 56%, trun. soft-argmin (soft winner-take-all) - 20% and hypotheses fusion - 9.5%. A detailed breakdown is shown below in the form "name-level percentage".

Note that separable 4D convolutions use less FLOPS than 2D convolutions (i.e., feature extraction module + hypotheses fusion module, 47.8 v.s. 53.3 Gflops) but take 4X more time (56% v.s. 14.4%). One reason might be that pytorch (also other packages) is more friendly to networks with more feature channels than those with large spatial size given the same Flops. This might be fixed at the conv kernel / hardware level.

Besides, the truncated soft-argmin is implemented with 3D max pooling, which is inefficient and takes more time than expected.

Acknowledgement

Thanks ClementPinard, Lyken17, NVlabs and many others for open-sourcing their code.

Citation

@inproceedings{yang2019vcn,
  title={Volumetric Correspondence Networks for Optical Flow},
  author={Yang, Gengshan and Ramanan, Deva},
  booktitle={NeurIPS},
  year={2019}
}
A copy of Ares that costs 30 fucking dollars.

Finalement, j'ai décidé d'abandonner cette idée, je me suis comporté comme un enfant qui été en colère. Comme m'ont dit certaines personnes j'ai des c

Bleu 24 Apr 14, 2022
A face dataset generator with out-of-focus blur detection and dynamic interval adjustment.

A face dataset generator with out-of-focus blur detection and dynamic interval adjustment.

Yutian Liu 2 Jan 29, 2022
MANO hand model porting for the GraspIt simulator

Learning Joint Reconstruction of Hands and Manipulated Objects - ManoGrasp Porting the MANO hand model to GraspIt! simulator Yana Hasson, Gül Varol, D

Lucas Wohlhart 10 Feb 08, 2022
ML-based medical imaging using Azure

Disclaimer This code is provided for research and development use only. This code is not intended for use in clinical decision-making or for any other

Microsoft Azure 68 Dec 23, 2022
Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk

Annoy Annoy (Approximate Nearest Neighbors Oh Yeah) is a C++ library with Python bindings to search for points in space that are close to a given quer

Spotify 10.6k Jan 04, 2023
An open-source Kazakh named entity recognition dataset (KazNERD), annotation guidelines, and baseline NER models.

Kazakh Named Entity Recognition This repository contains an open-source Kazakh named entity recognition dataset (KazNERD), named entity annotation gui

ISSAI 9 Dec 23, 2022
Atomistic Line Graph Neural Network

Table of Contents Introduction Installation Examples Pre-trained models Quick start using colab JARVIS-ALIGNN webapp Peformances on a few datasets Use

National Institute of Standards and Technology 91 Dec 30, 2022
An Easy-to-use, Modular and Prolongable package of deep-learning based Named Entity Recognition Models.

DeepNER An Easy-to-use, Modular and Prolongable package of deep-learning based Named Entity Recognition Models. This repository contains complex Deep

Derrick 9 May 30, 2022
prior-based-losses-for-medical-image-segmentation

Repository for papers: Benchmark: Effect of Prior-based Losses on Segmentation Performance: A Benchmark Midl: A Surprisingly Effective Perimeter-based

Rosana EL JURDI 9 Sep 07, 2022
tensorflow code for inverse face rendering

InverseFaceRender This is tensorflow code for our project: Learning Inverse Rendering of Faces from Real-world Videos. (https://arxiv.org/abs/2003.120

Yuda Qiu 18 Nov 16, 2022
Diverse Branch Block: Building a Convolution as an Inception-like Unit

Diverse Branch Block: Building a Convolution as an Inception-like Unit (PyTorch) (CVPR-2021) DBB is a powerful ConvNet building block to replace regul

253 Dec 24, 2022
Neural network pruning for finding a sparse computational model for controlling a biological motor task.

MothPruning Scientific Overview Originally inspired by biological nervous systems, deep neural networks (DNNs) are powerful computational tools for mo

Olivia Thomas 0 Dec 14, 2022
Final term project for Bayesian Machine Learning Lecture (XAI-623)

Mixquality_AL Final Term Project For Bayesian Machine Learning Lecture (XAI-623) Youtube Link The presentation is given in YoutubeLink Problem Formula

JeongEun Park 3 Jan 18, 2022
A Java implementation of the experiments for the paper "k-Center Clustering with Outliers in Sliding Windows"

OutliersSlidingWindows A Java implementation of the experiments for the paper "k-Center Clustering with Outliers in Sliding Windows" Dataset generatio

PaoloPellizzoni 0 Jan 05, 2022
NAS-HPO-Bench-II is the first benchmark dataset for joint optimization of CNN and training HPs.

NAS-HPO-Bench-II API Overview NAS-HPO-Bench-II is the first benchmark dataset for joint optimization of CNN and training HPs. It helps a fair and low-

yoichi hirose 8 Nov 21, 2022
PyTorch EO aims to make Deep Learning for Earth Observation data easy and accessible to real-world cases and research alike.

Pytorch EO Deep Learning for Earth Observation applications and research. 🚧 This project is in early development, so bugs and breaking changes are ex

earthpulse 28 Aug 25, 2022
Speech-Emotion-Analyzer - The neural network model is capable of detecting five different male/female emotions from audio speeches. (Deep Learning, NLP, Python)

Speech Emotion Analyzer The idea behind creating this project was to build a machine learning model that could detect emotions from the speech we have

Mitesh Puthran 965 Dec 24, 2022
Official Pytorch implementation of the paper "Action-Conditioned 3D Human Motion Synthesis with Transformer VAE", ICCV 2021

ACTOR Official Pytorch implementation of the paper "Action-Conditioned 3D Human Motion Synthesis with Transformer VAE", ICCV 2021. Please visit our we

Mathis Petrovich 248 Dec 23, 2022
🔅 Shapash makes Machine Learning models transparent and understandable by everyone

🎉 What's new ? Version New Feature Description Tutorial 1.6.x Explainability Quality Metrics To help increase confidence in explainability methods, y

MAIF 2.1k Dec 27, 2022