Pytorch implementation of FlowNet by Dosovitskiy et al.

Overview

FlowNetPytorch

Pytorch implementation of FlowNet by Dosovitskiy et al.

This repository is a torch implementation of FlowNet, by Alexey Dosovitskiy et al. in PyTorch. See Torch implementation here

This code is mainly inspired from official imagenet example. It has not been tested for multiple GPU, but it should work just as in original code.

The code provides a training example, using the flying chair dataset , with data augmentation. An implementation for Scene Flow Datasets may be added in the future.

Two neural network models are currently provided, along with their batch norm variation (experimental) :

  • FlowNetS
  • FlowNetSBN
  • FlowNetC
  • FlowNetCBN

Pretrained Models

Thanks to Kaixhin you can download a pretrained version of FlowNetS (from caffe, not from pytorch) here. This folder also contains trained networks from scratch.

Note on networks loading

Directly feed the downloaded Network to the script, you don't need to uncompress it even if your desktop environment tells you so.

Note on networks from caffe

These networks expect a BGR input (compared to RGB in pytorch). However, BGR order is not very important.

Prerequisite

these modules can be installed with pip

pytorch >= 1.2
tensorboard-pytorch
tensorboardX >= 1.4
spatial-correlation-sampler>=0.2.1
imageio
argparse
path.py

or

pip install -r requirements.txt

Training on Flying Chair Dataset

First, you need to download the the flying chair dataset . It is ~64GB big and we recommend you put it in a SSD Drive.

Default HyperParameters provided in main.py are the same as in the caffe training scripts.

  • Example usage for FlowNetS :
python main.py /path/to/flying_chairs/ -b8 -j8 -a flownets

We recommend you set j (number of data threads) to high if you use DataAugmentation as to avoid data loading to slow the training.

For further help you can type

python main.py -h

Visualizing training

Tensorboard-pytorch is used for logging. To visualize result, simply type

tensorboard --logdir=/path/to/checkoints

Training results

Models can be downloaded here in the pytorch folder.

Models were trained with default options unless specified. Color warping was not used.

Arch learning rate batch size epoch size filename validation EPE
FlowNetS 1e-4 8 2700 flownets_EPE1.951.pth.tar 1.951
FlowNetS BN 1e-3 32 695 flownets_bn_EPE2.459.pth.tar 2.459
FlowNetC 1e-4 8 2700 flownetc_EPE1.766.pth.tar 1.766

Note : FlowNetS BN took longer to train and got worse results. It is strongly advised not to you use it for Flying Chairs dataset.

Validation samples

Prediction are made by FlowNetS.

Exact code for Optical Flow -> Color map can be found here

Input prediction GroundTruth

Running inference on a set of image pairs

If you need to run the network on your images, you can download a pretrained network here and launch the inference script on your folder of image pairs.

Your folder needs to have all the images pairs in the same location, with the name pattern

{image_name}1.{ext}
{image_name}2.{ext}
python3 run_inference.py /path/to/images/folder /path/to/pretrained

As for the main.py script, a help menu is available for additional options.

Note on transform functions

In order to have coherent transformations between inputs and target, we must define new transformations that take both input and target, as a new random variable is defined each time a random transformation is called.

Flow Transformations

To allow data augmentation, we have considered rotation and translations for inputs and their result on target flow Map. Here is a set of things to take care of in order to achieve a proper data augmentation

The Flow Map is directly linked to img1

If you apply a transformation on img1, you have to apply the very same to Flow Map, to get coherent origin points for flow.

Translation between img1 and img2

Given a translation (tx,ty) applied on img2, we will have

flow[:,:,0] += tx
flow[:,:,1] += ty

Scale

A scale applied on both img1 and img2 with a zoom parameters alpha multiplies the flow by the same amount

flow *= alpha

Rotation applied on both images

A rotation applied on both images by an angle theta also rotates flow vectors (flow[i,j]) by the same angle

\for_all i,j flow[i,j] = rotate(flow[i,j], theta)

rotate: x,y,theta ->  (x*cos(theta)-x*sin(theta), y*cos(theta), x*sin(theta))

Rotation applied on img2

Let us consider a rotation by the angle theta from the image center.

We must tranform each flow vector based on the coordinates where it lands. On each coordinate (i, j), we have:

flow[i, j, 0] += (cos(theta) - 1) * (j  - w/2 + flow[i, j, 0]) +    sin(theta)    * (i - h/2 + flow[i, j, 1])
flow[i, j, 1] +=   -sin(theta)    * (j  - w/2 + flow[i, j, 0]) + (cos(theta) - 1) * (i - h/2 + flow[i, j, 1])
Owner
Clément Pinard
PhD ENSTA Paris, Deep Learning Engineer @ ContentSquare
Clément Pinard
Code & Data for Enhancing Photorealism Enhancement

Enhancing Photorealism Enhancement Stephan R. Richter, Hassan Abu AlHaija, Vladlen Koltun Paper | Website (with side-by-side comparisons) | Video (Pap

Intelligent Systems Lab Org 1.1k Dec 31, 2022
Official PyTorch implementation of the paper "TEMOS: Generating diverse human motions from textual descriptions"

TEMOS: TExt to MOtionS Generating diverse human motions from textual descriptions Description Official PyTorch implementation of the paper "TEMOS: Gen

Mathis Petrovich 187 Dec 27, 2022
Efficient Conformer: Progressive Downsampling and Grouped Attention for Automatic Speech Recognition

Efficient Conformer: Progressive Downsampling and Grouped Attention for Automatic Speech Recognition Official implementation of the Efficient Conforme

Maxime Burchi 145 Dec 30, 2022
This repository compare a selfie with images from identity documents and response if the selfie match.

aws-rekognition-facecompare This repository compare a selfie with images from identity documents and response if the selfie match. This code was made

1 Jan 27, 2022
Segmentation Training Pipeline

Segmentation Training Pipeline This package is a part of Musket ML framework. Reasons to use Segmentation Pipeline Segmentation Pipeline was developed

Musket ML 52 Dec 12, 2022
This program creates a formatted excel file which highlights the undervalued stock according to Graham's number.

Over-and-Undervalued-Stocks Of Nepse Using Graham's Number Scrap the latest data using different websites and creates a formatted excel file that high

6 May 03, 2022
Multiple custom object count and detection using YOLOv3-Tiny method

Electronic-Component-YOLOv3 Introduce This project created to detect, count, and recognize multiple custom object using YOLOv3-Tiny method. The target

Derwin Mahardika 2 Nov 14, 2022
O2O-Afford: Annotation-Free Large-Scale Object-Object Affordance Learning (CoRL 2021)

O2O-Afford: Annotation-Free Large-Scale Object-Object Affordance Learning Object-object Interaction Affordance Learning. For a given object-object int

Kaichun Mo 26 Nov 04, 2022
Implementation of EMNLP 2017 Paper "Natural Language Does Not Emerge 'Naturally' in Multi-Agent Dialog" using PyTorch and ParlAI

Language Emergence in Multi Agent Dialog Code for the Paper Natural Language Does Not Emerge 'Naturally' in Multi-Agent Dialog Satwik Kottur, José M.

Karan Desai 105 Nov 25, 2022
Self-Supervised Speech Pre-training and Representation Learning Toolkit.

What's New Sep 2021: We host a challenge in AAAI workshop: The 2nd Self-supervised Learning for Audio and Speech Processing! See SUPERB official site

s3prl 1.6k Jan 08, 2023
Image-based Navigation in Real-World Environments via Multiple Mid-level Representations: Fusion Models Benchmark and Efficient Evaluation

Image-based Navigation in Real-World Environments via Multiple Mid-level Representations: Fusion Models Benchmark and Efficient Evaluation This reposi

First Person Vision @ Image Processing Laboratory - University of Catania 1 Aug 21, 2022
Pytorch0.4.1 codes for InsightFace

InsightFace_Pytorch Pytorch0.4.1 codes for InsightFace 1. Intro This repo is a reimplementation of Arcface(paper), or Insightface(github) For models,

1.5k Jan 01, 2023
Taming Transformers for High-Resolution Image Synthesis

Taming Transformers for High-Resolution Image Synthesis CVPR 2021 (Oral) Taming Transformers for High-Resolution Image Synthesis Patrick Esser*, Robin

CompVis Heidelberg 3.5k Jan 03, 2023
The original implementation of TNDM used in the NeurIPS 2021 paper (no longer being updated)

TNDM - Targeted Neural Dynamical Modeling Note: This code is no longer being updated. The official re-implementation can be found at: https://github.c

1 Jul 21, 2022
Resources related to EMNLP 2021 paper "FAME: Feature-Based Adversarial Meta-Embeddings for Robust Input Representations"

FAME: Feature-based Adversarial Meta-Embeddings This is the companion code for the experiments reported in the paper "FAME: Feature-Based Adversarial

Bosch Research 11 Nov 27, 2022
This is a collection of our NAS and Vision Transformer work.

AutoML - Neural Architecture Search This is a collection of our AutoML-NAS work iRPE (NEW): Rethinking and Improving Relative Position Encoding for Vi

Microsoft 832 Jan 08, 2023
Submodular Subset Selection for Active Domain Adaptation (ICCV 2021)

S3VAADA: Submodular Subset Selection for Virtual Adversarial Active Domain Adaptation ICCV 2021 Harsh Rangwani, Arihant Jain*, Sumukh K Aithal*, R. Ve

Video Analytics Lab -- IISc 13 Dec 28, 2022
FluxTraining.jl gives you an endlessly extensible training loop for deep learning

A flexible neural net training library inspired by fast.ai

86 Dec 31, 2022
The code from the paper Character Transformations for Non-Autoregressive GEC Tagging

Character Transformations for Non-Autoregressive GEC Tagging Milan Straka, Jakub Náplava, Jana Straková Charles University Faculty of Mathematics and

ÚFAL 5 Dec 10, 2022