Code for "Learning to Segment Rigid Motions from Two Frames".

Last update: Nov 21, 2022

Overview

rigidmask

Code for "Learning to Segment Rigid Motions from Two Frames".

** This is a partial release with inference and evaluation code. The project is still being tested and documented. There might be implemention changes in the future release. Thanks for your interest.

Visuals on Sintel/KITTI/Coral (not temporally smoothed):

If you find this work useful, please consider citing:

@article{yang2021rigidmask,
  title={Learning to Segment Rigid Motions from Two Frames},
  author={Yang, Gengshan and Ramanan, Deva},
  journal={arXiv preprint arXiv:2101.03694},
  year={2021}
}

Data and precomputed results

Download

Additional inputs (coral reef images) and precomputed results are hosted on google drive. Run (assuming you have installed gdown)

gdown https://drive.google.com/uc?id=1Up2cPCjzd_HGafw1AB2ijGmiKqaX5KTi -O ./input.tar.gz
gdown https://drive.google.com/uc?id=12C7rl5xS66NpmvtTfikr_2HWL5SakLVY -O ./rigidmask-sf-precomputed.zip
tar -xzvf ./input.tar.gz 
unzip ./rigidmask-sf-precomputed.zip -d precomputed/

To compute the results in Tab.1, Tab.2 on KITTI,

modelname=rigidmask-sf
python eval/eval_seg.py  --path precomputed/$modelname/  --dataset 2015
python eval/eval_sf.py   --path precomputed/$modelname/  --dataset 2015

Install

The code is tested with python 3.8, pytorch 1.7.0, and CUDA 10.2. Install dependencies by

conda env create -f rigidmask.yml
conda activate rigidmask_v0
pip install kornia
python -m pip install detectron2 -f \
  https://dl.fbaipublicfiles.com/detectron2/wheels/cu102/torch1.7/index.html

Compile DCNv2 and ngransac.

cd models/networks/DCNv2/; python setup.py install; cd -
cd models/ngransac/; python setup.py install; cd -

Pretrained models

Download pre-trained models to ./weights (assuming gdown is installed),

mkdir weights
mkdir weights/rigidmask-sf
mkdir weights/rigidmask-kitti
gdown https://drive.google.com/uc?id=1H2khr5nI4BrcrYMBZVxXjRBQYBcgSOkh -O ./weights/rigidmask-sf/weights.pth
gdown https://drive.google.com/uc?id=1sbu6zVeiiK1Ra1vp_ioyy1GCv_Om_WqY -O ./weights/rigidmask-kitti/weights.pth

modelname	training set	flow model	flow err. (K:Fl-err/EPE)	motion-in-depth err. (K:1e4)	seg. acc. (K:obj/K:bg/S:bg)
rigidmask-sf (mono)	SF	C+SF+V	10.9%/3.128px	120.4	90.71%/97.05%/86.72%
rigidmask-kitti (stereo)	SF+KITTI	C+SF+V->KITTI	4.1%/1.155px	49.7	95.58%/98.91%/-

** C: FlythingChairs, SF(SceneFlow including FlyingThings, Monkaa, and Driving, K: KITTI scene flow training set, V: VIPER, S: Sintel.

Inference

Run and visualize rigid segmentation of coral reef video, (pass --refine to turn on rigid motion refinement). Results will be saved at ./weights/$modelname/seq/ and a output-seg.gif file will be generated in the current folder.

modelname=rigidmask-sf
CUDA_VISIBLE_DEVICES=1 python submission.py --dataset seq-coral --datapath input/imgs/coral/   --outdir ./weights/$modelname/ --loadmodel ./weights/$modelname/weights.pth --testres 1
python eval/generate_visual.py --datapath weights/$modelname/seq-coral/ --imgpath input/imgs/coral

Run and visualize two-view depth estimation on kitti video, a output-depth.gif will be saved to the current folder.

modelname=rigidmask-sf
CUDA_VISIBLE_DEVICES=1 python submission.py --dataset seq-kitti --datapath input/imgs/kitti_2011_09_30_drive_0028_sync_11xx/   --outdir ./weights/$modelname/ --loadmodel ./weights/$modelname/weights.pth --testres 1.2 --refine
python eval/generate_visual.py --datapath weights/$modelname/seq-kitti/ --imgpath input/imgs/kitti_2011_09_30_drive_0028_sync_11xx
python eval/render_scene.py --inpath weights/rigidmask-sf/seq-kitti/pc0-0000001110.ply

Run and evaluate kitti-sceneflow (monocular setup, Tab. 1 and Tab. 2),

modelname=rigidmask-sf
CUDA_VISIBLE_DEVICES=1 python submission.py --dataset 2015 --datapath path-to-kitti-sceneflow-training   --outdir ./weights/$modelname/ --loadmodel ./weights/$modelname/weights.pth  --testres 1.2 --refine
python eval/eval_seg.py   --path weights/$modelname/  --dataset 2015
python eval/eval_sf.py   --path weights/$modelname/  --dataset 2015

modelname=rigidmask-sf
CUDA_VISIBLE_DEVICES=1 python submission.py --dataset sintel_mrflow_val --datapath path-to-sintel-training   --outdir ./weights/$modelname/ --loadmodel ./weights/$modelname/weights.pth  --testres 1.5 --refine
python eval/eval_seg.py   --path weights/$modelname/  --dataset sintel
python eval/eval_sf.py   --path weights/$modelname/  --dataset sintel

Run and evaluate kitti-sceneflow (stereo setup, Tab. 6),

modelname=rigidmask-kitti
CUDA_VISIBLE_DEVICES=1 python submission.py --dataset 2015 --datapath path-to-kitti-sceneflow-images   --outdir ./weights/$modelname/ --loadmodel ./weights/$modelname/weights.pth  --disp_path input/disp/kittisf-train-hsm-disp/ --fac 2 --maxdisp 512 --refine --sensor stereo
python eval/eval_seg.py   --path weights/$modelname/  --dataset 2015
python eval/eval_sf.py    --path weights/$modelname/  --dataset 2015

To generate results for kitti-sceneflow benchmark (stereo setup, Tab. 3),

modelname=rigidmask-kitti
mkdir ./benchmark_output
CUDA_VISIBLE_DEVICES=1 python submission.py --dataset 2015test --datapath path-to-kitti-sceneflow-images  --outdir ./weights/$modelname/ --loadmodel ./weights/$modelname/weights.pth  --disp_path input/disp/kittisf-test-ganet-disp/ --fac 2 --maxdisp 512 --refine --sensor stereo

Training (todo)

Acknowledge (incomplete)

Source of coral reef video: Beautiful Dive in Philippines Tubbataha Atoll Coral Reef Natural Park & Whale Shark in Sulu Sea HD.
DCNv2
ngransac.
CenterNet
PolarMask

Code for "Learning to Segment Rigid Motions from Two Frames".

Related tags

Overview

rigidmask

Data and precomputed results

Install

Pretrained models

Inference

Training (todo)

Acknowledge (incomplete)

Owner

Gengshan Yang

ICRA 2021 "Towards Precise and Efficient Image Guided Depth Completion"

验证码识别深度学习 tensorflow 神经网络

Notebooks for my "Deep Learning with TensorFlow 2 and Keras" course

Rethinking of Pedestrian Attribute Recognition: A Reliable Evaluation under Zero-Shot Pedestrian Identity Setting

CZU-MHAD: A multimodal dataset for human action recognition utilizing a depth camera and 10 wearable inertial sensors

CityLearn Challenge Multi-Agent Reinforcement Learning for Intelligent Energy Management, 2020, PikaPika team

Weight estimation in CT by multi atlas techniques

PyTorch implementation of the NIPS-17 paper "Poincaré Embeddings for Learning Hierarchical Representations"

Intelligent Video Analytics toolkit based on different inference backends.

Making self-supervised learning work on molecules by using their 3D geometry to pre-train GNNs. Implemented in DGL and Pytorch Geometric.

pyhsmm - library for approximate unsupervised inference in Bayesian Hidden Markov Models (HMMs) and explicit-duration Hidden semi-Markov Models (HSMMs), focusing on the Bayesian Nonparametric extensions, the HDP-HMM and HDP-HSMM, mostly with weak-limit approximations.

Code for KDD'20 "An Efficient Neighborhood-based Interaction Model for Recommendation on Heterogeneous Graph"

Automatically creates genre collections for your Plex media

Dictionary Learning with Uniform Sparse Representations for Anomaly Detection

The source code for the Cutoff data augmentation approach proposed in this paper: "A Simple but Tough-to-Beat Data Augmentation Approach for Natural Language Understanding and Generation".

RNN Predict Street Commercial Vitality

Torch-ngp - A pytorch implementation of the hash encoder proposed in instant-ngp

Class-Balanced Loss Based on Effective Number of Samples. CVPR 2019

In generative deep geometry learning, we often get many obj files remain to be rendered

Multi Task RL Baselines

Code for "Learning to Segment Rigid Motions from Two Frames".

Related tags

Overview

rigidmask

Data and precomputed results

Install

Pretrained models

Inference

Training (todo)

Acknowledge (incomplete)

Owner

Gengshan Yang

ICRA 2021 "Towards Precise and Efficient Image Guided Depth Completion"

验证码识别 深度学习 tensorflow 神经网络

Notebooks for my "Deep Learning with TensorFlow 2 and Keras" course

Rethinking of Pedestrian Attribute Recognition: A Reliable Evaluation under Zero-Shot Pedestrian Identity Setting

CZU-MHAD: A multimodal dataset for human action recognition utilizing a depth camera and 10 wearable inertial sensors

CityLearn Challenge Multi-Agent Reinforcement Learning for Intelligent Energy Management, 2020, PikaPika team

Weight estimation in CT by multi atlas techniques

PyTorch implementation of the NIPS-17 paper "Poincaré Embeddings for Learning Hierarchical Representations"

Intelligent Video Analytics toolkit based on different inference backends.

Making self-supervised learning work on molecules by using their 3D geometry to pre-train GNNs. Implemented in DGL and Pytorch Geometric.

pyhsmm - library for approximate unsupervised inference in Bayesian Hidden Markov Models (HMMs) and explicit-duration Hidden semi-Markov Models (HSMMs), focusing on the Bayesian Nonparametric extensions, the HDP-HMM and HDP-HSMM, mostly with weak-limit approximations.

Code for KDD'20 "An Efficient Neighborhood-based Interaction Model for Recommendation on Heterogeneous Graph"

Automatically creates genre collections for your Plex media

Dictionary Learning with Uniform Sparse Representations for Anomaly Detection

The source code for the Cutoff data augmentation approach proposed in this paper: "A Simple but Tough-to-Beat Data Augmentation Approach for Natural Language Understanding and Generation".

RNN Predict Street Commercial Vitality

Torch-ngp - A pytorch implementation of the hash encoder proposed in instant-ngp

Class-Balanced Loss Based on Effective Number of Samples. CVPR 2019

In generative deep geometry learning, we often get many obj files remain to be rendered

Multi Task RL Baselines

验证码识别深度学习 tensorflow 神经网络