Instance-wise Occlusion and Depth Orders in Natural Scenes (CVPR 2022)

Last update: Dec 27, 2022

Overview

Official source code for the paper, which appears at CVPR 2022.

This repository provides a new dataset, named InstaOrder, that can be used to understand the geometric relationships of instances in an image. The dataset consists of 2.9M annotations of geometric orderings for class-labeled instances in 101K natural scenes. The scenes were annotated by 3,659 crowd-workers with (1) an occlusion order, which identifies the occluder and the occludee, and (2) a depth order, which describes ordinal relations based on relative distance from the camera. This repository also introduces a geometric order prediction network called InstaOrderNet, which the paper reports to outperform state-of-the-art approaches.
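To make the idea of an ordinal depth relation concrete, the sketch below sorts instances from nearest to farthest given pairwise "closer than" relations. The pair encoding and the function name `nearest_first` are illustrative assumptions; they are not the InstaOrder annotation format or part of this repository.

```python
from collections import defaultdict, deque


def nearest_first(instances, closer_pairs):
    """Order instances nearest-first from pairwise (closer, farther) relations.

    Hypothetical encoding: each pair (a, b) means "a is closer to the camera
    than b". Raises ValueError if the relations are cyclic.
    """
    farther = defaultdict(list)
    indegree = {i: 0 for i in instances}
    for a, b in closer_pairs:
        farther[a].append(b)
        indegree[b] += 1

    # Kahn's algorithm: repeatedly emit instances with nothing in front of them.
    queue = deque(i for i in instances if indegree[i] == 0)
    order = []
    while queue:
        cur = queue.popleft()
        order.append(cur)
        for nxt in farther[cur]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)

    if len(order) != len(instances):
        raise ValueError("depth relations contain a cycle")
    return order


print(nearest_first(["person", "car", "tree"],
                    [("person", "car"), ("car", "tree")]))
```

Pairwise relations only define a partial order, so instances that are never compared may appear in any relative position.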

Installation

This code has been developed under Anaconda (Python 3.6), PyTorch 1.7.1, torchvision 0.8.2, and CUDA 10.1. Please set up the environment as follows:

# build conda environment
conda create --name order python=3.6
conda activate order

# install requirements
pip install -r requirements.txt

# install COCO API
pip install 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'
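After installation, it can help to confirm that the installed packages match the versions listed above. The following is a minimal sketch, not part of the repository; the helper names `installed_versions` and `version_mismatches` are hypothetical.

```python
import importlib

# Versions stated in this README; the check itself is only a convenience sketch.
EXPECTED = {"torch": "1.7.1", "torchvision": "0.8.2"}


def installed_versions(packages):
    """Return {package: version or None} by importing each package."""
    found = {}
    for name in packages:
        try:
            module = importlib.import_module(name)
            found[name] = getattr(module, "__version__", None)
        except ImportError:
            found[name] = None
    return found


def version_mismatches(expected, found):
    """List (package, expected, found) where the installed version differs.

    Local build suffixes such as "+cu101" are ignored in the comparison.
    """
    problems = []
    for name, want in expected.items():
        have = found.get(name)
        base = str(have).split("+")[0] if have else None
        if base != want:
            problems.append((name, want, have))
    return problems


if __name__ == "__main__":
    found = installed_versions(EXPECTED)
    for name, want, have in version_mismatches(EXPECTED, found):
        print("{}: expected {}, found {}".format(name, want, have))
```

Run inside the `order` environment, the script prints nothing when both packages match.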

Visualization

Open InstaOrder_vis.ipynb to visualize the InstaOrder dataset, including instance masks, occlusion order, and depth order.

Training

The experiments folder contains the train and test scripts for the experiments reported in the paper.

To train {MODEL} with {DATASET}:

Download {DATASET} as described in the Datasets section.
Set ${base_dir} correctly in experiments/{DATASET}/{MODEL}/config.yaml.
(Optional) To train InstaDepthNet, download the MiDaS-v2.1 checkpoint model-f6b98070.pt and place it under ${base_dir}/data/out/InstaOrder_ckpt.

Run the training script as follows:

sh experiments/{DATASET}/{MODEL}/train.sh

# Example: training InstaOrderNet^o (Table 3 in the main paper) from scratch
sh experiments/InstaOrder/InstaOrderNet_o/train.sh
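When launching several experiments, the script path pattern above can be built programmatically. This is a small sketch under the assumption that it is run against a checkout of this repository; `script_path` and `run_experiment` are hypothetical helpers, not functions the repository provides.

```python
import os
import subprocess


def script_path(dataset, model, mode):
    """Build experiments/{DATASET}/{MODEL}/{train|test}.sh, relative to the repo root."""
    if mode not in ("train", "test"):
        raise ValueError("mode must be 'train' or 'test'")
    return os.path.join("experiments", dataset, model, mode + ".sh")


def run_experiment(dataset, model, mode, repo_root="."):
    """Run the script with sh from the repo root, failing early if it is missing."""
    relative = script_path(dataset, model, mode)
    if not os.path.isfile(os.path.join(repo_root, relative)):
        raise FileNotFoundError(relative)
    # Equivalent to running `sh experiments/{DATASET}/{MODEL}/{mode}.sh` in repo_root.
    return subprocess.run(["sh", relative], cwd=repo_root, check=True)


print(script_path("InstaOrder", "InstaOrderNet_o", "train"))
```

Checking for the file first gives a clearer error than a failed shell call when a {DATASET}/{MODEL} combination has no script.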

Inference

Download the pretrained models, InstaOrder_ckpt.zip (3.5 GB), and unzip them into the structure below. Pretrained models are named {DATASET}_{MODEL}.pth.tar.

${base_dir}
|--data
|    |--out
|    |    |--InstaOrder_ckpt
|    |    |    |--COCOA_InstaOrderNet_o.pth.tar
|    |    |    |--COCOA_OrderNet.pth.tar
|    |    |    |--COCOA_pcnet_m.pth.tar
|    |    |    |--InstaOrder_InstaDepthNet_d.pth.tar
|    |    |    |--InstaOrder_InstaDepthNet_od.pth.tar
|    |    |    |--InstaOrder_InstaOrderNet_d.pth.tar
|    |    |    |--InstaOrder_InstaOrderNet_o.pth.tar
|    |    |    |--InstaOrder_InstaOrderNet_od.pth.tar
|    |    |    |--InstaOrder_OrderNet.pth.tar
|    |    |    |--InstaOrder_OrderNet_ext.pth.tar  
|    |    |    |--InstaOrder_pcnet_m.pth.tar
|    |    |    |--KINS_InstaOrderNet_o.pth.tar
|    |    |    |--KINS_OrderNet.pth.tar
|    |    |    |--KINS_pcnet_m.pth.tar
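The naming rule and directory layout above can be expressed as two small helpers. This is an illustrative sketch; `checkpoint_path` and `split_checkpoint_name` are hypothetical names, and `/path/to/base_dir` stands in for your own ${base_dir}.

```python
import os


def checkpoint_path(base_dir, dataset, model):
    """Path of a pretrained model named {DATASET}_{MODEL}.pth.tar."""
    filename = "{}_{}.pth.tar".format(dataset, model)
    return os.path.join(base_dir, "data", "out", "InstaOrder_ckpt", filename)


def split_checkpoint_name(filename):
    """Split '{DATASET}_{MODEL}.pth.tar' into (dataset, model).

    The dataset names in the listing contain no underscore, while model names
    may (e.g. InstaOrderNet_o), so only the first underscore separates them.
    """
    suffix = ".pth.tar"
    if not filename.endswith(suffix):
        raise ValueError("not a checkpoint name: " + filename)
    stem = filename[: -len(suffix)]
    dataset, sep, model = stem.partition("_")
    if not sep or not model:
        raise ValueError("expected {DATASET}_{MODEL}: " + filename)
    return dataset, model


print(checkpoint_path("/path/to/base_dir", "InstaOrder", "InstaOrderNet_o"))
print(split_checkpoint_name("KINS_pcnet_m.pth.tar"))
```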

(Optional) To test InstaDepthNet, download the MiDaS-v2.1 checkpoint model-f6b98070.pt and place it under ${base_dir}/data/out/InstaOrder_ckpt.
Set ${base_dir} correctly in experiments/{DATASET}/{MODEL}/config.yaml.

To test {MODEL} with {DATASET}, run the test script as follows:

sh experiments/{DATASET}/{MODEL}/test.sh

# Example: reproducing the accuracy of InstaOrderNet^o (Table 3 in the main paper)
sh experiments/InstaOrder/InstaOrderNet_o/test.sh

Datasets

InstaOrder dataset

To use InstaOrder, download the files and arrange them in the structure below:

${base_dir}
|--data
|    |--COCO
|    |    |--train2017/
|    |    |--val2017/
|    |    |--annotations/
|    |    |    |--instances_train2017.json
|    |    |    |--instances_val2017.json
|    |    |    |--InstaOrder_train2017.json
|    |    |    |--InstaOrder_val2017.json
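This section does not describe the schema of the InstaOrder annotation files, so rather than assume field names, the sketch below only reports the top-level structure of a JSON file. The helper `summarize_json` is hypothetical and loads the whole file into memory, which may be slow for the larger annotation files.

```python
import json


def describe(value):
    """Short description of a JSON value: its type, plus size for containers."""
    if isinstance(value, (list, dict)):
        return "{} with {} items".format(type(value).__name__, len(value))
    return type(value).__name__


def summarize_json(path):
    """Return {top-level key: description} for a JSON file.

    If the root is not an object, a single "<root>" entry is returned.
    """
    with open(path, "r") as f:
        data = json.load(f)
    if isinstance(data, dict):
        return {key: describe(value) for key, value in data.items()}
    return {"<root>": describe(data)}


if __name__ == "__main__":
    import sys

    # Usage: python summarize.py path/to/InstaOrder_val2017.json
    if len(sys.argv) > 1:
        for key, desc in summarize_json(sys.argv[1]).items():
            print("{}: {}".format(key, desc))
```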

COCOA dataset

To use COCOA, download the files and arrange them in the structure below:

${base_dir}
|--data
|    |--COCO
|    |    |--train2014/
|    |    |--val2014/
|    |    |--annotations/
|    |    |    |--COCO_amodal_train2014.json 
|    |    |    |--COCO_amodal_val2014.json

KINS dataset

To use KINS, download the files from the KINS dataset and arrange them in the structure below:

${base_dir}
|--data
|    |--KINS
|    |    |--training/
|    |    |--testing/
|    |    |--instances_val.json
|    |    |--instances_train.json

DIW dataset

To use DIW, download the files from the DIW dataset and arrange them in the structure below:

${base_dir}
|--data
|    |--DIW
|    |    |--DIW_test/
|    |    |--DIW_Annotations
|    |    |    |--DIW_test.csv
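Before training or testing, the directory listings above can be checked mechanically. The sketch below copies the expected paths from those listings; `LAYOUTS` and `missing_paths` are hypothetical names and are not part of the repository.

```python
import os

# Expected paths relative to ${base_dir}, copied from the listings above.
LAYOUTS = {
    "InstaOrder": [
        "data/COCO/train2017",
        "data/COCO/val2017",
        "data/COCO/annotations/instances_train2017.json",
        "data/COCO/annotations/instances_val2017.json",
        "data/COCO/annotations/InstaOrder_train2017.json",
        "data/COCO/annotations/InstaOrder_val2017.json",
    ],
    "COCOA": [
        "data/COCO/train2014",
        "data/COCO/val2014",
        "data/COCO/annotations/COCO_amodal_train2014.json",
        "data/COCO/annotations/COCO_amodal_val2014.json",
    ],
    "KINS": [
        "data/KINS/training",
        "data/KINS/testing",
        "data/KINS/instances_val.json",
        "data/KINS/instances_train.json",
    ],
    "DIW": [
        "data/DIW/DIW_test",
        "data/DIW/DIW_Annotations/DIW_test.csv",
    ],
}


def missing_paths(base_dir, dataset):
    """Return the expected paths for `dataset` that do not exist under base_dir."""
    return [
        rel for rel in LAYOUTS[dataset]
        if not os.path.exists(os.path.join(base_dir, *rel.split("/")))
    ]


if __name__ == "__main__":
    for name in LAYOUTS:
        # "." is a placeholder; pass your own ${base_dir} here.
        print(name, missing_paths(".", name))
```

An empty list for a dataset means every path in its listing was found; the check does not validate file contents.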

Citing InstaOrder

If you find this code or data useful in your research, please cite our paper:

@inproceedings{lee2022instaorder,
  title={{Instance-wise Occlusion and Depth Orders in Natural Scenes}},
  author={Hyunmin Lee and Jaesik Park},
  booktitle={Proceedings of the {IEEE} Conference on Computer Vision and Pattern Recognition},
  year={2022}
}

Acknowledgement

We have referred to and borrowed from the implementations by Xiaohang Zhan.
