[CVPR'2020] DeepDeform: Learning Non-rigid RGB-D Reconstruction with Semi-supervised Data

Overview

DeepDeform (CVPR'2020)

DeepDeform is an RGB-D video dataset containing over 390,000 RGB-D frames in 400 videos, with 5,533 optical and scene flow images and 4,479 foreground object masks. We also provide 149,228 sparse match annotations and 63,512 occlusion point annotations.

Download Data

If you would like to download the DeepDeform data, please fill out this google form and, once accepted, we will send you the link to download the data.

Online Benchmark

If you want to participate in the benchmark(s), you can submit your results at DeepDeform Benchmark website.

Currently we provide benchmarks for the following tasks:

By uploading your results on the test set to the DeepDeform Benchmark website the performance of you method is automatically evaluated on the hidden test labels, and compared to other already evaluated methods. You can decide if you want to make the evaluation results public or not.

If you want to evaluate on validation set, we provide code that is used for evaluation of specific benchmarks in directory evaluation/. To evaluate optical flow or non-rigid reconstruction, you need to adapt FLOW_RESULTS_DIR or RECONSTRUCTION_RESULTS_DIR in config.py to correspond to your results directory (that would be in the same format as for the online submission, described here).

In order to evaluate reconstruction, you need to compile additional C++ modules.

  • Install necessary dependencies:
pip install pybind11
pip install Pillow
pip install plyfile
pip install tqdm
pip install scikit-image
  • Inside the evaluation/csrc adapt includes.py to point to your Eigen include directory.

  • Compile the code by executing the following in evaluation/csrc:

python setup.py install

Data Organization

Data is organized into 3 subsets, train, val, and test directories, using 340-30-30 sequence split. In every subset each RGB-D sequence is stored in a directory <sequence_id>, which follows the following format:

<sequence_id>
|-- <color>: color images for every frame (`%06d.jpg`)
|-- <depth>: depth images for every frame (`%06d.png`)
|-- <mask>: mask images for a few frames (`%06d.png`)
|-- <optical_flow>: optical flow images for a few frame pairs (`<object_id>_<source_id>_<target_id>.oflow` or `%s_%06d_%06d.oflow`)
|-- <scene_flow>: scene flow images for a few frame pairs (`<object_id>_<source_id>_<target_id>.sflow` or `%s_%06d_%06d.sflow`)
|-- <intrinsics.txt>: 4x4 intrinsics matrix

All labels are provided in .json files in root dataset r directory:

  • train_matches.json and val_matches.json:
    Manually annotated sparse matches.
  • train_dense.json and val_dense.json:
    Densely aligned optical and scene flow images with the use of sparse matches as a guidance.
  • train_selfsupervised.json and val_selfsupervised.json:
    Densely aligned optical and scene flow images using self-supervision (DynamicFusion pipeline) for a few sequences. - train_selfsupervised.json and `val_skaldir
  • train_masks.json and val_masks.json:
    Dynamic object annotations for a few frames per sequence.
  • train_occlusions.json and val_occlusions.json:
    Manually annotated sparse occlusions.

Data Formats

We recommend you to test out scripts in demo/ directory in order to check out loading of different file types.

RGB-D Data: 3D data is provided as RGB-D video sequences, where color and depth images are already aligned. Color images are provided as 8-bit RGB .jpg, and depth images as 16-bit .png (divide by 1000 to obtain depth in meters).

Camera Parameters: A 4x4 intrinsic matrix is given for every sequence (because different cameras were used for data capture, every sequence can have different intrinsic matrix). Since the color and depth images are aligned, no extrinsic transformation is necessary.

Optical Flow Data: Dense optical flow data is provided as custom binary image of resolution 640x480 with extension .oflow. Every pixel contains two values for flow in x and y direction, in pixels. Helper function to load/store binary flow images is provided in utils.py.

Scene Flow Data: Dense scene flow data is provided as custom binary image of resolution 640x480 with extension .sflow. Every pixel contains 3 values for flow in x, y and z direction, in meters. Helper function to load/store binary flow images is provided in utils.py.

Object Mask Data: A few frames per sequences also include foreground dynamic object annotation. The mask image is given as 16-bit .png image (1 for object, 0 for background).

Sparse Match Annotations: We provide manual sparse match annotations for a few frame pairs for every sequence. They are stored in .json format, with paths to corresponding source and target RGB-D frames, as a list of source and target pixels.

Sparse Occlusion Annotations: We provide manual sparse occlusion annotations for a few frame pairs for every sequence. They are stored in .json format, with paths to corresponding source and target RGB-D frames, as a list of occluded pixels in source frame.

Citation

If you use DeepDeform data or code please cite:

@inproceedings{bozic2020deepdeform, 
    title={DeepDeform: Learning Non-rigid RGB-D Reconstruction with Semi-supervised Data}, 
    author={Bo{\v{z}}i{\v{c}}, Alja{\v{z}} and Zollh{\"o}fer, Michael and Theobalt, Christian and Nie{\ss}ner, Matthias}, 
    journal={Conference on Computer Vision and Pattern Recognition (CVPR)}, 
    year={2020}
}

Help

If you have any questions, please contact us at [email protected], or open an issue at Github.

License

The data is released under DeepDeform Terms of Use, and the code is release under a non-comercial creative commons license.

Owner
Aljaz Bozic
PhD Student at Visual Computing Group
Aljaz Bozic
ARAE-Tensorflow for Discrete Sequences (Adversarially Regularized Autoencoder)

ARAE Tensorflow Code Code for the paper Adversarially Regularized Autoencoders for Generating Discrete Structures by Zhao, Kim, Zhang, Rush and LeCun

19 Nov 12, 2021
Bayesian Meta-Learning Through Variational Gaussian Processes

vmgp This is the repository of Vivek Myers and Nikhil Sardana for our CS 330 final project, Bayesian Meta-Learning Through Variational Gaussian Proces

Vivek Myers 2 Nov 17, 2022
「PyTorch Implementation of AnimeGANv2」を用いて、生成した顔画像を元の画像に上書きするデモ

AnimeGANv2-Face-Overlay-Demo PyTorch Implementation of AnimeGANv2を用いて、生成した顔画像を元の画像に上書きするデモです。

KazuhitoTakahashi 21 Oct 18, 2022
An official reimplementation of the method described in the INTERSPEECH 2021 paper - Speech Resynthesis from Discrete Disentangled Self-Supervised Representations.

Speech Resynthesis from Discrete Disentangled Self-Supervised Representations Implementation of the method described in the Speech Resynthesis from Di

Facebook Research 253 Jan 06, 2023
PClean: A Domain-Specific Probabilistic Programming Language for Bayesian Data Cleaning

PClean: A Domain-Specific Probabilistic Programming Language for Bayesian Data Cleaning Warning: This is a rapidly evolving research prototype.

MIT Probabilistic Computing Project 190 Dec 27, 2022
Plotting points that lie on the intersection of the given curves using gradient descent.

Plotting intersection of curves using gradient descent Webapp Link --- What's the app about Why this app Plotting functions and their intersection. A

Divakar Verma 2 Jan 09, 2022
Pydantic models for pywttr and aiopywttr.

Pydantic models for pywttr and aiopywttr.

Almaz 2 Dec 08, 2022
Official code for: A Probabilistic Hard Attention Model For Sequentially Observed Scenes

"A Probabilistic Hard Attention Model For Sequentially Observed Scenes" Authors: Samrudhdhi Rangrej, James Clark Accepted to: BMVC'21 A recurrent atte

5 Nov 19, 2022
Spatial Contrastive Learning for Few-Shot Classification (SCL)

This repo contains the official implementation of Spatial Contrastive Learning for Few-Shot Classification (SCL), which presents of a novel contrastive learning method applied to few-shot image class

Yassine 34 Dec 25, 2022
The official repo for OC-SORT: Observation-Centric SORT on video Multi-Object Tracking. OC-SORT is simple, online and robust to occlusion/non-linear motion.

OC-SORT Observation-Centric SORT (OC-SORT) is a pure motion-model-based multi-object tracker. It aims to improve tracking robustness in crowded scenes

Jinkun Cao 325 Jan 05, 2023
A model to classify a piece of news as REAL or FAKE

Fake_news_classification A model to classify a piece of news as REAL or FAKE. This python project of detecting fake news deals with fake and real news

Gokul Stark 1 Jan 29, 2022
Extending JAX with custom C++ and CUDA code

Extending JAX with custom C++ and CUDA code This repository is meant as a tutorial demonstrating the infrastructure required to provide custom ops in

Dan Foreman-Mackey 237 Dec 23, 2022
A python/pytorch utility library

A python/pytorch utility library

Jiaqi Gu 5 Dec 02, 2022
A Vision Transformer approach that uses concatenated query and reference images to learn the relationship between query and reference images directly.

A Vision Transformer approach that uses concatenated query and reference images to learn the relationship between query and reference images directly.

24 Dec 13, 2022
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis

HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis Jungil Kong, Jaehyeon Kim, Jaekyoung Bae In our paper, we p

Rishikesh (ऋषिकेश) 31 Dec 08, 2022
Official repository for the paper "Going Beyond Linear Transformers with Recurrent Fast Weight Programmers"

Recurrent Fast Weight Programmers This is the official repository containing the code we used to produce the experimental results reported in the pape

IDSIA 36 Nov 15, 2022
True Few-Shot Learning with Language Models

This codebase supports using language models (LMs) for true few-shot learning: learning to perform a task using a limited number of examples from a single task distribution.

Ethan Perez 124 Jan 04, 2023
PyTorch implementation of Towards Accurate Alignment in Real-time 3D Hand-Mesh Reconstruction (ICCV 2021).

Towards Accurate Alignment in Real-time 3D Hand-Mesh Reconstruction Introduction This is official PyTorch implementation of Towards Accurate Alignment

TANG Xiao 96 Dec 27, 2022
Fast EMD for Python: a wrapper for Pele and Werman's C++ implementation of the Earth Mover's Distance metric

PyEMD: Fast EMD for Python PyEMD is a Python wrapper for Ofir Pele and Michael Werman's implementation of the Earth Mover's Distance that allows it to

William Mayner 433 Dec 31, 2022