Tensorflow python implementation of "Learning High Fidelity Depths of Dressed Humans by Watching Social Media Dance Videos"

Overview

Learning High Fidelity Depths of Dressed Humans by Watching Social Media Dance Videos

report PWC

This repository is the official tensorflow python implementation of "Learning High Fidelity Depths of Dressed Humans by Watching Social Media Dance Videos" in CVPR 2021 (Oral Presentation) (Best Paper Nominated).

Project Page
TikTok Dataset

Teaser Image

This codebase provides:

  • Inference code
  • Training code
  • Visualization code

Requirements

(This code is tested with tensorflow-gpu 1.14.0, Python 3.7.4, CUDA 10 (version 10.0.130) and cuDNN 7 (version 7.4.2).)

  • numpy
  • imageio
  • matplotlib
  • scikit-image
  • scipy==1.1.0
  • tensorflow-gpu==1.14.0
  • gast==0.2.2
  • Pillow

Installation

Run the following code to install all pip packages:

pip install -r requirements.txt 

In case there is a problem, you can use the following tensorflow docker container "(tensorflow:19.02-py3)":

sudo docker run --gpus all -it --rm -v local_dir:container_dir nvcr.io/nvidia/tensorflow:19.02-py3

Then install the requirements:

pip install -r requirements.txt 

Inference Demo

Input:

The test data dimension should be: 256x256. For any test data you should have 3 .png files: (For an example please take a look at the demo data in "test_data" folder.)

  • name_img.png : The 256x256x3 test image
  • name_mask.png : The 256x256 corresponding mask. You can use any off-the-shelf tools such as removebg to remove the background and get the mask.
  • name_dp.png : The 256x256x3 corresponding DensePose.

Output:

Running the demo generates the following:

  • name.txt : The 256x256 predicted depth
  • name_mesh.obj : The reconstructed mesh. You can use any off-the-shelf tools such as MeshLab to visualize the mesh. Visualization for demo data from different views:

Teaser Image

  • name_normal_1.txt, name_normal_2.txt, name_normal_3.txt : Three 256x256 predicted normal. If you concatenate them in the third axis it will give you the 256x256x3 normal map.
  • name_results.png : visualization of predicted depth heatmap and the predicted normal map. Visualization for demo data:

Teaser Image

Run the demo:

Download the weights from here and extract in the main repository or run this in the main repository:

wget --load-cookies /tmp/cookies.txt "https://docs.google.com/uc?export=download&confirm=$(wget --quiet --save-cookies /tmp/cookies.txt --keep-session-cookies --no-check-certificate 'https://docs.google.com/uc?export=download&id=1UOHkmwcWpwt9r11VzOCa_CVamwHVaobV' -O- | sed -rn 's/.*confirm=([0-9A-Za-z_]+).*/\1\n/p')&id=1UOHkmwcWpwt9r11VzOCa_CVamwHVaobV" -O model.zip && rm -rf /tmp/cookies.txt

unzip model.zip

Run the following python code:

python HDNet_Inference.py

From line 26 to 29 under "test path and outpath" you can choose the input directory (default: './test_data'), ouput directory (default: './test_data/infer_out') and if you want to save the visualization (default: True).

More Results

Teaser Image

Training

To train the network, go to training folder and read the README file

MATLAB Visualization

If you want to generate visualizations similar to those on the website, go to MATLAB_Visualization folder and run

make_video.m

From lines 7 to 14, you can choose the test folder (default: test_data) and the image name to process (default: 0043). This will generate a video of the prediction from different views (default: "test_data/infer_out/video/0043/video.avi") This process will take around 2 minutes to generate 164 angles.

Note that this visualization will always generate a 672 × 512 video, You may want to resize your video accordingly for your own tested data.

Citation

If you find the code or our dataset useful in your research, please consider citing the paper.

@InProceedings{Jafarian_2021_CVPR_TikTok,
    author    = {Jafarian, Yasamin and Park, Hyun Soo},
    title     = {Learning High Fidelity Depths of Dressed Humans by Watching Social Media Dance Videos},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2021},
    pages     = {12753-12762}} 
Owner
Yasamin Jafarian
PhD Candidate at the University of Minnesota.
Yasamin Jafarian
The implementation of the lifelong infinite mixture model

Lifelong infinite mixture model 📋 This is the implementation of the Lifelong infinite mixture model 📋 Accepted by ICCV 2021 Title : Lifelong Infinit

Fei Ye 5 Oct 20, 2022
VISSL is FAIR's library of extensible, modular and scalable components for SOTA Self-Supervised Learning with images.

What's New Below we share, in reverse chronological order, the updates and new releases in VISSL. All VISSL releases are available here. [Oct 2021]: V

Meta Research 2.9k Jan 07, 2023
Simple renderer for use with MuJoCo (>=2.1.2) Python Bindings.

Viewer for MuJoCo in Python Interactive renderer to use with the official Python bindings for MuJoCo. Starting with version 2.1.2, MuJoCo comes with n

Rohan P. Singh 62 Dec 30, 2022
Proximal Backpropagation - a neural network training algorithm that takes implicit instead of explicit gradient steps

Proximal Backpropagation Proximal Backpropagation (ProxProp) is a neural network training algorithm that takes implicit instead of explicit gradient s

Thomas Frerix 40 Dec 17, 2022
The end-to-end platform for building voice products at scale

Picovoice Made in Vancouver, Canada by Picovoice Picovoice is the end-to-end platform for building voice products on your terms. Unlike Alexa and Goog

Picovoice 318 Jan 07, 2023
Code for DeepXML: A Deep Extreme Multi-Label Learning Framework Applied to Short Text Documents

DeepXML Code for DeepXML: A Deep Extreme Multi-Label Learning Framework Applied to Short Text Documents Architectures and algorithms DeepXML supports

Extreme Classification 49 Nov 06, 2022
Official Implementation of DE-CondDETR and DELA-CondDETR in "Towards Data-Efficient Detection Transformers"

DE-DETRs By Wen Wang, Jing Zhang, Yang Cao, Yongliang Shen, and Dacheng Tao This repository is an official implementation of DE-CondDETR and DELA-Cond

Wen Wang 41 Dec 12, 2022
CvT-ASSD: Convolutional vision-Transformerbased Attentive Single Shot MultiBox Detector (ICTAI 2021 CCF-C 会议)The 33rd IEEE International Conference on Tools with Artificial Intelligence

CvT-ASSD including extra CvT, CvT-SSD, VGG-ASSD models original-code-website: https://github.com/albert-jin/CvT-SSD new-code-website: https://github.c

金伟强 -上海大学人工智能小渣渣~ 5 Mar 07, 2022
A Physics-based Noise Formation Model for Extreme Low-light Raw Denoising (CVPR 2020 Oral & TPAMI 2021)

ELD The implementation of CVPR 2020 (Oral) paper "A Physics-based Noise Formation Model for Extreme Low-light Raw Denoising" and its journal (TPAMI) v

Kaixuan Wei 359 Jan 01, 2023
OptaPlanner wrappers for Python. Currently significantly slower than OptaPlanner in Java or Kotlin.

OptaPy is an AI constraint solver for Python to optimize the Vehicle Routing Problem, Employee Rostering, Maintenance Scheduling, Task Assignment, School Timetabling, Cloud Optimization, Conference S

OptaPy 211 Jan 02, 2023
PyTorch implementation of the ACL, 2021 paper Parameter-efficient Multi-task Fine-tuning for Transformers via Shared Hypernetworks.

Parameter-efficient Multi-task Fine-tuning for Transformers via Shared Hypernetworks This repo contains the PyTorch implementation of the ACL, 2021 pa

Rabeeh Karimi Mahabadi 98 Dec 28, 2022
A PyTorch Implementation of Single Shot Scale-invariant Face Detector.

S³FD: Single Shot Scale-invariant Face Detector A PyTorch Implementation of Single Shot Scale-invariant Face Detector. Eval python wider_eval_pytorch.

carwin 235 Jan 07, 2023
Official PyTorch Implementation of Mask-aware IoU and maYOLACT Detector [BMVC2021]

The official implementation of Mask-aware IoU and maYOLACT detector. Our implementation is based on mmdetection. Mask-aware IoU for Anchor Assignment

Kemal Oksuz 46 Sep 29, 2022
Energy consumption estimation utilities for Jetson-based platforms

This repository contains a utility for measuring energy consumption when running various programs in NVIDIA Jetson-based platforms. Currently TX-2, NX, and AGX are supported.

OpenDR 10 Jun 17, 2022
Decensoring Hentai with Deep Neural Networks. Formerly named DeepMindBreak.

DeepCreamPy Decensoring Hentai with Deep Neural Networks. Formerly named DeepMindBreak. A deep learning-based tool to automatically replace censored a

616 Jan 06, 2023
Reducing Information Bottleneck for Weakly Supervised Semantic Segmentation (NeurIPS 2021)

Reducing Information Bottleneck for Weakly Supervised Semantic Segmentation (NeurIPS 2021) The implementation of Reducing Infromation Bottleneck for W

Jungbeom Lee 81 Dec 16, 2022
A pytorch implementation of Reading Wikipedia to Answer Open-Domain Questions.

DrQA A pytorch implementation of the ACL 2017 paper Reading Wikipedia to Answer Open-Domain Questions (DrQA). Reading comprehension is a task to produ

Runqi Yang 394 Nov 08, 2022
LSTMs (Long Short Term Memory) RNN for prediction of price trends

Price Prediction with Recurrent Neural Networks LSTMs BTC-USD price prediction with deep learning algorithm. Artificial Neural Networks specifically L

5 Nov 12, 2021
PyTorch implementation of Barlow Twins.

Barlow Twins: Self-Supervised Learning via Redundancy Reduction PyTorch implementation of Barlow Twins. @article{zbontar2021barlow, title={Barlow Tw

Facebook Research 839 Dec 29, 2022
An index of recommendation algorithms that are based on Graph Neural Networks.

An index of recommendation algorithms that are based on Graph Neural Networks.

FIB LAB, Tsinghua University 564 Jan 07, 2023