Code for "Layered Neural Rendering for Retiming People in Video."

Overview

Layered Neural Rendering in PyTorch

This repository contains training code for the examples in the SIGGRAPH Asia 2020 paper "Layered Neural Rendering for Retiming People in Video."

This is not an officially supported Google product.

Prerequisites

  • Linux
  • Python 3.6+
  • NVIDIA GPU + CUDA CuDNN

Installation

This code has been tested with PyTorch 1.4 and Python 3.8.

  • Install PyTorch 1.4 and other dependencies.
    • For pip users, please type the command pip install -r requirements.txt.
    • For Conda users, you can create a new Conda environment using conda env create -f environment.yml.

Data Processing

  • Download the data for a video used in our paper (e.g. "reflection"):
bash ./datasets/download_data.sh reflection
  • Or alternatively, download all the data by specifying all.
  • Download the pretrained keypoint-to-UV model weights:
bash ./scripts/download_kp2uv_model.sh

The pretrained model will be saved at ./checkpoints/kp2uv/latest_net_Kp2uv.pth.

  • Generate the UV maps from the keypoints:
bash datasets/prepare_iuv.sh ./datasets/reflection

Training

  • To train a model on a video (e.g. "reflection"), run:
python train.py --name reflection --dataroot ./datasets/reflection --gpu_ids 0,1
  • To view training results and loss plots, visit the URL http://localhost:8097. Intermediate results are also at ./checkpoints/reflection/web/index.html.

You can find more scripts in the scripts directory, e.g. run_${VIDEO}.sh which combines data processing, training, and saving layer results for a video.

Note:

  • It is recommended to use >=2 GPUs, each with >=16GB memory.
  • The training script first trains the low-resolution model for --num_epochs at --batch_size, and then trains the upsampling module for --num_epochs_upsample at --batch_size_upsample. If you do not need the upsampled result, pass --num_epochs_upsample 0.
  • Training the upsampling module requires ~2.5x memory as the low-resolution model, so set batch_size_upsample accordingly. The provided scripts set the batch sizes appropriately for 2 GPUs with 16GB memory.
  • GPU memory scales linearly with the number of layers.

Saving layer results from a trained model

  • Run the trained model:
python test.py --name reflection --dataroot ./datasets/reflection --do_upsampling
  • The results (RGBA layers, videos) will be saved to ./results/reflection/test_latest/.
  • Passing --do_upsampling uses the results of the upsampling module. If the upsampling module hasn't been trained (num_epochs_upsample=0), then remove this flag.

Custom video

To train on your own video, you will have to preprocess the data:

  1. Extract the frames, e.g.
    mkdir ./datasets/my_video && cd ./datasets/my_video 
    mkdir rgb && ffmpeg -i video.mp4 rgb/%04d.png
    
  2. Resize the video to 256x448 and save the frames in my_video/rgb_256, and resize the video to 512x896 and save in my_video/rgb_512.
  3. Run AlphaPose and Pose Tracking on the frames. Save results as my_video/keypoints.json
  4. Create my_video/metadata.json following these instructions.
  5. If your video has camera motion, either (1) stabilize the video, or (2) maintain the camera motion by computing homographies and saving as my_video/homographies.txt. See scripts/run_cartwheel.sh for a training example with camera motion, and see ./datasets/cartwheel/homographies.txt for formatting.

Note: Videos that are suitable for our method have the following attributes:

  • Static camera or limited camera motion that can be represented with a homography.
  • Limited number of people, due to GPU memory limitations. We tested up to 7 people and 7 layers. Multiple people can be grouped onto the same layer, though they cannot be individually retimed.
  • People that move relative to the background (static people will be absorbed into the background layer).
  • We tested a video length of up to 200 frames (~7 seconds).

Citation

If you use this code for your research, please cite the following paper:

@inproceedings{lu2020,
  title={Layered Neural Rendering for Retiming People in Video},
  author={Lu, Erika and Cole, Forrester and Dekel, Tali and Xie, Weidi and Zisserman, Andrew and Salesin, David and Freeman, William T and Rubinstein, Michael},
  booktitle={SIGGRAPH Asia},
  year={2020}
}

Acknowledgments

This code is based on pytorch-CycleGAN-and-pix2pix.

Owner
Google
Google ❤️ Open Source
Google
Sketch-Based 3D Exploration with Stacked Generative Adversarial Networks

pix2vox [Demonstration video] Sketch-Based 3D Exploration with Stacked Generative Adversarial Networks. Generated samples Single-category generation M

Takumi Moriya 232 Nov 14, 2022
Dataset Condensation with Contrastive Signals

Dataset Condensation with Contrastive Signals This repository is the official implementation of Dataset Condensation with Contrastive Signals (DCC). T

3 May 19, 2022
Impelmentation for paper Feature Generation and Hypothesis Verification for Reliable Face Anti-Spoofing

FGHV Impelmentation for paper Feature Generation and Hypothesis Verification for Reliable Face Anti-Spoofing Requirements Python 3.6 Pytorch 1.5.0 Cud

5 Jun 02, 2022
Dataset para entrenamiento de yoloV3 para 4 clases

Deteccion de objetos en video Este repo basado en el proyecto PyTorch YOLOv3 para correr detección de objetos sobre video. Construí sobre este proyect

1 Nov 01, 2021
Official repo for BMVC2021 paper ASFormer: Transformer for Action Segmentation

ASFormer: Transformer for Action Segmentation This repo provides training & inference code for BMVC 2021 paper: ASFormer: Transformer for Action Segme

42 Dec 23, 2022
End-to-End Dense Video Captioning with Parallel Decoding (ICCV 2021)

PDVC Official implementation for End-to-End Dense Video Captioning with Parallel Decoding (ICCV 2021) [paper] [valse论文速递(Chinese)] This repo supports:

Teng Wang 118 Dec 16, 2022
A Benchmark For Measuring Systematic Generalization of Multi-Hierarchical Reasoning

Orchard Dataset This repository contains the code used for generating the Orchard Dataset, as seen in the Multi-Hierarchical Reasoning in Sequences: S

Bill Pung 1 Jun 05, 2022
A PoC Corporation Relationship Knowledge Graph System on top of Nebula Graph.

Corp-Rel is a PoC of Corpartion Relationship Knowledge Graph System. It's built on top of the Open Source Graph Database: Nebula Graph with a dataset

Wey Gu 20 Dec 11, 2022
Ray tracing of a Schwarzschild black hole written entirely in TensorFlow.

TensorGeodesic Ray tracing of a Schwarzschild black hole written entirely in TensorFlow. Dependencies: Python 3 TensorFlow 2.x numpy matplotlib About

5 Jan 15, 2022
Code for the paper "Next Generation Reservoir Computing"

Next Generation Reservoir Computing This is the code for the results and figures in our paper "Next Generation Reservoir Computing". They are written

OSU QuantInfo Lab 105 Dec 20, 2022
DenseCLIP: Language-Guided Dense Prediction with Context-Aware Prompting

DenseCLIP: Language-Guided Dense Prediction with Context-Aware Prompting Created by Yongming Rao*, Wenliang Zhao*, Guangyi Chen, Yansong Tang, Zheng Z

Yongming Rao 322 Dec 31, 2022
A Pytorch Implementation of a continuously rate adjustable learned image compression framework.

GainedVAE A Pytorch Implementation of a continuously rate adjustable learned image compression framework, Gained Variational Autoencoder(GainedVAE). N

39 Dec 24, 2022
This PyTorch package implements MoEBERT: from BERT to Mixture-of-Experts via Importance-Guided Adaptation (NAACL 2022).

MoEBERT This PyTorch package implements MoEBERT: from BERT to Mixture-of-Experts via Importance-Guided Adaptation (NAACL 2022). Installation Create an

Simiao Zuo 34 Dec 24, 2022
[NeurIPS 2021] ORL: Unsupervised Object-Level Representation Learning from Scene Images

Unsupervised Object-Level Representation Learning from Scene Images This repository contains the official PyTorch implementation of the ORL algorithm

Jiahao Xie 55 Dec 03, 2022
A short and easy PyTorch implementation of E(n) Equivariant Graph Neural Networks

Simple implementation of Equivariant GNN A short implementation of E(n) Equivariant Graph Neural Networks for HOMO energy prediction. Just 50 lines of

Arsenii Senya Ashukha 97 Dec 23, 2022
Demo notebooks for Qiskit application modules demo sessions (Oct 8 & 15):

qiskit-application-modules-demo-sessions This repo hosts demo notebooks for the Qiskit application modules demo sessions hosted on Qiskit YouTube. Par

Qiskit Community 46 Nov 24, 2022
Churn prediction

Churn-prediction Churn-prediction Data preprocessing:: Label encoder is used to normalize the categorical variable Data Transformation:: For each data

1 Sep 28, 2022
Rest API Written In Python To Classify NSFW Images.

Rest API Written In Python To Classify NSFW Images.

Wahyusaputra 2 Dec 23, 2021
This is an official implementation for "Video Swin Transformers".

Video Swin Transformer By Ze Liu*, Jia Ning*, Yue Cao, Yixuan Wei, Zheng Zhang, Stephen Lin and Han Hu. This repo is the official implementation of "V

Swin Transformer 981 Jan 03, 2023
Underwater industrial application yolov5m6

This project wins the intelligent algorithm contest finalist award and stands out from over 2000teams in China Underwater Robot Professional Contest, entering the final of China Underwater Robot Prof

8 Nov 09, 2022