ViSER: Video-Specific Surface Embeddings for Articulated 3D Shape Reconstruction

Overview

ViSER

Installation with conda

conda env create -f viser.yml
conda activate viser-release
# install softras
cd third_party/softras; python setup.py install; cd -;
# install manifold remeshing
git clone --recursive git://github.com/hjwdzh/Manifold; cd Manifold; mkdir build; cd build; cmake .. -DCMAKE_BUILD_TYPE=Release;make -j8; cd ../../

Data preparation

Create folders to store intermediate data and training logs

mkdir log; mkdir tmp; 

Download pre-processed data (rgb, mask, flow) following the link here and unzip under ./database/DAVIS/. The dataset is organized as:

DAVIS/
    Annotations/
        Full-Resolution/
            sequence-name/
                {%05d}.png
    JPEGImages/
        Full-Resolution/
            sequence-name/
                {%05d}.jpg
    FlowBW/ and FlowFw/
        Full-Resolution/
            sequence-name/ and optionally seqname-name_{%02d}/ (frame interval)
                flo-{%05d}.pfm
                occ-{%05d}.pfm
                visflo-{%05d}.jpg
                warp-{%05d}.jpg

To run preprocessing scripts on other videos, see install.md.

Example: breakdance-flare

Run

bash scripts/template.sh breakdance-flare

To monitor optimization, run

tensorboard --logdir log/

To render optimized breakdance-flare

bash scripts/render_result.sh breakdance-flare log/breakdance-flare-1003-ft2/pred_net_20.pth 36

Example outputs:

Example: elephants

Run

bash scripts/relephant-walk.sh

To monitor optimization, run

tensorboard --logdir log/

To render optimized breakdance-flare

bash scripts/render_elephants.sh log/elephant-walk-1003-6/pred_net_10.pth

Additional Notes

Distributed training

The current codebase supports single-node multi-gpu training with pytorch distributed data-parallel. Please modify dev and ngpu in scripts/template.sh to select devices.

Potential bugs
  • When setting batch_size to 3, rendered flow may become constant values.

Acknowledgement

The code borrows the skeleton of CMR

External repos:

Citation

To cite our paper
@inproceedings{yang2021viser,
  title={ViSER: Video-Specific Surface Embeddings for Articulated 3D Shape Reconstruction},
  author={Yang, Gengshan 
      and Sun, Deqing
      and Jampani, Varun
      and Vlasic, Daniel
      and Cole, Forrester
      and Liu, Ce
      and Ramanan, Deva},
  booktitle = {NeurIPS},
  year={2021}
}  
@inproceedings{yang2021lasr,
  title={LASR: Learning Articulated Shape Reconstruction from a Monocular Video},
  author={Yang, Gengshan 
      and Sun, Deqing
      and Jampani, Varun
      and Vlasic, Daniel
      and Cole, Forrester
      and Chang, Huiwen
      and Ramanan, Deva
      and Freeman, William T
      and Liu, Ce},
  booktitle={CVPR},
  year={2021}
}  

TODO

  • data pre-processing scripts
  • evaluation data and scripts
  • code clean up
Dynamic Visual Reasoning by Learning Differentiable Physics Models from Video and Language (NeurIPS 2021)

VRDP (NeurIPS 2021) Dynamic Visual Reasoning by Learning Differentiable Physics Models from Video and Language Mingyu Ding, Zhenfang Chen, Tao Du, Pin

Mingyu Ding 36 Sep 20, 2022
JstDoS - HTTP Protocol Stack Remote Code Execution Vulnerability

jstDoS If you are going to skid that, please give credits ! ^^ ¿How works? This

apolo 4 Feb 11, 2022
Official Repository for "Robust On-Policy Data Collection for Data Efficient Policy Evaluation" (NeurIPS 2021 Workshop on OfflineRL).

Robust On-Policy Data Collection for Data-Efficient Policy Evaluation Source code of Robust On-Policy Data Collection for Data-Efficient Policy Evalua

Autonomous Agents Research Group (University of Edinburgh) 2 Oct 09, 2022
TalkNet 2: Non-Autoregressive Depth-Wise Separable Convolutional Model for Speech Synthesis with Explicit Pitch and Duration Prediction.

TalkNet 2 [WIP] TalkNet 2: Non-Autoregressive Depth-Wise Separable Convolutional Model for Speech Synthesis with Explicit Pitch and Duration Predictio

Rishikesh (ऋषिकेश) 69 Dec 17, 2022
PyTorch Implementation of Realtime Multi-Person Pose Estimation project.

PyTorch Realtime Multi-Person Pose Estimation This is a pytorch version of Realtime_Multi-Person_Pose_Estimation, origin code is here Realtime_Multi-P

Dave Fang 157 Nov 12, 2022
Neural networks applied in recognizing guitar chords using python, AutoML.NET with C# and .NET Core

Chord Recognition Demo application The demo application is written in C# with .NETCore. As of July 9, 2020, the only version available is for windows

Andres Mauricio Rondon Patiño 24 Oct 22, 2022
Semantic Segmentation for Aerial Imagery using Convolutional Neural Network

This repo has been deprecated because whole things are re-implemented by using Chainer and I did refactoring for many codes. So please check this newe

Shunta Saito 27 Sep 23, 2022
ColossalAI-Benchmark - Performance benchmarking with ColossalAI

Benchmark for Tuning Accuracy and Efficiency Overview The benchmark includes our

HPC-AI Tech 31 Oct 07, 2022
A Python reference implementation of the CF data model

cfdm A Python reference implementation of the CF data model. References Compliance with FAIR principles Documentation https://ncas-cms.github.io/cfdm

NCAS CMS 25 Dec 13, 2022
Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm

DeCLIP Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm. Our paper is available in arxiv Updates ** Ou

Sense-GVT 470 Dec 30, 2022
An implementation of "MixHop: Higher-Order Graph Convolutional Architectures via Sparsified Neighborhood Mixing" (ICML 2019).

MixHop and N-GCN ⠀ A PyTorch implementation of "MixHop: Higher-Order Graph Convolutional Architectures via Sparsified Neighborhood Mixing" (ICML 2019)

Benedek Rozemberczki 393 Dec 13, 2022
Syntax-Aware Action Targeting for Video Captioning

Syntax-Aware Action Targeting for Video Captioning Code for SAAT from "Syntax-Aware Action Targeting for Video Captioning" (Accepted to CVPR 2020). Th

59 Oct 13, 2022
A robust camera and Lidar fusion based velocity estimator to undistort the pointcloud.

Lidar with Velocity A robust camera and Lidar fusion based velocity estimator to undistort the pointcloud. related paper: Lidar with Velocity : Motion

ISEE Research Group 164 Dec 30, 2022
Unleashing Transformers: Parallel Token Prediction with Discrete Absorbing Diffusion for Fast High-Resolution Image Generation from Vector-Quantized Codes

Unleashing Transformers: Parallel Token Prediction with Discrete Absorbing Diffusion for Fast High-Resolution Image Generation from Vector-Quantized C

Sam Bond-Taylor 139 Jan 04, 2023
This PyTorch package implements MoEBERT: from BERT to Mixture-of-Experts via Importance-Guided Adaptation (NAACL 2022).

MoEBERT This PyTorch package implements MoEBERT: from BERT to Mixture-of-Experts via Importance-Guided Adaptation (NAACL 2022). Installation Create an

Simiao Zuo 34 Dec 24, 2022
🎁 3,000,000+ Unsplash images made available for research and machine learning

The Unsplash Dataset The Unsplash Dataset is made up of over 250,000+ contributing global photographers and data sourced from hundreds of millions of

Unsplash 2k Jan 03, 2023
PyTorch implementation of "Learning to Discover Cross-Domain Relations with Generative Adversarial Networks"

DiscoGAN in PyTorch PyTorch implementation of Learning to Discover Cross-Domain Relations with Generative Adversarial Networks. * All samples in READM

Taehoon Kim 1k Jan 04, 2023
Learning Visual Words for Weakly-Supervised Semantic Segmentation

[IJCAI 2021] Learning Visual Words for Weakly-Supervised Semantic Segmentation Implementation of IJCAI 2021 paper Learning Visual Words for Weakly-Sup

Lixiang Ru 24 Oct 05, 2022
This repository contains all code and data for the Inside Out Visual Place Recognition task

Inside Out Visual Place Recognition This repository contains code and instructions to reproduce the results for the Inside Out Visual Place Recognitio

15 May 21, 2022
Informal Persian Universal Dependency Treebank

Informal Persian Universal Dependency Treebank (iPerUDT) Informal Persian Universal Dependency Treebank, consisting of 3000 sentences and 54,904 token

Roya Kabiri 0 Jan 05, 2022