We are More than Our JOints: Predicting How 3D Bodies Move

Overview

We are More than Our JOints: Predicting How 3D Bodies Move

Citation

This repo contains the official implementation of our paper MOJO:

@inproceedings{Zhang:CVPR:2021,
  title = {We are More than Our Joints: Predicting how {3D} Bodies Move},
  author = {Zhang, Yan and Black, Michael J. and Tang, Siyu},
  booktitle = {Proceedings IEEE/CVF Conf.~on Computer Vision and Pattern Recognition (CVPR)},
  month = jun,
  year = {2021},
  month_numeric = {6}
}

License

We employ CC BY-NC-SA 4.0 for the MOJO code, which covers

models/fittingop.py
experiments/utils/batch_gen_amass.py
experiments/utils/utils_canonicalize_amass.py
experiments/utils/utils_fitting_jts2mesh.py
experiments/utils/vislib.py
experiments/vis_*_amass.py

The rest part are developed based on DLow. According to their license, the implementation follows its CMU license.

Environment & code structure

  • Tested OS: Linux Ubuntu 18.04
  • Packages:
  • Note: All scripts should be run from the root of this repo to avoid path issues. Also, please fix some path configs in the code, otherwise errors will occur.

Training

The training is split to two steps. Provided we have a config file in experiments/cfg/amass_mojo_f9_nsamp50.yml, we can do

  • python experiments/train_MOJO_vae.py --cfg amass_mojo_f9_nsamp50 to train the MOJO
  • python experiments/train_MOJO_dlow.py --cfg amass_mojo_f9_nsamp50 to train the DLow

Evaluation

These experiments/eval_*.py files are for evaluation. For eval_*_pred.py, they can be used either to evaluate the results while predicting, or to save results to a file for further evaluation and visualization. An example is python experiments/eval_kps_pred.py --cfg amass_mojo_f9_nsamp50 --mode vis, which is to save files to the folder results/amass_mojo_f9_nsamp50.

Generation

In MOJO, the recursive projection scheme is to get 3D bodies from markers and keep the body valid. The relevant implementation is mainly in models/fittingop.py and experiments/test_recursive_proj.py. An example to run is

python experiments/test_recursive_proj.py --cfg amass_mojo_f9_nsamp50 --testdata ACCAD --gpu_index 0

Datasets

In MOJO, we have used AMASS, Human3.6M, and HumanEva.

For Human3.6M and HumanEva, we follow the same pre-processing step as in DLow, VideoPose3D, and others. Please refer to their pages, e.g. this one, for details.

For AMASS, we perform canonicalization of motion sequences with our own procedures. The details are in experiments/utils_canonicalize_amass.py. We find this sequence canonicalization procedure is important. The canonicalized AMASS used in our work can be downloaded here, which includes the random sample names of ACCAD and BMLhandball used in our experiments about motion realism.

Models

For human body modeling, we employ the SMPL-X parametric body model. You need to follow their license and download. Based on SMPL-X, we can use the body joints or a sparse set of body mesh vertices (the body markers) to represent the body.

  • CMU It has 41 markers, the corresponding SMPL-X mesh vertex ID can be downloaded here.
  • SSM2 It has 64 markers, the corresponding SMPL-X mesh vertex ID can be downloaded here.
  • Joints We used 22 joints. No need to download, but just obtain them from the SMPL-X body model. See details in the code.

Our CVAE model configurations are in experiments/cfg. The pre-trained checkpoints can be downloaded here.

Related projects

  • AMASS: It unifies diverse motion capture data with the SMPL-H model, and provides a large-scale high-quality dataset. Its official codebase and tutorials are in this github repo.

  • GRAB: Most mocap data only contains the body motion. GRAB, however, provides high-quality data of human-object interactions. Besides capturing the body motion, the object motion and the hand-object contact are captured simultaneously. More demonstrations are in its github repo.

Acknowledgement & disclaimer

We thank Nima Ghorbani for the advice on the body marker setting and the {\bf AMASS} dataset. We thank Yinghao Huang, Cornelia K"{o}hler, Victoria Fern'{a}ndez Abrevaya, and Qianli Ma for proofreading. We thank Xinchen Yan and Ye Yuan for discussions on baseline methods. We thank Shaofei Wang and Siwei Zhang for their help with the user study and the presentation, respectively.

MJB has received research gift funds from Adobe, Intel, Nvidia, Facebook, and Amazon. While MJB is a part-time employee of Amazon, his research was performed solely at, and funded solely by, Max Planck. MJB has financial interests in Amazon Datagen Technologies, and Meshcapade GmbH.

Wordplay, an artificial Intelligence based crossword puzzle solver.

Wordplay, AI based crossword puzzle solver A crossword is a word puzzle that usually takes the form of a square or a rectangular grid of white- and bl

Vaibhaw 4 Nov 16, 2022
This reposityory contains the PyTorch implementation of our paper "Generative Dynamic Patch Attack".

Generative Dynamic Patch Attack This reposityory contains the PyTorch implementation of our paper "Generative Dynamic Patch Attack". Requirements PyTo

Xiang Li 8 Nov 17, 2022
Efficient Speech Processing Tookit for Automatic Speaker Recognition

Sugar Efficient Speech Processing Tookit for Automatic Speaker Recognition | HuggingFace | What's New EfficientTDNN: Efficient Architecture Search for

WangRui 14 Sep 14, 2022
R3Det based on mmdet 2.19.0

R3Det: Refined Single-Stage Detector with Feature Refinement for Rotating Object Installation # install mmdetection first if you haven't installed it

SJTU-Thinklab-Det 38 Dec 15, 2022
Image Segmentation Animation using Quadtree concepts.

QuadTree Image Segmentation Animation using QuadTree concepts. Usage usage: quad.py [-h] [-fps FPS] [-i ITERATIONS] [-ws WRITESTART] [-b] [-img] [-s S

Alex Eidt 29 Dec 25, 2022
This repository is to support contributions for tools for the Project CodeNet dataset hosted in DAX

The goal of Project CodeNet is to provide the AI-for-Code research community with a large scale, diverse, and high quality curated dataset to drive innovation in AI techniques.

International Business Machines 1.2k Jan 04, 2023
Open standard for machine learning interoperability

Open Neural Network Exchange (ONNX) is an open ecosystem that empowers AI developers to choose the right tools as their project evolves. ONNX provides

Open Neural Network Exchange 13.9k Dec 30, 2022
Official codes: Self-Supervised Learning by Estimating Twin Class Distribution

TWIST: Self-Supervised Learning by Estimating Twin Class Distributions Codes and pretrained models for TWIST: @article{wang2021self, title={Self-Sup

Bytedance Inc. 85 Dec 15, 2022
DCSL - Generalizable Crowd Counting via Diverse Context Style Learning

DCSL Generalizable Crowd Counting via Diverse Context Style Learning Requirement

3 Jun 13, 2022
Code for Fold2Seq paper from ICML 2021

[ICML2021] Fold2Seq: A Joint Sequence(1D)-Fold(3D) Embedding-based Generative Model for Protein Design Environment file: environment.yml Data and Feat

International Business Machines 43 Dec 04, 2022
High-quality implementations of standard and SOTA methods on a variety of tasks.

Uncertainty Baselines The goal of Uncertainty Baselines is to provide a template for researchers to build on. The baselines can be a starting point fo

Google 1.1k Dec 30, 2022
DeepGNN is a framework for training machine learning models on large scale graph data.

DeepGNN Overview DeepGNN is a framework for training machine learning models on large scale graph data. DeepGNN contains all the necessary features in

Microsoft 45 Jan 01, 2023
CUda Matrix Multiply library.

cumm CUda Matrix Multiply library. cumm is developed during learning of CUTLASS, which use too much c++ template and make code unmaintainable. So I de

49 Dec 27, 2022
CDTrans: Cross-domain Transformer for Unsupervised Domain Adaptation

CDTrans: Cross-domain Transformer for Unsupervised Domain Adaptation [arxiv] This is the official repository for CDTrans: Cross-domain Transformer for

238 Dec 22, 2022
A deep learning model for style-specific music generation.

DeepJ: A model for style-specific music generation https://arxiv.org/abs/1801.00887 Abstract Recent advances in deep neural networks have enabled algo

Henry Mao 704 Nov 23, 2022
OOD Generalization and Detection (ACL 2020)

Pretrained Transformers Improve Out-of-Distribution Robustness How does pretraining affect out-of-distribution robustness? We create an OOD benchmark

littleRound 57 Jan 09, 2023
Implementation for the "Surface Reconstruction from 3D Line Segments" paper.

Surface Reconstruction from 3D Line Segments Surface reconstruction from 3d line segments. Langlois, P. A., Boulch, A., & Marlet, R. In 2019 Internati

85 Jan 04, 2023
The project is an official implementation of our CVPR2019 paper "Deep High-Resolution Representation Learning for Human Pose Estimation"

Deep High-Resolution Representation Learning for Human Pose Estimation (CVPR 2019) News [2020/07/05] A very nice blog from Towards Data Science introd

Leo Xiao 3.9k Jan 05, 2023
Vision-Language Pre-training for Image Captioning and Question Answering

VLP This repo hosts the source code for our AAAI2020 work Vision-Language Pre-training (VLP). We have released the pre-trained model on Conceptual Cap

Luowei Zhou 373 Jan 03, 2023
Evaluating saliency methods on artificial data with different background types

Evaluating saliency methods on artificial data with different background types This repository contains the relevant code for the MedNeurips 2021 subm

2 Jul 05, 2022