FrankMocap: A Strong and Easy-to-use Single View 3D Hand+Body Pose Estimator

Last update: Jan 07, 2023

FrankMocap is an easy-to-use single-view 3D motion capture system developed by Facebook AI Research (FAIR). It provides state-of-the-art 3D pose estimation outputs for body, hand, and body+hands in a single system. The core objective of FrankMocap is to democratize 3D human pose estimation technology, enabling anyone (researchers, engineers, developers, artists, and others) to easily obtain 3D motion capture outputs from videos and images.

Btw, why the name FrankMocap? Our pipeline to integrate body and hand modules reminds us of Frankenstein's monster!

News:

[2020/10/09] We have improved OpenGL rendering speed. It's about 40% faster (e.g., body module: 6 fps -> 11 fps).

Key Features

- Body Motion Capture
- Hand Motion Capture
- Egocentric Hand Motion Capture
- Whole Body Motion Capture (body + hands)

Installation

See INSTALL.md

A Quick Start

Run body motion capture

# using a machine with a monitor to show outputs on screen
python -m demo.demo_bodymocap --input_path ./sample_data/han_short.mp4 --out_dir ./mocap_output

# screenless mode (e.g., a remote server)
xvfb-run -a python -m demo.demo_bodymocap --input_path ./sample_data/han_short.mp4 --out_dir ./mocap_output

Run hand motion capture

# using a machine with a monitor to show outputs on screen
python -m demo.demo_handmocap --input_path ./sample_data/han_hand_short.mp4 --out_dir ./mocap_output

# screenless mode (e.g., a remote server)
xvfb-run -a python -m demo.demo_handmocap --input_path ./sample_data/han_hand_short.mp4 --out_dir ./mocap_output

Run whole body motion capture

# using a machine with a monitor to show outputs on screen
python -m demo.demo_frankmocap --input_path ./sample_data/han_short.mp4 --out_dir ./mocap_output

# screenless mode (e.g., a remote server)
xvfb-run -a python -m demo.demo_frankmocap --input_path ./sample_data/han_short.mp4 --out_dir ./mocap_output

Note:
- The above commands use OpenGL by default. If it does not work, you may try the alternative renderers (PyTorch3D or OpenDR).
- See the readme of each module for details.
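
The six commands above differ only in the demo module and in whether they are wrapped in `xvfb-run -a`. As a minimal sketch (not part of FrankMocap), the helper below assembles the same command lines from Python. The function name `build_demo_command`, the `DEMO_MODULES` table, and the rule "no `DISPLAY` variable means screenless" are assumptions made for illustration; only the module names and the `--input_path` / `--out_dir` flags are taken from the commands above.

```python
import os
import shlex

# Demo module names as they appear in the quick-start commands.
DEMO_MODULES = {
    "body": "demo.demo_bodymocap",
    "hand": "demo.demo_handmocap",
    "whole": "demo.demo_frankmocap",
}


def build_demo_command(mode, input_path, out_dir="./mocap_output",
                       headless=None, env=None):
    """Return the argument list for one demo run.

    Hypothetical helper for illustration; it is not shipped with FrankMocap.
    """
    if mode not in DEMO_MODULES:
        raise ValueError(f"unknown mode {mode!r}; expected one of {sorted(DEMO_MODULES)}")
    env = os.environ if env is None else env
    if headless is None:
        # Assumption: an unset or empty DISPLAY means no monitor is attached.
        headless = not env.get("DISPLAY")
    cmd = ["python", "-m", DEMO_MODULES[mode],
           "--input_path", input_path, "--out_dir", out_dir]
    if headless:
        # Same wrapper as the "screenless mode" commands.
        cmd = ["xvfb-run", "-a"] + cmd
    return cmd


if __name__ == "__main__":
    cmd = build_demo_command("body", "./sample_data/han_short.mp4")
    # Print the command instead of running it; to execute it, pass the list
    # to subprocess.run(cmd, check=True) from the repository root.
    print(shlex.join(cmd))
```

Returning a list rather than a shell string keeps paths with spaces intact when the result is passed to `subprocess.run`.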

Joint Order

See joint_order

Body Motion Capture Module

See run_bodymocap

Hand Motion Capture Module

See run_handmocap

Whole Body Motion Capture Module (Body + Hand)

See run_totalmocap

License

CC-BY-NC 4.0. See the LICENSE file.

References

FrankMocap is based on the following research outputs:

@article{rong2020frankmocap,
  title={FrankMocap: Fast Monocular 3D Hand and Body Motion Capture by Regression and Integration},
  author={Rong, Yu and Shiratori, Takaaki and Joo, Hanbyul},
  journal={arXiv preprint arXiv:2008.08324},
  year={2020}
}

@article{joo2020eft,
  title={Exemplar Fine-Tuning for 3D Human Pose Fitting Towards In-the-Wild 3D Human Pose Estimation},
  author={Joo, Hanbyul and Neverova, Natalia and Vedaldi, Andrea},
  journal={arXiv preprint arXiv:2004.03686},
  year={2020}
}

FrankMocap leverages many amazing open-source projects shared by the research community:
- SMPL, SMPLX
- Detectron2
- Pytorch3D (for rendering)
- OpenDR (for rendering)
- SPIN (for body module)
- 100DOH (for hand detection)
- lightweight-human-pose-estimation (for body detection)
