DeepMoCap: Deep Optical Motion Capture using multiple Depth Sensors and Retro-reflectors

Overview

DeepMoCap: Deep Optical Motion Capture using multiple Depth Sensors and Retro-reflectors

By Anargyros Chatzitofis, Dimitris Zarpalas, Stefanos Kollias, Petros Daras.

Introduction

DeepMoCap constitutes a low-cost, marker-based optical motion capture method that consumes multiple spatio-temporally aligned infrared-depth sensor streams using retro-reflective straps and patches (reflectors).

DeepMoCap explores motion capture by automatically localizing and labeling reflectors on depth images and, subsequently, on 3D space. Introducing a non-parametric representation to encode the temporal correlation among pairs of colorized depthmaps and 3D optical flow frames, a multi-stage Fully Convolutional Network (FCN) architecture is proposed to jointly learn reflector locations and their temporal dependency among sequential frames. The extracted reflector 2D locations are spatially mapped in 3D space, resulting in robust optical data extraction. To this end, the subject's motion is efficiently captured by applying a template-based fitting technique.

Teaser?

Teaser?

This project is licensed under the terms of the license.

Contents

  1. Testing
  2. Datasets
  3. Citation

Testing

For testing the FCN model, please visit "testing/" enabling the 3D optical data extraction from colorized depth and 3D optical flow input. The data should be appropriately formed and the DeepMoCap FCN model should be placed to "testing/model/keras".

The proposed FCN is evaluated on the DMC2.5D dataset measuring mean Average Precision (mAP) for the entire set, based on Percentage of Correct Keypoints (PCK) thresholds (a = 0.05). The proposed method outperforms the competitive methods as shown in the table below.

Method Total Total (without end-reflectors)
CPM 92.16% 95.27%
CPM+PAFs 92.79% 95.61%
CPM+PAFs + 3D OF 92.84% 95.67%
Proposed 93.73% 96.77%

Logo

Supplementaty material (video)

Teaser?

Datasets

Two datasets have been created and made publicly available for evaluation purposes; one comprising multi-view depth and 3D optical flow annotated images (DMC2.5D), and a second, consisting of spatio-temporally aligned multi-view depth images along with skeleton, inertial and ground truth MoCap data (DMC3D).

DMC2.5D

The DMC2.5D Dataset was captured in order to train and test the DeepMoCap FCN. It comprises pairs per view of:

The samples were randomly selected from 8 subjects. More specifically, 25K single-view pair samples were annotated with over 300K total keypoints (i.e., reflector 2D locations of current and previous frames on the image), trying to cover a variety of poses and movements in the scene. 20K, 3K and 2K samples were used for training, validation and testing the FCN model, respectively. The annotation was semi-automatically realized by applying image processing and 3D vision techniques, while the dataset was manually refined using the 2D-reflectorset-annotator.

Teaser?

To get the DMC2.5D dataset, please contact the owner of the repository via github or email ([email protected]).

DMC3D

Teaser?

The DMC3D dataset consists of multi-view depth and skeleton data as well as inertial and ground truth motion capture data. Specifically, 3 Kinect for Xbox One sensors were used to capture the IR-D and Kinect skeleton data along with 9 XSens MT inertial measurement units (IMU) to enable the comparison between the proposed method and inertial MoCap approaches. Further, a PhaseSpace Impulse X2 solution was used to capture ground truth MoCap data. The preparation of the DMC3D dataset required the spatio-temporal alignment of the modalities (Kinect, PhaseSpace, XSens MTs). The setup used for the Kinect recordings provides spatio-temporally aligned IR-D and skeleton frames.

Exercise # of repetitions # of frames Type
Walking on the spot 10-20 200-300 Free
Single arm raise 10-20 300-500 Bilateral
Elbow flexion 10-20 300-500 Bilateral
Knee flexion 10-20 300-500 Bilateral
Closing arms above head 6-12 200-300 Free
Side steps 6-12 300-500 Bilateral
Jumping jack 6-12 200-300 Free
Butt kicks left-right 6-12 300-500 Bilateral
Forward lunge left-right 4-10 300-500 Bilateral
Classic squat 6-12 200-300 Free
Side step + knee-elbow 6-12 300-500 Bilateral
Side reaches 6-12 300-500 Bilateral
Side jumps 6-12 300-500 Bilateral
Alternate side reaches 6-12 300-500 Bilateral
Kick-box kicking 2-6 200-300 Free

The annotation tool for the spatio-temporally alignment of the 3D data will be publicly available soon.

To get the DMC3D dataset, please contact the owner of the repository via github or email ([email protected]).

Citation

This paper has been published in MDPI Sensors, Depth Sensors and 3D Vision Special Issue [PDF]

Please cite the paper in your publications if it helps your research:


@article{chatzitofis2019deepmocap,
  title={DeepMoCap: Deep Optical Motion Capture Using Multiple Depth Sensors and Retro-Reflectors},
  author={Chatzitofis, Anargyros and Zarpalas, Dimitrios and Kollias, Stefanos and Daras, Petros},
  journal={Sensors},
  volume={19},
  number={2},
  pages={282},
  year={2019},
  publisher={Multidisciplinary Digital Publishing Institute}
}
Face Recognition and Emotion Detector Device

Face Recognition and Emotion Detector Device Orange PI 1 Python 3.10.0 + Django 3.2.9 Project's file explanation Django manage.py Django commands hand

BootyAss 2 Dec 21, 2021
Code for paper "Vocabulary Learning via Optimal Transport for Neural Machine Translation"

**Codebase and data are uploaded in progress. ** VOLT(-py) is a vocabulary learning codebase that allows researchers and developers to automaticaly ge

416 Jan 09, 2023
Using CNN to mimic the driver based on training data from Torcs

Behavioural-Cloning-in-autonomous-driving Using CNN to mimic the driver based on training data from Torcs. Approach First, the data was collected from

Sudharshan 2 Jan 05, 2022
joint detection and semantic segmentation, based on ultralytics/yolov5,

Multi YOLO V5——Detection and Semantic Segmentation Overeview This is my undergraduate graduation project which based on ultralytics YOLO V5 tag v5.0.

477 Jan 06, 2023
NP DRAW paper released code

NP-DRAW: A Non-Parametric Structured Latent Variable Model for Image Generation This repo contains the official implementation for the NP-DRAW paper.

ZENG Xiaohui 22 Mar 13, 2022
Accelerated deep learning R&D

Accelerated deep learning R&D PyTorch framework for Deep Learning research and development. It focuses on reproducibility, rapid experimentation, and

Catalyst-Team 3.1k Jan 06, 2023
Medical image analysis framework merging ANTsPy and deep learning

ANTsPyNet A collection of deep learning architectures and applications ported to the python language and tools for basic medical image processing. Bas

Advanced Normalization Tools Ecosystem 118 Dec 24, 2022
Tensorflow Implementation for "Pre-trained Deep Convolution Neural Network Model With Attention for Speech Emotion Recognition"

Tensorflow Implementation for "Pre-trained Deep Convolution Neural Network Model With Attention for Speech Emotion Recognition" Pre-trained Deep Convo

Ankush Malaker 5 Nov 11, 2022
Generative Art Using Neural Visual Grammars and Dual Encoders

Generative Art Using Neural Visual Grammars and Dual Encoders Arnheim 1 The original algorithm from the paper Generative Art Using Neural Visual Gramm

DeepMind 231 Jan 05, 2023
Relative Human dataset, CVPR 2022

Relative Human (RH) contains multi-person in-the-wild RGB images with rich human annotations, including: Depth layers (DLs): relative depth relationsh

Yu Sun 112 Dec 02, 2022
WHENet - ONNX, OpenVINO, TFLite, TensorRT, EdgeTPU, CoreML, TFJS, YOLOv4/YOLOv4-tiny-3L

HeadPoseEstimation-WHENet-yolov4-onnx-openvino ONNX, OpenVINO, TFLite, TensorRT, EdgeTPU, CoreML, TFJS, YOLOv4/YOLOv4-tiny-3L 1. Usage $ git clone htt

Katsuya Hyodo 49 Sep 21, 2022
A PyTorch-based library for fast prototyping and sharing of deep neural network models.

A PyTorch-based library for fast prototyping and sharing of deep neural network models.

78 Jan 03, 2023
Text and code for the forthcoming second edition of Think Bayes, by Allen Downey.

Think Bayes 2 by Allen B. Downey The HTML version of this book is here. Think Bayes is an introduction to Bayesian statistics using computational meth

Allen Downey 1.5k Jan 08, 2023
Compare outputs between layers written in Tensorflow and layers written in Pytorch

Compare outputs of Wasserstein GANs between TensorFlow vs Pytorch This is our testing module for the implementation of improved WGAN in Pytorch Prereq

Hung Nguyen 72 Dec 20, 2022
Bridging the Gap between Label- and Reference based Synthesis(ICCV 2021)

Bridging the Gap between Label- and Reference based Synthesis(ICCV 2021) Tensorflow implementation of Bridging the Gap between Label- and Reference-ba

huangqiusheng 8 Jul 13, 2022
TensorFlow Similarity is a python package focused on making similarity learning quick and easy.

TensorFlow Similarity is a python package focused on making similarity learning quick and easy.

912 Jan 08, 2023
A tiny, pedagogical neural network library with a pytorch-like API.

candl A tiny, pedagogical implementation of a neural network library with a pytorch-like API. The primary use of this library is for education. Use th

Sri Pranav 3 May 23, 2022
Segcache: a memory-efficient and scalable in-memory key-value cache for small objects

Segcache: a memory-efficient and scalable in-memory key-value cache for small objects This repo contains the code of Segcache described in the followi

TheSys Group @ CMU CS 78 Jan 07, 2023
1st place solution to the Satellite Image Change Detection Challenge hosted by SenseTime

1st place solution to the Satellite Image Change Detection Challenge hosted by SenseTime

Lihe Yang 209 Jan 01, 2023
Code for the paper "Functional Regularization for Reinforcement Learning via Learned Fourier Features"

Reinforcement Learning with Learned Fourier Features State-space Soft Actor-Critic Experiments Move to the state-SAC-LFF repository. cd state-SAC-LFF

Alex Li 10 Nov 11, 2022