Basics of 2D and 3D Human Pose Estimation.

Overview

Human Pose Estimation 101

If you want a slightly more rigorous tutorial and understand the basics of Human Pose Estimation and how the field has evolved, check out these articles I published on 2D Pose Estimation and 3D Pose Estimation

Table of Contents

Basics

  • Defined as the problem of localization of human joints (or) keypoints
  • A rigid body consists of joints and rigid parts. A body with strong articulation is a body with strong contortion.
  • Pose Estimation is the search for a specific pose in space of all articulated poses
  • Number of keypoints varies with dataset - LSP has 14, MPII has 16, 16 are used in Human3.6m
  • Classifed into 2D and 3D Pose Estimation
    • 2D Pose Estimation
    • Estimate a 2D pose (x,y) coordinates for each joint in pixel space from a RGB image
    • 3D Pose Estimation
    • Estimate a 3D pose (x,y,z) coordinates in metric space from a RGB image, or in previous works, data from a RGB-D sensor. (However, research in the past few years is heavily focussed on generating 3D poses from 2D images / 2D videos)

Loss

  • Most commonly used loss function - Mean Squared Error, MSE(Least Squares Loss)
  • This is a regression problem. The model will try to regress to the the correct coordinates, i.e move to the ground truth coordinatate’s in small increments. The model is trained to output continuous coordinates using a Mean Squared Error loss function

Evaluation metrics

Percentage of Correct Parts - PCP

  • A limb is considered detected and a correct part if the distance between the two predicted joint locations and the true limb joint locations is at most half of the limb length (PCP at 0.5 )
  • Measures detection rate of limbs
  • Cons - penalizes shorter limbs
  • Calculation
    • For a specific part, PCP = (No. of correct parts for entire dataset) / (No. of total parts for entire dataset)
    • Take a dataset with 10 images and 1 pose per image. Each pose has 8 parts - ( upper arm, lower arm, upper leg, lower leg ) x2
    • No of upper arms = 10 * 2 = 20
    • No of lower arms = 20
    • No of lower legs = No of upper legs = 20
    • If upper arm is detected correct for 17 out of the 20 upper arms i.e 17 ( 10 right arms and 7 left) → PCP = 17/20 = 85%
  • Higher the better

Percentage of Correct Key-points - PCK

  • Detected joint is considered correct if the distance between the predicted and the true joint is within a certain threshold (threshold varies)
  • [email protected] is when the threshold = 50% of the head bone link
  • [email protected] == Distance between predicted and true joint < 0.2 * torso diameter
  • Sometimes 150 mm is taken as the threshold
  • Head, shoulder, Elbow, Wrist, Hip, Knee, Ankle → Keypoints
  • PCK is used for 2D and 3D (PCK3D)
  • Higher the better

Percentage of Detected Joints - PDJ

  • Detected joint is considered correct if the distance between the predicted and the true joint is within a certain fraction of the torso diameter
  • Alleviates the shorter limb problem since shorter limbs have smaller torsos
  • PDJ at 0.2 → Distance between predicted and true join < 0.2 * torso diameter
  • Typically used for 2D Pose Estimation
  • Higher the better

Mean Per Joint Position Error - MPJPE

  • Per joint position error = Euclidean distance between ground truth and prediction for a joint
  • Mean per joint position error = Mean of per joint position error for all k joints (Typically, k = 16)
  • Calculated after aligning the root joints (typically the pelvis) of the estimated and groundtruth 3D pose.
  • PA MPJPE
    • Procrustes analysis MPJPE.
    • MPJPE calculated after the estimated 3D pose is aligned to the groundtruth by the Procrustes method
    • Procrustes method is simply a similarity transformation
  • Lower the better
  • Used for 3D Pose Estimation

AUC

Important Applications

  • Activity Analysis
  • Human-Computer Interaction (HCI)
  • Virtual Reality
  • Augmented Reality
  • Amazon Go presents an important domain for the application of Human Pose Estimation. Cameras track and recognize people and their actions, for which Pose Estimation is an important component. Entities relying on services that track and measure human activities rely heavily on human Pose Estimation

Informative roadmap on 2D Human Pose Estimation research

Owner
Sudharshan Chandra Babu
Machine Learning Engineer
Sudharshan Chandra Babu
Dynamic Realtime Animation Control

Our project is targeted at making an application that dynamically detects the user’s expressions and gestures and projects it onto an animation software which then renders a 2D/3D animation realtime

Harsh Avinash 10 Aug 01, 2022
NeWT: Natural World Tasks

NeWT: Natural World Tasks This repository contains resources for working with the NeWT dataset. ❗ At this time the binary tasks are not publicly avail

Visipedia 26 Oct 18, 2022
Multi-Joint dynamics with Contact. A general purpose physics simulator.

MuJoCo Physics MuJoCo stands for Multi-Joint dynamics with Contact. It is a general purpose physics engine that aims to facilitate research and develo

DeepMind 5.2k Jan 02, 2023
PyGRANSO: A PyTorch-enabled port of GRANSO with auto-differentiation

PyGRANSO PyGRANSO: A PyTorch-enabled port of GRANSO with auto-differentiation Please check https://ncvx.org/PyGRANSO for detailed instructions (introd

SUN Group @ UMN 26 Nov 16, 2022
Grammar Induction using a Template Tree Approach

Gitta Gitta ("Grammar Induction using a Template Tree Approach") is a method for inducing context-free grammars. It performs particularly well on data

Thomas Winters 36 Nov 15, 2022
SIR model parameter estimation using a novel algorithm for differentiated uniformization.

TenSIR Parameter estimation on epidemic data under the SIR model using a novel algorithm for differentiated uniformization of Markov transition rate m

The Spang Lab 4 Nov 30, 2022
PyMove is a Python library to simplify queries and visualization of trajectories and other spatial-temporal data

Use PyMove and go much further Information Package Status License Python Version Platforms Build Status PyPi version PyPi Downloads Conda version Cond

Insight Data Science Lab 64 Nov 15, 2022
Language models are open knowledge graphs ( non official implementation )

language-models-are-knowledge-graphs-pytorch Language models are open knowledge graphs ( work in progress ) A non official reimplementation of Languag

theblackcat102 132 Dec 18, 2022
Playing around with FastAPI and streamlit to create a YoloV5 object detector

FastAPI-Streamlit-based-YoloV5-detector Playing around with FastAPI and streamlit to create a YoloV5 object detector It turns out that a User Interfac

2 Jan 20, 2022
PyTorch implementation of Anomaly Transformer: Time Series Anomaly Detection with Association Discrepancy

Anomaly Transformer in PyTorch This is an implementation of Anomaly Transformer: Time Series Anomaly Detection with Association Discrepancy. This pape

spencerbraun 160 Dec 19, 2022
NeurIPS 2021, "Fine Samples for Learning with Noisy Labels"

[Official] FINE Samples for Learning with Noisy Labels This repository is the official implementation of "FINE Samples for Learning with Noisy Labels"

mythbuster 27 Dec 23, 2022
Supplementary code for SIGGRAPH 2021 paper: Discovering Diverse Athletic Jumping Strategies

SIGGRAPH 2021: Discovering Diverse Athletic Jumping Strategies project page paper demo video Prerequisites Important Notes We suspect there are bugs i

54 Dec 06, 2022
A PyTorch toolkit for 2D Human Pose Estimation.

PyTorch-Pose PyTorch-Pose is a PyTorch implementation of the general pipeline for 2D single human pose estimation. The aim is to provide the interface

Wei Yang 1.1k Dec 30, 2022
Convert game ISO and archives to CD CHD for emulation on Linux.

tochd Convert game ISO and archives to CD CHD for emulation. Author: Tuncay D. Source: https://github.com/thingsiplay/tochd Releases: https://github.c

Tuncay 20 Jan 02, 2023
Regulatory Instruments for Fair Personalized Pricing.

Fair pricing Source code for WWW 2022 paper Regulatory Instruments for Fair Personalized Pricing. Installation Requirements Linux with Python = 3.6 p

Renzhe Xu 6 Oct 26, 2022
Demo code for ICCV 2021 paper "Sensor-Guided Optical Flow"

Sensor-Guided Optical Flow Demo code for "Sensor-Guided Optical Flow", ICCV 2021 This code is provided to replicate results with flow hints obtained f

10 Mar 16, 2022
Library for converting from RGB / GrayScale image to base64 and back.

Library for converting RGB / Grayscale numpy images from to base64 and back. Installation pip install -U image_to_base_64 Conversion RGB to base 64 b

Vladimir Iglovikov 16 Aug 28, 2022
Replication Package for AequeVox:Automated Fariness Testing for Speech Recognition Systems

AequeVox Replication Package for AequeVox:Automated Fariness Testing for Speech Recognition Systems README under development. Python Packages Required

Sai Sathiesh 2 Aug 28, 2022
A flexible tool for creating, organizing, and sharing visualizations of live, rich data. Supports Torch and Numpy.

Visdom A flexible tool for creating, organizing, and sharing visualizations of live, rich data. Supports Python. Overview Concepts Setup Usage API To

FOSSASIA 9.4k Jan 07, 2023
An implementation of "MixHop: Higher-Order Graph Convolutional Architectures via Sparsified Neighborhood Mixing" (ICML 2019).

MixHop and N-GCN ⠀ A PyTorch implementation of "MixHop: Higher-Order Graph Convolutional Architectures via Sparsified Neighborhood Mixing" (ICML 2019)

Benedek Rozemberczki 393 Dec 13, 2022