Real-Time and Accurate Full-Body Multi-Person Pose Estimation & Tracking System

Overview

News!

  • Aug 2020: v0.4.0 version of AlphaPose is released! Stronger tracking! Includes whole-body (face, hand, foot) keypoints! Colab now available.
  • Dec 2019: v0.3.0 version of AlphaPose is released! Smaller model, higher accuracy!
  • Apr 2019: MXNet version of AlphaPose is released! It runs at 23 fps on the COCO validation set.
  • Feb 2019: CrowdPose is now integrated into AlphaPose!
  • Dec 2018: General version of PoseFlow is released! 3x faster, with support for visualizing pose tracking results!
  • Sep 2018: v0.2.0 version of AlphaPose is released! It runs at 20 fps on the COCO validation set (4.6 people per image on average) and achieves 71 mAP!

AlphaPose

AlphaPose is an accurate multi-person pose estimator and the first open-source system to achieve 70+ mAP (75 mAP) on the COCO dataset and 80+ mAP (82.1 mAP) on the MPII dataset. To match poses that correspond to the same person across frames, we also provide an efficient online pose tracker called Pose Flow. It is the first open-source online pose tracker to achieve both 60+ mAP (66.5 mAP) and 50+ MOTA (58.3 MOTA) on the PoseTrack Challenge dataset.

AlphaPose supports both Linux and Windows!


COCO 17 keypoints

Halpe 26 keypoints + tracking

Halpe 136 keypoints + tracking

Results

Pose Estimation

Results on COCO test-dev 2015:

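| Method                 | AP @0.5:0.95 | AP @0.5 | AP @0.75 | AP medium | AP large |
|------------------------|--------------|---------|----------|-----------|----------|
| OpenPose (CMU-Pose)    | 61.8         | 84.9    | 67.5     | 57.1      | 68.2     |
| Detectron (Mask R-CNN) | 67.0         | 88.0    | 73.1     | 62.2      | 75.6     |
| AlphaPose              | 73.3         | 89.2    | 79.1     | 69.0      | 78.6     |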
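For readers unfamiliar with the column headers: in the COCO keypoint benchmark, "AP @0.5:0.95" is the average of AP over ten Object Keypoint Similarity (OKS) thresholds, from 0.50 to 0.95 in steps of 0.05. The sketch below illustrates that convention only. It is not AlphaPose's evaluation code; the function names `oks` and `mean_ap`, the sigma values, and the sample coordinates are hypothetical and chosen for illustration.

```python
import math


def oks(pred, gt, visible, area, sigmas):
    """Object Keypoint Similarity between one predicted and one ground-truth pose.

    pred, gt : lists of (x, y) keypoint coordinates
    visible  : list of bools, True where the ground-truth keypoint is labeled
    area     : ground-truth object area in pixels^2
    sigmas   : per-keypoint falloff constants (illustrative values here)
    """
    total, count = 0.0, 0
    for (px, py), (gx, gy), vis, sigma in zip(pred, gt, visible, sigmas):
        if not vis:
            continue  # unlabeled keypoints do not contribute
        d2 = (px - gx) ** 2 + (py - gy) ** 2
        k2 = (2.0 * sigma) ** 2
        total += math.exp(-d2 / (2.0 * area * k2))
        count += 1
    return total / count if count else 0.0


# The ten OKS thresholds behind "AP @0.5:0.95".
OKS_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def mean_ap(ap_per_threshold):
    """Average the AP values computed at each OKS threshold."""
    return sum(ap_per_threshold) / len(ap_per_threshold)


# Hypothetical two-keypoint pose: a perfect match scores 1.0,
# and any offset lowers the similarity.
gt_pose = [(10.0, 10.0), (20.0, 20.0)]
shifted = [(12.0, 10.0), (20.0, 23.0)]
sigmas = [0.05, 0.05]  # illustrative constants, not the official COCO values
perfect_score = oks(gt_pose, gt_pose, [True, True], area=400.0, sigmas=sigmas)
shifted_score = oks(shifted, gt_pose, [True, True], area=400.0, sigmas=sigmas)
```

A prediction counts as a match at a given threshold only if its OKS with a ground-truth pose reaches that threshold, so "AP @0.75" is a stricter criterion than "AP @0.5".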

Results on MPII full test set:

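| Method              | Head | Shoulder | Elbow | Wrist | Hip  | Knee | Ankle | Ave  |
|---------------------|------|----------|-------|-------|------|------|-------|------|
| OpenPose (CMU-Pose) | 91.2 | 87.6     | 77.7  | 66.8  | 75.4 | 68.9 | 61.7  | 75.6 |
| Newell & Deng       | 92.1 | 89.3     | 78.9  | 69.8  | 76.2 | 71.6 | 64.7  | 77.5 |
| AlphaPose           | 91.3 | 90.5     | 84.0  | 76.4  | 80.3 | 79.9 | 72.4  | 82.1 |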

More results and models are available in docs/MODEL_ZOO.md.

Pose Tracking

Please read trackers/README.md for details.

CrowdPose

Please read docs/CrowdPose.md for details.

Installation

Please check out docs/INSTALL.md

Model Zoo

Please check out docs/MODEL_ZOO.md

Quick Start

  • Colab: We provide a colab example for your quick start.

  • Inference: Inference demo

./scripts/inference.sh ${CONFIG} ${CHECKPOINT} ${VIDEO_NAME} # ${OUTPUT_DIR}, optional

For the high-level API, please refer to ./scripts/demo_api.py

  • Training: Train from scratch
./scripts/train.sh ${CONFIG} ${EXP_ID}
  • Validation: Validate your model on MSCOCO val2017
./scripts/validate.sh ${CONFIG} ${CHECKPOINT}

Examples:

Demo using FastPose model.

./scripts/inference.sh configs/coco/resnet/256x192_res50_lr1e-3_1x.yaml pretrained_models/fast_res50_256x192.pth ${VIDEO_NAME}
# or
python scripts/demo_inference.py --cfg configs/coco/resnet/256x192_res50_lr1e-3_1x.yaml --checkpoint pretrained_models/fast_res50_256x192.pth --indir examples/demo/
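If you prefer to launch the demo from Python, for example to loop over several input folders, the same invocation can be assembled as an argument list. This is a minimal sketch, not part of AlphaPose: `build_demo_command` is a hypothetical helper, it uses only the `--cfg`, `--checkpoint`, and `--indir` flags shown in the command above, and it assumes you run it from the repository root with the checkpoint already downloaded.

```python
import shlex
import subprocess
import sys


def build_demo_command(cfg, checkpoint, indir):
    """Assemble the demo_inference.py invocation shown above as an argument list.

    Hypothetical helper for illustration; it only mirrors the flags used in
    the shell example and adds nothing of its own.
    """
    return [
        sys.executable, "scripts/demo_inference.py",
        "--cfg", cfg,
        "--checkpoint", checkpoint,
        "--indir", indir,
    ]


cmd = build_demo_command(
    "configs/coco/resnet/256x192_res50_lr1e-3_1x.yaml",
    "pretrained_models/fast_res50_256x192.pth",
    "examples/demo/",
)
print(shlex.join(cmd))  # inspect the command before running it

# Uncomment to actually run the demo from the AlphaPose repository root:
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems when paths contain spaces.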

Train FastPose on the MSCOCO dataset.

./scripts/train.sh ./configs/coco/resnet/256x192_res50_lr1e-3_1x.yaml exp_fastpose

For more detailed inference options and examples, please refer to GETTING_STARTED.md.

Common issues & FAQ

Check out faq.md for frequently asked questions. If it does not solve your problem, or if you find a bug, don't hesitate to comment on GitHub or make a pull request!

Contributors

AlphaPose is based on RMPE (ICCV'17), authored by Hao-Shu Fang, Shuqin Xie, Yu-Wing Tai and Cewu Lu; Cewu Lu is the corresponding author. Currently, it is maintained by Jiefeng Li*, Hao-Shu Fang*, Yuliang Xiu and Chao Xu.

The main contributors are listed in doc/contributors.md.

TODO

  • Multi-GPU/CPU inference
  • 3D pose
  • add tracking flag
  • PyTorch C++ version
  • Add MPII and AIC data
  • dense support
  • small box easy filter
  • CrowdPose support
  • Speed up PoseFlow
  • Add stronger/light detectors and the mobile pose
  • High level API

We would really appreciate it if you could offer any help and become a contributor to AlphaPose.

Citation

Please cite these papers in your publications if they help your research:

@inproceedings{fang2017rmpe,
  title={{RMPE}: Regional Multi-person Pose Estimation},
  author={Fang, Hao-Shu and Xie, Shuqin and Tai, Yu-Wing and Lu, Cewu},
  booktitle={ICCV},
  year={2017}
}

@article{li2018crowdpose,
  title={CrowdPose: Efficient Crowded Scenes Pose Estimation and A New Benchmark},
  author={Li, Jiefeng and Wang, Can and Zhu, Hao and Mao, Yihuan and Fang, Hao-Shu and Lu, Cewu},
  journal={arXiv preprint arXiv:1812.00324},
  year={2018}
}

@inproceedings{xiu2018poseflow,
  title={{Pose Flow}: Efficient Online Pose Tracking},
  author={Xiu, Yuliang and Li, Jiefeng and Wang, Haoyu and Fang, Yinghong and Lu, Cewu},
  booktitle={BMVC},
  year={2018}
}

License

AlphaPose is freely available for non-commercial use, and may be redistributed under these conditions. For commercial queries, please send an e-mail to mvig.alphapose[at]gmail[dot]com and cc lucewu[at]sjtu[dot]edu[dot]cn. We will send you the detailed agreement.

Owner
Machine Vision and Intelligence Group @ SJTU