KAPAO is an efficient multi-person human pose estimation model that detects keypoints and poses as objects and fuses the detections to predict human poses.

Overview

KAPAO (Keypoints and Poses as Objects)

KAPAO is an efficient single-stage multi-person human pose estimation model that models keypoints and poses as objects within a dense anchor-based detection framework. When not using test-time augmentation (TTA), KAPAO is much faster and more accurate than previous single-stage methods like DEKR and HigherHRNet:

alt text

This repository contains the official PyTorch implementation for the paper:
Rethinking Keypoint Representations: Modeling Keypoints and Poses as Objects for Multi-Person Human Pose Estimation.

Our code was forked from ultralytics/yolov5 at commit 5487451.

Setup

  1. If you haven't already, install Anaconda or Miniconda.
  2. Create a new conda environment with Python 3.6: $ conda create -n kapao python=3.6.
  3. Activate the environment: $ conda activate kapao
  4. Clone this repo: $ git clone https://github.com/wmcnally/kapao.git
  5. Install the dependencies: $ cd kapao && pip install -r requirements.txt
  6. Download the trained models: $ sh data/scripts/download_models.sh

Inference Demos

Note: FPS calculations includes all processing, including inference, plotting / tracking, image resizing, etc. See demo script arguments for inference options.

Flash Mob Demo

This demo runs inference on a 720p dance video (native frame-rate of 25 FPS).

alt text

To display the inference results in real-time:
$ python demos/flash_mob.py --weights kapao_s_coco.pt --display --fps

To create the GIF above:
$ python demos/flash_mob.py --weights kapao_s_coco.pt --start 188 --end 196 --gif --fps

Squash Demo

This demo runs inference on a 1080p slow motion squash video (native frame-rate of 25 FPS). It uses a simple player tracking algorithm based on the frame-to-frame pose differences.

alt text

To display the inference results in real-time:
$ python demos/squash.py --weights kapao_s_coco.pt --display --fps

To create the GIF above:
$ python demos/squash.py --weights kapao_s_coco.pt --start 42 --end 50 --gif --fps

COCO Experiments

Download the COCO dataset: $ sh data/scripts/get_coco_kp.sh

Validation (without TTA)

  • KAPAO-S (63.0 AP): $ python val.py --rect
  • KAPAO-M (68.5 AP): $ python val.py --rect --weights kapao_m_coco.pt
  • KAPAO-L (70.6 AP): $ python val.py --rect --weights kapao_l_coco.pt

Validation (with TTA)

  • KAPAO-S (64.3 AP): $ python val.py --scales 0.8 1 1.2 --flips -1 3 -1
  • KAPAO-M (69.6 AP): $ python val.py --weights kapao_m_coco.pt \
    --scales 0.8 1 1.2 --flips -1 3 -1
  • KAPAO-L (71.6 AP): $ python val.py --weights kapao_l_coco.pt \
    --scales 0.8 1 1.2 --flips -1 3 -1

Testing

  • KAPAO-S (63.8 AP): $ python val.py --scales 0.8 1 1.2 --flips -1 3 -1 --task test
  • KAPAO-M (68.8 AP): $ python val.py --weights kapao_m_coco.pt \
    --scales 0.8 1 1.2 --flips -1 3 -1 --task test
  • KAPAO-L (70.3 AP): $ python val.py --weights kapao_l_coco.pt \
    --scales 0.8 1 1.2 --flips -1 3 -1 --task test

Training

The following commands were used to train the KAPAO models on 4 V100s with 32GB memory each.

KAPAO-S:

python -m torch.distributed.launch --nproc_per_node 4 train.py \
--img 1280 \
--batch 128 \
--epochs 500 \
--data data/coco-kp.yaml \
--hyp data/hyps/hyp.kp-p6.yaml \
--val-scales 1 \
--val-flips -1 \
--weights yolov5s6.pt \
--project runs/s_e500 \
--name train \
--workers 128

KAPAO-M:

python train.py \
--img 1280 \
--batch 72 \
--epochs 500 \
--data data/coco-kp.yaml \
--hyp data/hyps/hyp.kp-p6.yaml \
--val-scales 1 \
--val-flips -1 \
--weights yolov5m6.pt \
--project runs/m_e500 \
--name train \
--workers 128

KAPAO-L:

python train.py \
--img 1280 \
--batch 48 \
--epochs 500 \
--data data/coco-kp.yaml \
--hyp data/hyps/hyp.kp-p6.yaml \
--val-scales 1 \
--val-flips -1 \
--weights yolov5l6.pt \
--project runs/l_e500 \
--name train \
--workers 128

Note: DDP is usually recommended but we found training was less stable for KAPAO-M/L using DDP. We are investigating this issue.

CrowdPose Experiments

  • Install the CrowdPose API to your conda environment:
    $ cd .. && git clone https://github.com/Jeff-sjtu/CrowdPose.git
    $ cd CrowdPose/crowdpose-api/PythonAPI && sh install.sh && cd ../../../kapao
  • Download the CrowdPose dataset: $ sh data/scripts/get_crowdpose.sh

Testing

  • KAPAO-S (63.8 AP): $ python val.py --data crowdpose.yaml \
    --weights kapao_s_crowdpose.pt --scales 0.8 1 1.2 --flips -1 3 -1
  • KAPAO-M (67.1 AP): $ python val.py --data crowdpose.yaml \
    --weights kapao_m_crowdpose.pt --scales 0.8 1 1.2 --flips -1 3 -1
  • KAPAO-L (68.9 AP): $ python val.py --data crowdpose.yaml \
    --weights kapao_l_crowdpose.pt --scales 0.8 1 1.2 --flips -1 3 -1

Training

The following commands were used to train the KAPAO models on 4 V100s with 32GB memory each. Training was performed on the trainval split with no validation. The test results above were generated using the last model checkpoint.

KAPAO-S:

python -m torch.distributed.launch --nproc_per_node 4 train.py \
--img 1280 \
--batch 128 \
--epochs 300 \
--data data/crowdpose.yaml \
--hyp data/hyps/hyp.kp-p6.yaml \
--val-scales 1 \
--val-flips -1 \
--weights yolov5s6.pt \
--project runs/cp_s_e300 \
--name train \
--workers 128 \
--noval

KAPAO-M:

python train.py \
--img 1280 \
--batch 72 \
--epochs 300 \
--data data/coco-kp.yaml \
--hyp data/hyps/hyp.kp-p6.yaml \
--val-scales 1 \
--val-flips -1 \
--weights yolov5m6.pt \
--project runs/cp_m_e300 \
--name train \
--workers 128 \
--noval

KAPAO-L:

python train.py \
--img 1280 \
--batch 48 \
--epochs 300 \
--data data/crowdpose.yaml \
--hyp data/hyps/hyp.kp-p6.yaml \
--val-scales 1 \
--val-flips -1 \
--weights yolov5l6.pt \
--project runs/cp_l_e300 \
--name train \
--workers 128 \
--noval

Acknowledgements

This work was supported in part by Compute Canada, the Canada Research Chairs Program, the Natural Sciences and Engineering Research Council of Canada, a Microsoft Azure Grant, and an NVIDIA Hardware Grant.

If you find this repo is helpful in your research, please cite our paper:

@article{mcnally2021kapao,
  title={Rethinking Keypoint Representations: Modeling Keypoints and Poses as Objects for Multi-Person Human Pose Estimation},
  author={McNally, William and Vats, Kanav and Wong, Alexander and McPhee, John},
  journal={arXiv preprint arXiv:2111.08557},
  year={2021}
}

Please also consider citing our previous works:

@inproceedings{mcnally2021deepdarts,
  title={DeepDarts: Modeling Keypoints as Objects for Automatic Scorekeeping in Darts using a Single Camera},
  author={McNally, William and Walters, Pascale and Vats, Kanav and Wong, Alexander and McPhee, John},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={4547--4556},
  year={2021}
}

@article{mcnally2021evopose2d,
  title={EvoPose2D: Pushing the Boundaries of 2D Human Pose Estimation Using Accelerated Neuroevolution With Weight Transfer},
  author={McNally, William and Vats, Kanav and Wong, Alexander and McPhee, John},
  journal={IEEE Access},
  volume={9},
  pages={139403--139414},
  year={2021},
  publisher={IEEE}
}
Owner
Will McNally
PhD Candidate
Will McNally
Official PyTorch implementation of the Fishr regularization for out-of-distribution generalization

Fishr: Invariant Gradient Variances for Out-of-distribution Generalization Official PyTorch implementation of the Fishr regularization for out-of-dist

62 Dec 22, 2022
A minimal yet resourceful implementation of diffusion models (along with pretrained models + synthetic images for nine datasets)

A minimal yet resourceful implementation of diffusion models (along with pretrained models + synthetic images for nine datasets)

Vikash Sehwag 65 Dec 19, 2022
Keras-1D-NN-Classifier

Keras-1D-NN-Classifier This code is based on the reference codes linked below. reference 1, reference 2 This code is for 1-D array data classification

Jae-Hoon Shim 6 May 18, 2021
Old Photo Restoration (Official PyTorch Implementation)

Bringing Old Photo Back to Life (CVPR 2020 oral)

Microsoft 11.3k Dec 30, 2022
PyTorch implementation of TSception V2 using DEAP dataset

TSception This is the PyTorch implementation of TSception V2 using DEAP dataset in our paper: Yi Ding, Neethu Robinson, Su Zhang, Qiuhao Zeng, Cuntai

Yi Ding 27 Dec 15, 2022
Data augmentation for NLP, accepted at EMNLP 2021 Findings

AEDA: An Easier Data Augmentation Technique for Text Classification This is the code for the EMNLP 2021 paper AEDA: An Easier Data Augmentation Techni

Akbar Karimi 81 Dec 09, 2022
PyTorch implementation of the paper: "Preference-Adaptive Meta-Learning for Cold-Start Recommendation", IJCAI, 2021.

PAML PyTorch implementation of the paper: "Preference-Adaptive Meta-Learning for Cold-Start Recommendation", IJCAI, 2021. (Continuously updating ) Int

15 Nov 18, 2022
Vit-ImageClassification - Pytorch ViT for Image classification on the CIFAR10 dataset

Vit-ImageClassification Introduction This project uses ViT to perform image clas

Kaicheng Yang 4 Jun 01, 2022
PSGAN running with ncnn⚡妆容迁移/仿妆⚡Imitation Makeup/Makeup Transfer⚡

PSGAN running with ncnn⚡妆容迁移/仿妆⚡Imitation Makeup/Makeup Transfer⚡

WuJinxuan 144 Dec 26, 2022
Volumetric parameterization of the placenta to a flattened template

placenta-flattening A MATLAB algorithm for volumetric mesh parameterization. Developed for mapping a placenta segmentation derived from an MRI image t

Mazdak Abulnaga 12 Mar 14, 2022
Monitor your ML jobs on mobile devices📱, especially for Google Colab / Kaggle

TF Watcher TF Watcher is a simple to use Python package and web app which allows you to monitor 👀 your Machine Learning training or testing process o

Rishit Dagli 54 Nov 01, 2022
Easy genetic ancestry predictions in Python

ezancestry Easily visualize your direct-to-consumer genetics next to 2500+ samples from the 1000 genomes project. Evaluate the performance of a custom

Kevin Arvai 38 Jan 02, 2023
Implementation of "GNNAutoScale: Scalable and Expressive Graph Neural Networks via Historical Embeddings" in PyTorch

PyGAS: Auto-Scaling GNNs in PyG PyGAS is the practical realization of our G NN A uto S cale (GAS) framework, which scales arbitrary message-passing GN

Matthias Fey 139 Dec 25, 2022
Face detection using deep learning.

Face Detection Docker Solution Using Faster R-CNN Dockerface is a deep learning face detector. It deploys a trained Faster R-CNN network on Caffe thro

Nataniel Ruiz 181 Dec 19, 2022
Colour detection is necessary to recognize objects, it is also used as a tool in various image editing and drawing apps.

Colour Detection On Image Colour detection is the process of detecting the name of any color. Simple isn’t it? Well, for humans this is an extremely e

Astitva Veer Garg 1 Jan 13, 2022
Rust bindings for the C++ api of PyTorch.

tch-rs Rust bindings for the C++ api of PyTorch. The goal of the tch crate is to provide some thin wrappers around the C++ PyTorch api (a.k.a. libtorc

Laurent Mazare 2.3k Dec 30, 2022
Simple Tensorflow implementation of "Adaptive Convolutions for Structure-Aware Style Transfer" (CVPR 2021)

AdaConv — Simple TensorFlow Implementation [Paper] : Adaptive Convolutions for Structure-Aware Style Transfer (CVPR 2021) Note This repository does no

Junho Kim 26 Nov 18, 2022
Python scripts using the Mediapipe models for Halloween.

Mediapipe-Halloween-Examples Python scripts using the Mediapipe models for Halloween. WHY Mainly for fun. But this repository also includes useful exa

Ibai Gorordo 23 Jan 06, 2023
A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing.

About This repository provides data and code for the paper: Scalable Data Annotation Pipeline for High-Quality Large Speech Datasets Development (subm

Appen Repos 86 Dec 07, 2022