Bottom-up Human Pose Estimation

Overview

Introduction

This is the official code of Rethinking the Heatmap Regression for Bottom-up Human Pose Estimation. This paper has been accepted to CVPR2021.

This repo is built on Bottom-up-Higher-HRNet.

Main Results

Results on COCO val2017 without multi-scale test

Method Backbone Input size #Params GFLOPs AP Ap .5 AP .75 AP (M) AP (L)
HigherHRNet HRNet-w32 512 28.6M 47.9 67.1 86.2 73.0 61.5 76.1
HigherHRNet + SWAHR HRNet-w32 512 28.6M 48.0 68.9 87.8 74.9 63.0 77.4
HigherHRNet HRNet-w48 640 63.8M 154.3 69.9 87.2 76.1 65.4 76.4
HigherHRNet + SWAHR HRNet-w48 640 63.8M 154.6 70.8 88.5 76.8 66.3 77.4

Results on COCO val2017 with multi-scale test

Method Backbone Input size #Params GFLOPs AP Ap .5 AP .75 AP (M) AP (L)
HigherHRNet HRNet-w32 512 28.6M 47.9 69.9 87.1 76.0 65.3 77.0
HigherHRNet + SWAHR HRNet-w32 512 28.6M 48.0 71.4 88.9 77.8 66.3 78.9
HigherHRNet HRNet-w48 640 63.8M 154.3 72.1 88.4 78.2 67.8 78.3
HigherHRNet + SWAHR HRNet-w48 640 63.8M 154.6 73.2 89.8 79.1 69.1 79.3

Results on COCO test-dev2017 without multi-scale test

Method Backbone Input size #Params GFLOPs AP Ap .5 AP .75 AP (M) AP (L)
OpenPose* - - - - 61.8 84.9 67.5 57.1 68.2
Hourglass Hourglass 512 277.8M 206.9 56.6 81.8 61.8 49.8 67.0
PersonLab ResNet-152 1401 68.7M 405.5 66.5 88.0 72.6 62.4 72.3
PifPaf - - - - 66.7 - - 62.4 72.9
Bottom-up HRNet HRNet-w32 512 28.5M 38.9 64.1 86.3 70.4 57.4 73.9
HigherHRNet HRNet-w32 512 28.6M 47.9 66.4 87.5 72.8 61.2 74.2
HigherHRNet + SWAHR HRNet-w32 512 28.6M 48.0 67.9 88.9 74.5 62.4 75.5
HigherHRNet HRNet-w48 640 63.8M 154.3 68.4 88.2 75.1 64.4 74.2
HigherHRNet + SWAHR HRNet-w48 640 63.8M 154.6 70.2 89.9 76.9 65.2 77.0

Results on COCO test-dev2017 with multi-scale test

Method Backbone Input size #Params GFLOPs AP Ap .5 AP .75 AP (M) AP (L)
Hourglass Hourglass 512 277.8M 206.9 63.0 85.7 68.9 58.0 70.4
Hourglass* Hourglass 512 277.8M 206.9 65.5 86.8 72.3 60.6 72.6
PersonLab ResNet-152 1401 68.7M 405.5 68.7 89.0 75.4 64.1 75.5
HigherHRNet HRNet-w48 640 63.8M 154.3 70.5 89.3 77.2 66.6 75.8
HigherHRNet + SWAHR HRNet-w48 640 63.8M 154.6 72.0 90.7 78.8 67.8 77.7

Results on CrowdPose test

Method AP Ap .5 AP .75 AP (E) AP (M) AP (H)
Mask-RCNN 57.2 83.5 60.3 69.4 57.9 45.8
AlphaPose 61.0 81.3 66.0 71.2 61.4 51.1
SPPE 66.0. 84.2 71.5 75.5 66.3 57.4
OpenPose - - - 62.7 48.7 32.3
HigherHRNet 65.9 86.4 70.6 73.3 66.5 57.9
HigherHRNet + SWAHR 71.6 88.5 77.6 78.9 72.4 63.0
HigherHRNet* 67.6 87.4 72.6 75.8 68.1 58.9
HigherHRNet + SWAHR* 73.8 90.5 79.9 81.2 74.7 64.7

'*' indicates multi-scale test

Installation

The details about preparing the environment and datasets can be referred to README.md.

Downlaod our pretrained weights from BaidunYun(Password: 8weh) or GoogleDrive to ./models.

Training and Testing

Testing on COCO val2017 dataset using pretrained weights

For single-scale testing:

python tools/dist_valid.py \
    --cfg experiments/coco/higher_hrnet/w32_512_adam_lr1e-3.yaml \
    TEST.MODEL_FILE models/pose_coco/pose_higher_hrnet_w32_512.pth

By default, we use horizontal flip. To test without flip:

python tools/dist_valid.py \
    --cfg experiments/coco/higher_hrnet/w32_512_adam_lr1e-3.yaml \
    TEST.MODEL_FILE models/pose_coco/pose_higher_hrnet_w32_512.pth \
    TEST.FLIP_TEST False

Multi-scale testing is also supported, although we do not report results in our paper:

python tools/dist_valid.py \
    --cfg experiments/coco/higher_hrnet/w32_512_adam_lr1e-3.yaml \
    TEST.MODEL_FILE models/pose_coco/pose_higher_hrnet_w32_512.pth \
    TEST.SCALE_FACTOR '[0.5, 1.0, 2.0]'

Training on COCO train2017 dataset

python tools/dist_train.py \
    --cfg experiments/coco/higher_hrnet/w32_512_adam_lr1e-3.yaml 

By default, it will use all available GPUs on the machine for training. To specify GPUs, use

CUDA_VISIBLE_DEVICES=0,1 python tools/dist_train.py \
    --cfg experiments/coco/higher_hrnet/w32_512_adam_lr1e-3.yaml 

Testing on your own images

python tools/dist_inference.py \
    --img_dir path/to/your/directory/of/images \
    --save_dir path/where/results/are/saved \
    --cfg experiments/coco/higher_hrnet/w32_512_adam_lr1e-3.yaml \
    TEST.MODEL_FILE models/pose_coco/pose_higher_hrnet_w32_512.pth \
    TEST.SCALE_FACTOR '[0.5, 1.0, 2.0]'

Citation

If you find this work or code is helpful in your research, please cite:

@inproceedings{LuoSWAHR,
  title={Rethinking the Heatmap Regression for Bottom-up Human Pose Estimation},
  author={Zhengxiong Luo and Zhicheng Wang and Yan Huang and Liang Wang and Tieniu Tan and Erjin Zhou},
  booktitle={CVPR},
  year={2021}
}
DeepLab-ResNet rebuilt in TensorFlow

DeepLab-ResNet-TensorFlow This is an (re-)implementation of DeepLab-ResNet in TensorFlow for semantic image segmentation on the PASCAL VOC dataset. Fr

Vladimir 1.2k Nov 04, 2022
Multi-objective constrained optimization for energy applications via tree ensembles

Multi-objective constrained optimization for energy applications via tree ensembles

C⚙G - Imperial College London 1 Nov 19, 2021
This Deep Learning Model Predicts that from which disease you are suffering.

Deep-Learning-Project This Deep Learning Model Predicts that from which disease you are suffering. This Project Covers the Topics of Deep Learning Int

Jai Viral Doshi 0 Jan 20, 2022
Multi-task Multi-agent Soft Actor Critic for SMAC

Multi-task Multi-agent Soft Actor Critic for SMAC Overview The CARE formulti-task: Multi-Task Reinforcement Learning with Context-based Representation

RuanJingqing 8 Sep 30, 2022
The official implementation of A Unified Game-Theoretic Interpretation of Adversarial Robustness.

This repository is the official implementation of A Unified Game-Theoretic Interpretation of Adversarial Robustness. Requirements pip install -r requi

Jie Ren 17 Dec 12, 2022
Patch-Based Deep Autoencoder for Point Cloud Geometry Compression

Patch-Based Deep Autoencoder for Point Cloud Geometry Compression Overview The ever-increasing 3D application makes the point cloud compression unprec

17 Dec 05, 2022
A curated list of awesome neural radiance fields papers

Awesome Neural Radiance Fields A curated list of awesome neural radiance fields papers, inspired by awesome-computer-vision. How to submit a pull requ

Yen-Chen Lin 3.9k Dec 27, 2022
A python implementation of Yolov5 to detect fire or smoke in the wild in Jetson Xavier nx and Jetson nano

yolov5-fire-smoke-detect-python A python implementation of Yolov5 to detect fire or smoke in the wild in Jetson Xavier nx and Jetson nano You can see

20 Dec 15, 2022
Supplementary code for the experiments described in the 2021 ISMIR submission: Leveraging Hierarchical Structures for Few Shot Musical Instrument Recognition.

Music Trees Supplementary code for the experiments described in the 2021 ISMIR submission: Leveraging Hierarchical Structures for Few Shot Musical Ins

Hugo Flores García 32 Nov 22, 2022
Vehicle detection using machine learning and computer vision techniques for Udacity's Self-Driving Car Engineer Nanodegree.

Vehicle Detection Video demo Overview Vehicle detection using these machine learning and computer vision techniques. Linear SVM HOG(Histogram of Orien

hata 1.1k Dec 18, 2022
Script utilizando OpenCV e modelo Machine Learning para detectar o uso de máscaras.

Reconhecendo máscaras Este repositório contém um script em Python3 que reconhece se um rosto está ou não portando uma máscara! O código utiliza da bib

Maria Eduarda de Azevedo Silva 168 Oct 20, 2022
QMagFace: Simple and Accurate Quality-Aware Face Recognition

Quality-Aware Face Recognition 26.11.2021 start readme QMagFace: Simple and Accurate Quality-Aware Face Recognition Research Paper Implementation - To

Philipp Terhörst 59 Jan 04, 2023
This repository introduces a short project about Transfer Learning for Classification of MRI Images.

Transfer Learning for MRI Images Classification This repository introduces a short project made during my stay at Neuromatch Summer School 2021. This

Oscar Guarnizo 3 Nov 15, 2022
Navigating StyleGAN2 w latent space using CLIP

Navigating StyleGAN2 w latent space using CLIP an attempt to build sth with the official SG2-ADA Pytorch impl kinda inspired by Generating Images from

Mike K. 55 Dec 06, 2022
TEDSummary is a speech summary corpus. It includes TED talks subtitle (Document), Title-Detail (Summary), speaker name (Meta info), MP4 URL, and utterance id

TEDSummary is a speech summary corpus. It includes TED talks subtitle (Document), Title-Detail (Summary), speaker name (Meta info), MP4 URL

3 Dec 26, 2022
Computer-Vision-Paper-Reviews - Computer Vision Paper Reviews with Key Summary along Papers & Codes

Computer-Vision-Paper-Reviews Computer Vision Paper Reviews with Key Summary along Papers & Codes. Jonathan Choi 2021 50+ Papers across Computer Visio

Jonathan Choi 2 Mar 17, 2022
Contrastive Language-Image Pretraining

CLIP [Blog] [Paper] [Model Card] [Colab] CLIP (Contrastive Language-Image Pre-Training) is a neural network trained on a variety of (image, text) pair

OpenAI 11.5k Jan 08, 2023
End-to-end image segmentation kit based on PaddlePaddle.

English | 简体中文 PaddleSeg PaddleSeg has released the new version including the following features: Our team won the 6.2k Jan 02, 2023

Code, Data and Demo for Paper: Controllable Generation from Pre-trained Language Models via Inverse Prompting

InversePrompting Paper: Controllable Generation from Pre-trained Language Models via Inverse Prompting Code: The code is provided in the "chinese_ip"

THUDM 101 Dec 16, 2022
Official implementation of "Membership Inference Attacks Against Self-supervised Speech Models"

Introduction Official implementation of "Membership Inference Attacks Against Self-supervised Speech Models". In this work, we demonstrate that existi

Wei-Cheng Tseng 7 Nov 01, 2022