Simple Pose: Rethinking and Improving a Bottom-up Approach for Multi-Person Pose Estimation

Overview

SimplePose

Code and pre-trained models for our paper, “Simple Pose: Rethinking and Improving a Bottom-up Approach for Multi-Person Pose Estimation”, accepted by AAAI-2020.

Also this repo serves as the Part B of our paper "Multi-Person Pose Estimation Based on Gaussian Response Heatmaps" (under review). The Part A is available at this link.

  • Update

    A faster project is to be released.

Introduction

A bottom-up approach for the problem of multi-person pose estimation.

heatmap

network

Contents

  1. Training
  2. Evaluation
  3. Demo

Project Features

  • Implement the models using Pytorch in auto mixed-precision (using Nvidia Apex).
  • Support training on multiple GPUs (over 90% GPU usage rate on each GPU card).
  • Fast data preparing and augmentation during training (generating about 40 samples per second on signle CPU process and much more if wrapped by DataLoader Class).
  • Focal L2 loss. FL2
  • Multi-scale supervision.
  • This project can also serve as a detailed practice to the green hand in Pytorch.

Prepare

  1. Install packages:

    Python=3.6, Pytorch>1.0, Nvidia Apex and other packages needed.

  2. Download the COCO dataset.

  3. Download the pre-trained models (default configuration: download the pretrained model snapshotted at epoch 52 provided as follow).

    Download Link: BaiduCloud

    Alternatively, download the pre-trained model without optimizer checkpoint only for the default configuration via GoogleDrive

  4. Change the paths in the code according to your environment.

Run a Demo

python demo_image.py

examples

Inference Speed

The speed of our system is tested on the MS-COCO test-dev dataset.

  • Inference speed of our 4-stage IMHN with 512 × 512 input on one 2080TI GPU: 38.5 FPS (100% GPU-Util).
  • Processing speed of the keypoint assignment algorithm part that is implemented in pure Python and a single process on Intel Xeon E5-2620 CPU: 5.2 FPS (has not been well accelerated).

Evaluation Steps

The corresponding code is in pure python without multiprocess for now.

python evaluate.py

Results on MSCOCO 2017 test-dev subset (focal L2 loss with gamma=2):

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets= 20 ] = 0.685
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets= 20 ] = 0.867
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets= 20 ] = 0.749
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.664
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.719
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 20 ] = 0.728
 Average Recall     (AR) @[ IoU=0.50      | area=   all | maxDets= 20 ] = 0.892
 Average Recall     (AR) @[ IoU=0.75      | area=   all | maxDets= 20 ] = 0.782
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.688
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.784

Training Steps

Before training, prepare the training data using ''SimplePose/data/coco_masks_hdf5.py''.

Multiple GUPs are recommended to use to speed up the training process, but we support different training options.

  • Most code has been provided already, you can train the model with.

    1. 'train.py': single training process on one GPU only.
    2. 'train_parallel.py': signle training process on multiple GPUs using Dataparallel.
    3. 'train_distributed.py' (recommended): multiple training processes on multiple GPUs using Distributed Training:
python -m torch.distributed.launch --nproc_per_node=4 train_distributed.py

Note: The loss_model_parrel.py is for train.py and train_parallel.py, while the loss_model.py is for train_distributed.py and train_distributed_SWA.py. They are different in dividing the batch size. Please refer to the code about the different choices.

For distributed training, the real batch_size = batch_size_in_config* × GPU_Num (world_size actually). For others, the real batch_size = batch_size_in_config*. The differences come from the different mechanisms of data parallel training and distributed training.

Referred Repositories (mainly)

Recommend Repositories

Faster Version: Chun-Ming Su has rebuilt and improved the post-processing speed of this repo using C++, and the improved system can run up to 7~8 FPS using a single scale with flipping on a 2080 TI GPU. Many thanks to Chun-Ming Su.

Citation

Please kindly cite this paper in your publications if it helps your research.

@inproceedings{li2020simple,
  title={Simple Pose: Rethinking and Improving a Bottom-up Approach for Multi-Person Pose Estimation.},
  author={Li, Jia and Su, Wen and Wang, Zengfu},
  booktitle={AAAI},
  pages={11354--11361},
  year={2020}
}
Repository for "Improving evidential deep learning via multi-task learning," published in AAAI2022

Improving evidential deep learning via multi task learning It is a repository of AAAI2022 paper, “Improving evidential deep learning via multi-task le

deargen 11 Nov 19, 2022
Code repository for our paper "Learning to Generate Scene Graph from Natural Language Supervision" in ICCV 2021

Scene Graph Generation from Natural Language Supervision This repository includes the Pytorch code for our paper "Learning to Generate Scene Graph fro

Yiwu Zhong 64 Dec 24, 2022
An investigation project for SISR.

SISR-Survey An investigation project for SISR. This repository is an official project of the paper "From Beginner to Master: A Survey for Deep Learnin

Juncheng Li 79 Oct 20, 2022
The open source code of SA-UNet: Spatial Attention U-Net for Retinal Vessel Segmentation.

SA-UNet: Spatial Attention U-Net for Retinal Vessel Segmentation(ICPR 2020) Overview This code is for the paper: Spatial Attention U-Net for Retinal V

Changlu Guo 151 Dec 28, 2022
Sharing of contents on mitochondrial encounter networks

mito-network-sharing Sharing of contents on mitochondrial encounter networks Required: R with igraph, brainGraph, ggplot2, and XML libraries; igraph l

Stochastic Biology Group 0 Oct 01, 2021
Official PyTorch implementation of our AAAI22 paper: TransMEF: A Transformer-Based Multi-Exposure Image Fusion Framework via Self-Supervised Multi-Task Learning. Code will be available soon.

Official-PyTorch-Implementation-of-TransMEF Official PyTorch implementation of our AAAI22 paper: TransMEF: A Transformer-Based Multi-Exposure Image Fu

117 Dec 27, 2022
MNE: Magnetoencephalography (MEG) and Electroencephalography (EEG) in Python

MNE-Python MNE-Python software is an open-source Python package for exploring, visualizing, and analyzing human neurophysiological data such as MEG, E

MNE tools for MEG and EEG data analysis 2.1k Dec 28, 2022
Annealed Flow Transport Monte Carlo

Annealed Flow Transport Monte Carlo Open source implementation accompanying ICML 2021 paper by Michael Arbel*, Alexander G. D. G. Matthews* and Arnaud

DeepMind 30 Nov 21, 2022
Storage-optimizer - Identify potintial optimizations on the cloud storage accounts

Storage Optimizer Identify potintial optimizations on the cloud storage accounts

Zaher Mousa 1 Feb 13, 2022
g9.py - Torch interactive graphics

g9.py - Torch interactive graphics A Torch toy in the browser. Demo at https://srush.github.io/g9py/ This is a shameless copy of g9.js, written in Pyt

Sasha Rush 13 Nov 16, 2022
Wav2Vec for speech recognition, classification, and audio classification

Soxan در زبان پارسی به نام سخن This repository consists of models, scripts, and notebooks that help you to use all the benefits of Wav2Vec 2.0 in your

Mehrdad Farahani 140 Dec 15, 2022
PyTorch implementation of Asymmetric Siamese (https://arxiv.org/abs/2204.00613)

Asym-Siam: On the Importance of Asymmetry for Siamese Representation Learning This is a PyTorch implementation of the Asym-Siam paper, CVPR 2022: @inp

Meta Research 89 Dec 18, 2022
NeuralTalk is a Python+numpy project for learning Multimodal Recurrent Neural Networks that describe images with sentences.

#NeuralTalk Warning: Deprecated. Hi there, this code is now quite old and inefficient, and now deprecated. I am leaving it on Github for educational p

Andrej 5.3k Jan 07, 2023
[ICRA 2022] An opensource framework for cooperative detection. Official implementation for OPV2V.

OpenCOOD OpenCOOD is an Open COOperative Detection framework for autonomous driving. It is also the official implementation of the ICRA 2022 paper OPV

Runsheng Xu 322 Dec 23, 2022
Generate Cartoon Images using Generative Adversarial Network

AvatarGAN ✨ Generate Cartoon Images using DC-GAN Deep Convolutional GAN is a generative adversarial network architecture. It uses a couple of guidelin

Aakash Jhawar 50 Dec 29, 2022
Diverse Object-Scene Compositions For Zero-Shot Action Recognition

Diverse Object-Scene Compositions For Zero-Shot Action Recognition This repository contains the source code for the use of object-scene compositions f

7 Sep 21, 2022
A Haskell kernel for IPython.

IHaskell You can now try IHaskell directly in your browser at CoCalc or mybinder.org. Alternatively, watch a talk and demo showing off IHaskell featur

Andrew Gibiansky 2.4k Dec 29, 2022
Unofficial Implementation of RobustSTL: A Robust Seasonal-Trend Decomposition Algorithm for Long Time Series (AAAI 2019)

RobustSTL: A Robust Seasonal-Trend Decomposition Algorithm for Long Time Series (AAAI 2019) This repository contains python (3.5.2) implementation of

Doyup Lee 222 Dec 21, 2022
A Python reference implementation of the CF data model

cfdm A Python reference implementation of the CF data model. References Compliance with FAIR principles Documentation https://ncas-cms.github.io/cfdm

NCAS CMS 25 Dec 13, 2022
Code for paper Novel View Synthesis via Depth-guided Skip Connections

Novel View Synthesis via Depth-guided Skip Connections Code for paper Novel View Synthesis via Depth-guided Skip Connections @InProceedings{Hou_2021_W

8 Mar 14, 2022