A Pytorch implementation of MoveNet from Google. Include training code and pre-train model.

Overview

Movenet.Pytorch

license

Intro

start

MoveNet is an ultra fast and accurate model that detects 17 keypoints of a body. This is A Pytorch implementation of MoveNet from Google. Include training code and pre-train model.

Google just release pre-train models(tfjs or tflite), which cannot be converted to some CPU inference framework such as NCNN,Tengine,MNN,TNN, and we can not add our own custom data to finetune, so there is this repo.

How To Run

1.Download COCO dataset2017 from https://cocodataset.org/. (You need train2017.zip, val2017.zip and annotations.)Unzip to movenet.pytorch/data/ like this:

├── data
    ├── annotations (person_keypoints_train2017.json, person_keypoints_val2017.json, ...)
    ├── train2017   (xx.jpg, xx.jpg,...)
    └── val2017     (xx.jpg, xx.jpg,...)

2.Make data to our data format.

python scripts/make_coco_data_17keypooints.py
Our data format: JSON file
Keypoints order:['nose', 'left_eye', 'right_eye', 'left_ear', 'right_ear', 
    'left_shoulder', 'right_shoulder', 'left_elbow', 'right_elbow', 'left_wrist', 
    'right_wrist', 'left_hip', 'right_hip', 'left_knee', 'right_knee', 'left_ankle', 
    'right_ankle']

One item:
[{"img_name": "0.jpg",
  "keypoints": [x0,y0,z0,x1,y1,z1,...],
  #z: 0 for no label, 1 for labeled but invisible, 2 for labeled and visible
  "center": [x,y],
  "bbox":[x0,y0,x1,y1],
  "other_centers": [[x0,y0],[x1,y1],...],
  "other_keypoints": [[[x0,y0],[x1,y1],...],[[x0,y0],[x1,y1],...],...], #lenth = num_keypoints
 },
 ...
]

3.You can add your own data to the same format.

4.After putting data at right place, you can start training

python train.py

5.After training finished, you need to change the test model path to test. Such as this in predict.py

run_task.modelLoad("output/xxx.pth")

6.run predict to show predict result, or run evaluate.py to compute my acc on test dataset.

python predict.py

7.Convert to onnx.

python pth2onnx.py

Training Results

Some good samples

good

Some bad cases

bad

Tips to improve

1. Focus on data

  • Add COCO2014. (But as I know it has some duplicate data of COCO2017, and I don't know if google use it.)
  • Clean the croped COCO2017 data. (Some img just have little points, such as big face, big body,etc.MoveNet is a small network, COCO data is a little hard for it.)
  • Add some yoga, fitness, and dance videos frame from YouTube. (Highly Recommened! Cause Google did this on their Movenet and said 'Evaluations on the Active validation dataset show a significant performance boost relative to identical architectures trained using only COCO. ')

2. Change backbone

Try to ransfer Mobilenetv2(original Movenet) to Mobilenetv3 or Shufflenetv2 may get a litte improvement.If you just wanna reproduce the original Movenet, u can ignore this.

3. More fancy loss

Surely this is a muti-task learning. So add some loss to learn together may improve the performence. (Such as BoneLoss which I have added.) And we can never know how Google trained, cause we cannot see it from the pre-train tflite model file, so you can try any loss function you like.

4. Data Again

I just wanna you know the importance of the data. The more time you spend on clean data and add new data, the better performance your model will get! (While tips 2 and 3 may not.)

Resource

  1. Blog:Next-Generation Pose Detection with MoveNet and TensorFlow.js
  2. model card
  3. TFHub:movenet/singlepose/lightning
  4. My article share: 2021轻量级人体姿态估计模型修炼之路(附谷歌MoveNet复现经验)
Owner
Mr.Fire
Mr.Fire
Attendance Monitoring with Face Recognition using Python

Attendance Monitoring with Face Recognition using Python A python GUI integrated attendance system using face recognition to take attendance. In this

Vaibhav Rajput 2 Jun 21, 2022
Breaking the Curse of Space Explosion: Towards Efficient NAS with Curriculum Search

Breaking the Curse of Space Explosion: Towards Effcient NAS with Curriculum Search Pytorch implementation for "Breaking the Curse of Space Explosion:

guoyong 17 Jan 03, 2023
Experiments for Operating Systems Lab (ETCS-352)

Operating Systems Lab (ETCS-352) Experiments for Operating Systems Lab (ETCS-352) performed by me in 2021 at uni. All codes are written by me except t

Deekshant Wadhwa 0 Sep 06, 2022
Segmentation Training Pipeline

Segmentation Training Pipeline This package is a part of Musket ML framework. Reasons to use Segmentation Pipeline Segmentation Pipeline was developed

Musket ML 52 Dec 12, 2022
This is the official implementation for the paper "(Almost) Free Incentivized Exploration from Decentralized Learning Agents" in NeurIPS 2021.

Observe then Incentivize Experiments This is the code used for the paper "(Almost) Free Incentivized Exploration from Decentralized Learning Agents",

Cong Shen Research Group 0 Mar 08, 2022
FedGS: A Federated Group Synchronization Framework Implemented by LEAF-MX.

FedGS: Data Heterogeneity-Robust Federated Learning via Group Client Selection in Industrial IoT Preparation For instructions on generating data, plea

Lizonghang 9 Dec 22, 2022
Supervised Classification from Text (P)

MSc-Thesis Module: Masters Research Thesis Language: Python Grade: 75 Title: An investigation of supervised classification of therapeutic process from

Matthew Laws 1 Nov 22, 2021
Code for Max-Margin Contrastive Learning - AAAI 2022

Max-Margin Contrastive Learning This is a pytorch implementation for the paper Max-Margin Contrastive Learning accepted to AAAI 2022. This repository

Anshul Shah 12 Oct 22, 2022
Probabilistic Tensor Decomposition of Neural Population Spiking Activity

Probabilistic Tensor Decomposition of Neural Population Spiking Activity Matlab (recommended) and Python (in developement) implementations of Soulat e

Hugo Soulat 6 Nov 30, 2022
RealTime Emotion Recognizer for Machine Learning Study Jam's demo

Emotion recognizer Table of contents Clone project Dataset Install dependencies Main program Demo 1. Clone project git clone https://github.com/GDSC20

Google Developer Student Club - UIT 1 Oct 05, 2021
Few-Shot Object Detection via Association and DIscrimination

Few-Shot Object Detection via Association and DIscrimination Code release of our NeurIPS 2021 paper: Few-Shot Object Detection via Association and DIs

Cao Yuhang 49 Dec 18, 2022
HomoInterpGAN - Homomorphic Latent Space Interpolation for Unpaired Image-to-image Translation

HomoInterpGAN Homomorphic Latent Space Interpolation for Unpaired Image-to-image Translation (CVPR 2019, oral) Installation The implementation is base

Ying-Cong Chen 99 Nov 15, 2022
Hydra Lightning Template for Structured Configs

Hydra Lightning Template for Structured Configs Template for creating projects with pytorch-lightning and hydra. How to use this template? Create your

Model-driven Machine Learning 4 Jul 19, 2022
Bayesian regularization for functional graphical models.

BayesFGM Paper: Jiajing Niu, Andrew Brown. Bayesian regularization for functional graphical models. Requirements R version 3.6.3 and up Python 3.6 and

0 Oct 07, 2021
Simple streamlit app to demonstrate HERE Tour Planning

Table of Contents About the Project Built With Getting Started Prerequisites Installation Usage Roadmap Contributing License Acknowledgements About Th

Amol 8 Sep 05, 2022
Nvidia Semantic Segmentation monorepo

Paper | YouTube | Cityscapes Score Pytorch implementation of our paper Hierarchical Multi-Scale Attention for Semantic Segmentation. Please refer to t

NVIDIA Corporation 1.6k Jan 04, 2023
Convert human motion from video to .bvh

video_to_bvh Convert human motion from video to .bvh with Google Colab Usage 1. Open video_to_bvh.ipynb in Google Colab Go to https://colab.research.g

Dene 306 Dec 10, 2022
Weakly-Supervised Semantic Segmentation Network with Deep Seeded Region Growing (CVPR 2018).

Weakly-Supervised Semantic Segmentation Network with Deep Seeded Region Growing (CVPR2018) By Zilong Huang, Xinggang Wang, Jiasi Wang, Wenyu Liu and J

Zilong Huang 245 Dec 13, 2022
SpeechNAS Better Trade off between Latency and Accuracy for Large Scale Speaker Verification

SpeechNAS Better Trade off between Latency and Accuracy for Large Scale Speaker Verification

Wentao Zhu 24 May 20, 2022
Official implementation of Sparse Transformer-based Action Recognition

STAR Official implementation of S parse T ransformer-based A ction R ecognition Dataset download NTU RGB+D 60 action recognition of 2D/3D skeleton fro

Chonghan_Lee 15 Nov 02, 2022