PyTorch reimplementation of minimal-hand (CVPR2020)

Overview

Minimal Hand Pytorch

Unofficial PyTorch reimplementation of minimal-hand (CVPR2020).

The demo is also available on YouTube and bilibili.

This project reimplements the following components:

  1. Training (DetNet) and Evaluation Code
  2. Shape Estimation
  3. Pose Estimation: instead of the IKNet from the original paper, an analytical inverse kinematics method is used.

Official project link: [minimal-hand]

Update

  • 2021/03/09 Updated utils/LM.py; time cost dropped from 12 s/item to 1.57 s/item.

  • 2021/03/12 Updated utils/LM.py; time cost dropped from 1.57 s/item to 0.27 s/item.

  • 2021/03/17 Realtime performance is achieved when using PSO to estimate shape; coming soon.

  • 2021/03/20 Added PSO to estimate shape. AUC decreased by about 0.01 on the STB and RHD datasets and increased slightly on the EO and DO datasets. Modified utils/vis.py to improve realtime performance.

  • 2021/03/24 Fixed some errors in calculating AUC. Updated the 3D PCK AUC Difference.

Usage

  • Retrieve the code
git clone https://github.com/MengHao666/Minimal-Hand-pytorch
cd Minimal-Hand-pytorch
  • Create and activate the virtual environment with python dependencies
conda env create --file=environment.yml
conda activate minimal-hand-torch

Prepare MANO hand model

  1. Download MANO model from here and unzip it.

  2. Create an account by clicking Sign Up and provide your information

  3. Download Models and Code (the downloaded file should have the format mano_v*_*.zip). Note that all code and data from this download falls under the MANO license.

  4. Unzip the file and copy the contents of the models folder into the mano folder.

  5. Your structure should look like this:

Minimal-Hand-pytorch/
   mano/
      models/
      webuser/

Download and Prepare datasets

Training dataset

Evaluation dataset

Processing

  • Create a data directory and extract all of the above datasets and additional materials into it.

Now your data folder structure should look like this:

data/

    CMU/
        hand143_panopticdb/
            datasets/
            ...
        hand_labels/
            datasets/
            ...

    RHD/
        RHD_published_v2/
            evaluation/
            training/
            view_sample.py
            ...

    GANeratedHands_Release/
        data/
        ...

    STB/
        images/
            B1Counting/
                SK_color_0.png
                SK_depth_0.png
                SK_depth_seg_0.png  <-- merged from STB_supp
                ...
            ...
        labels/
            B1Counting_BB.mat
            ...

    dexter+object/
        calibration/
        bbox_dexter+object.csv
        DO_pred_2d.npy
        data/
            Grasp1/
                annotations/
                    Grasp13D.txt
                    my_Grasp13D.txt
                    ...
                ...
            Grasp2/
                annotations/
                    Grasp23D.txt
                    my_Grasp23D.txt
                    ...
                ...
            Occlusion/
                annotations/
                    Occlusion3D.txt
                    my_Occlusion3D.txt
                    ...
                ...
            Pinch/
                annotations/
                    Pinch3D.txt
                    my_Pinch3D.txt
                    ...
                ...
            Rigid/
                annotations/
                    Rigid3D.txt
                    my_Rigid3D.txt
                    ...
                ...
            Rotate/
                annotations/
                    Rotate3D.txt
                    my_Rotate3D.txt
                    ...
                ...
        

    EgoDexter/
        preview/
        data/
            Desk/
                annotation.txt_3D.txt
                my_annotation.txt_3D.txt
                ...
            Fruits/
                annotation.txt_3D.txt
                my_annotation.txt_3D.txt
                ...
            Kitchen/
                annotation.txt_3D.txt
                my_annotation.txt_3D.txt
                ...
            Rotunda/
                annotation.txt_3D.txt
                my_annotation.txt_3D.txt
                ...
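
Before training, it can help to confirm that the data folder matches the tree above. The following is a minimal sketch, not part of this repository: the helper name missing_dirs and the list EXPECTED_DIRS are hypothetical, and the list only covers the top-level folders shown in the tree.

```python
from pathlib import Path

# Sub-directories copied from the tree above (hypothetical list; extend as needed).
EXPECTED_DIRS = [
    "CMU/hand143_panopticdb",
    "CMU/hand_labels",
    "RHD/RHD_published_v2/evaluation",
    "RHD/RHD_published_v2/training",
    "GANeratedHands_Release/data",
    "STB/images",
    "STB/labels",
    "dexter+object/calibration",
    "dexter+object/data",
    "EgoDexter/data",
]


def missing_dirs(data_root):
    """Return the expected sub-directories that are absent under data_root."""
    root = Path(data_root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs("data")
    if missing:
        print("Missing directories:")
        for d in missing:
            print("  " + d)
    else:
        print("All expected dataset directories were found.")
```

Run it from the repository root so that the relative path data/ resolves correctly.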
        

Note

  • All code and data from these downloads fall under their own licenses.
  • DO denotes the "dexter+object" dataset; EO denotes the "EgoDexter" dataset.
  • DO_supp and EO_supp are modified from the original ones.
  • DO_pred_2d.npy contains 2D predictions from the 2D part of DetNet.
  • Some labels of DO and EO are obviously wrong when projected onto the image plane (you can find examples with the original labels in dexter_object.py or egodexter.py) and should therefore be omitted. The files my_{}3D.txt and my_annotation.txt_3D.txt are provided for this reason.
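
One way to spot such labels is to project the 3D joints with the camera intrinsics and check whether they land inside the image. The sketch below illustrates that idea only; the exact criterion used to produce the my_* files is not stated here, and the function names, the intrinsic matrix K, and the image size are all assumptions.

```python
import numpy as np


def project_points(joints_3d, K):
    """Project (N, 3) camera-space points to (N, 2) pixel coordinates."""
    uvw = joints_3d @ K.T            # homogeneous image coordinates
    return uvw[:, :2] / uvw[:, 2:3]  # perspective divide


def label_is_plausible(joints_3d, K, width, height):
    """True if every joint is in front of the camera and projects inside the image."""
    joints_3d = np.asarray(joints_3d, dtype=float)
    if np.any(joints_3d[:, 2] <= 0):
        return False
    uv = project_points(joints_3d, K)
    inside = (
        (uv[:, 0] >= 0) & (uv[:, 0] < width)
        & (uv[:, 1] >= 0) & (uv[:, 1] < height)
    )
    return bool(inside.all())


if __name__ == "__main__":
    # Hypothetical intrinsics for a 640x480 camera, for illustration only.
    K = np.array([[600.0, 0.0, 320.0],
                  [0.0, 600.0, 240.0],
                  [0.0, 0.0, 1.0]])
    good = np.array([[0.0, 0.0, 500.0]])
    bad = np.array([[1000.0, 0.0, 500.0]])
    print(label_is_plausible(good, K, 640, 480))
    print(label_is_plausible(bad, K, 640, 480))
```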

Download my Results

Realtime demo

python demo.py

DetNet Training and Evaluation

Run the training code

python train_detnet.py --data_root data/

Run the evaluation code

python train_detnet.py --data_root data/  --datasets_test testset_name_to_test   --evaluate  --evaluate_id checkpoints_id_to_load 

or use my results

python train_detnet.py --checkpoint my_results/checkpoints  --datasets_test "rhd" --evaluate  --evaluate_id 106

python train_detnet.py --checkpoint my_results/checkpoints  --datasets_test "stb" --evaluate  --evaluate_id 71

python train_detnet.py --checkpoint my_results/checkpoints  --datasets_test "do" --evaluate  --evaluate_id 68

python train_detnet.py --checkpoint my_results/checkpoints  --datasets_test "eo" --evaluate  --evaluate_id 101
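
The four evaluation commands above can also be run in a loop. This is a small convenience sketch, not a script shipped with the repository; build_command and run_all are hypothetical names, and the dataset names and checkpoint ids are copied from the commands above.

```python
import subprocess
import sys

# (dataset name, checkpoint id) pairs copied from the commands above.
EVAL_RUNS = [("rhd", 106), ("stb", 71), ("do", 68), ("eo", 101)]


def build_command(dataset, evaluate_id, checkpoint="my_results/checkpoints"):
    """Assemble one evaluation command as an argument list."""
    return [
        sys.executable, "train_detnet.py",
        "--checkpoint", checkpoint,
        "--datasets_test", dataset,
        "--evaluate",
        "--evaluate_id", str(evaluate_id),
    ]


def run_all(dry_run=True):
    """Print every command; execute it as well when dry_run is False."""
    for dataset, evaluate_id in EVAL_RUNS:
        cmd = build_command(dataset, evaluate_id)
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_all(dry_run=True)
```

Set dry_run=False to actually launch the evaluations, one after another.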

Shape Estimation

Run the shape optimization code. This can be very time-consuming when the weight parameter is small.

python optimize_shape.py --weight 1e-5

or use my results

python optimize_shape.py --path my_results/out_testset/
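
For readers unfamiliar with PSO (particle swarm optimization), the sketch below shows a generic global-best PSO loop on a toy problem. It is not the code in this repository: the linear map A from a 10-D shape vector to bone lengths, the regularization weight, and all hyperparameters are assumptions chosen for illustration.

```python
import numpy as np


def pso_minimize(objective, dim, n_particles=30, n_iters=100,
                 bounds=(-3.0, 3.0), inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize objective over a box with a basic global-best particle swarm."""
    rng = np.random.default_rng(seed)
    low, high = bounds
    pos = rng.uniform(low, high, size=(n_particles, dim))
    vel = np.zeros_like(pos)
    best_pos = pos.copy()                               # per-particle best position
    best_val = np.array([objective(p) for p in pos])    # per-particle best value
    g = best_pos[np.argmin(best_val)].copy()            # swarm-wide best position
    g_val = float(best_val.min())

    for _ in range(n_iters):
        r1 = rng.random(pos.shape)
        r2 = rng.random(pos.shape)
        # Pull each particle toward its own best and toward the swarm best.
        vel = inertia * vel + c1 * r1 * (best_pos - pos) + c2 * r2 * (g - pos)
        pos = np.clip(pos + vel, low, high)
        vals = np.array([objective(p) for p in pos])
        improved = vals < best_val
        best_pos[improved] = pos[improved]
        best_val[improved] = vals[improved]
        if best_val.min() < g_val:
            g_val = float(best_val.min())
            g = best_pos[np.argmin(best_val)].copy()
    return g, g_val


if __name__ == "__main__":
    # Toy stand-in for shape fitting: a hypothetical linear bone-length model.
    rng = np.random.default_rng(1)
    A = rng.normal(size=(20, 10))
    target = A @ rng.uniform(-1.0, 1.0, size=10)

    def objective(beta, weight=1e-5):
        residual = A @ beta - target
        return float(residual @ residual + weight * beta @ beta)

    beta, value = pso_minimize(objective, dim=10)
    print("best objective value:", value)
```

PSO needs no gradients, so each iteration costs only objective evaluations, which is one reason it can be attractive for realtime use.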

Pose Estimation

Run the following code, which uses an analytical inverse kinematics method.

python aik_pose.py

or use my results

python aik_pose.py --path my_results/out_testset/
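
A common building block of analytical inverse kinematics is the rotation that turns a bone's rest-pose direction onto the direction observed in the predicted joints. The sketch below computes that rotation with Rodrigues' formula. It is an illustration only and is not taken from utils/AIK.py; the function name rotation_between is hypothetical.

```python
import numpy as np


def rotation_between(a, b, eps=1e-8):
    """Return the 3x3 rotation matrix that turns direction a onto direction b."""
    a = np.asarray(a, dtype=float) / np.linalg.norm(a)
    b = np.asarray(b, dtype=float) / np.linalg.norm(b)
    v = np.cross(a, b)          # rotation axis scaled by sin(angle)
    c = float(np.dot(a, b))     # cos(angle)
    s = np.linalg.norm(v)       # sin(angle)
    if s < eps:
        if c > 0:
            return np.eye(3)    # directions already aligned
        # Opposite directions: rotate by pi about any axis perpendicular to a.
        helper = np.array([1.0, 0.0, 0.0]) if abs(a[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
        axis = np.cross(a, helper)
        axis /= np.linalg.norm(axis)
        return 2.0 * np.outer(axis, axis) - np.eye(3)
    vx = np.array([[0.0, -v[2], v[1]],
                   [v[2], 0.0, -v[0]],
                   [-v[1], v[0], 0.0]])
    # Rodrigues' formula expressed with the unnormalized axis v.
    return np.eye(3) + vx + (vx @ vx) * ((1.0 - c) / s ** 2)


if __name__ == "__main__":
    rest_dir = np.array([0.0, 0.0, 1.0])     # bone direction in the rest pose
    target_dir = np.array([0.0, 1.0, 0.0])   # bone direction from predicted joints
    R = rotation_between(rest_dir, target_dir)
    print(R @ rest_dir)
```

A single direction leaves the twist around the bone axis undetermined, so a full solver needs additional constraints; this sketch does not address that.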

DetNet training and evaluation curves

Run the following code to see my results

python plot.py --path my_results/out_loss_auc

(AUC refers to 3D PCK, and ACC_HM refers to 2D PCK.)
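
For reference, PCK (percentage of correct keypoints) at a threshold is the fraction of joints whose error is within that threshold, and AUC is the normalized area under the PCK curve over a threshold range. The sketch below is a minimal illustration, not the code in utils/eval/zimeval.py; the 20 to 50 mm range and the number of steps are assumptions.

```python
import numpy as np


def pck_curve(errors, thresholds):
    """Fraction of joint errors at or below each threshold."""
    errors = np.asarray(errors, dtype=float).ravel()
    return np.array([(errors <= t).mean() for t in thresholds])


def pck_auc(errors, t_min=20.0, t_max=50.0, steps=100):
    """Area under the PCK curve, normalized so a perfect result scores 1."""
    thresholds = np.linspace(t_min, t_max, steps)
    pck = pck_curve(errors, thresholds)
    # Trapezoidal rule written out explicitly.
    area = np.sum((pck[1:] + pck[:-1]) * np.diff(thresholds)) / 2.0
    return float(area / (t_max - t_min))


if __name__ == "__main__":
    # Hypothetical per-joint Euclidean errors in millimetres.
    errors_mm = np.array([5.0, 12.0, 25.0, 40.0, 80.0])
    print(pck_auc(errors_mm))
```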

3D PCK AUC Difference

* means this project

Dataset  DetNet(paper)  DetNet(*)  DetNet+IKNet(paper)  DetNet+LM+AIK(*)  DetNet+PSO+AIK(*)
RHD      -              0.9339     0.856                0.9301            0.9310
STB      0.891          0.8744     0.898                0.8647            0.8671
DO       0.923          0.9378     0.948                0.9392            0.9342
EO       0.804          0.9270     0.811                0.9288            0.9277
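
The differences implied by the table can be computed directly. The sketch below compares the DetNet(paper) and DetNet(*) columns for the datasets where both values are listed; the dictionary and function names are hypothetical, and the numbers are copied from the table above.

```python
# AUC values copied from the DetNet(paper) and DetNet(*) columns of the table.
PAPER_DETNET = {"STB": 0.891, "DO": 0.923, "EO": 0.804}
THIS_DETNET = {"RHD": 0.9339, "STB": 0.8744, "DO": 0.9378, "EO": 0.9270}


def auc_differences(ours, paper):
    """Return ours minus paper for every dataset present in both mappings."""
    return {name: round(ours[name] - paper[name], 4)
            for name in paper if name in ours}


if __name__ == "__main__":
    for name, diff in auc_differences(THIS_DETNET, PAPER_DETNET).items():
        print(f"{name}: {diff:+.4f}")
```

RHD is skipped because the table lists no DetNet(paper) value for it.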

Note

  • Careful tuning of the training parameters, longer training, a more complex network, or biomechanical constraint losses could further boost accuracy.
  • As there is no official open-source release of the original paper's code, the above comparison is only approximate.

Citation

This is the unofficial PyTorch reimplementation of the paper "Monocular Real-time Hand Shape and Motion Capture using Multi-modal Data" (CVPR 2020).

If you find the project helpful, please star this project and cite the original paper:

@inproceedings{zhou2020monocular,
  title={Monocular Real-time Hand Shape and Motion Capture using Multi-modal Data},
  author={Zhou, Yuxiao and Habermann, Marc and Xu, Weipeng and Habibie, Ikhsanul and Theobalt, Christian and Xu, Feng},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2020}
}

Acknowledgement

  • The MANO PyTorch layer code was adapted from manopth.

  • The code for evaluating hand PCK and AUC in utils/eval/zimeval.py was adapted from hand3d.

  • Parts of the data augmentation, dataset parsing, and utility code were adapted from bihand and 3D-Hand-Pose-Estimation.

  • The network model code was adapted from Minimal-Hand.

  • @Mrsirovo wrote the starter code of utils/LM.py; @maitetsu updated it later.

  • @maitetsu wrote the starter code of utils/AIK.py.

Owner
Hao Meng
Master's student at Beihang University, mainly interested in hand pose estimation. (LOOKING FOR RESEARCH INTERNSHIP NOW.)