PyTorch reimplementation of minimal-hand (CVPR2020)

Overview

Minimal Hand Pytorch

Unofficial PyTorch reimplementation of minimal-hand (CVPR2020).

demo demo

you can also find in youtube or bilibili

This project reimplement following components :

  1. Training (DetNet) and Evaluation Code
  2. Shape Estimation
  3. Pose Estimation: Instead of IKNet in original paper, an analytical inverse kinematics method is used.

Offical project link: [minimal-hand]

Update

  • 2021/03/09 update about utils/LM.py, time cost drop from 12s/item to 1.57s/item

  • 2021/03/12 update about utils/LM.py, time cost drop from 1.57s/item to 0.27s/item

  • 2021/03/17 realtime perfomance is achieved when using PSO to estimate shape, coming soon

  • 2021/03/20 Add PSO to estimate shape. AUC is decreased by about 0.01 on STB and RHD datasets, and increased a little on EO and do datasets. Modifiy utlis/vis.py to improve realtime perfomance

  • 2021/03/24 Fixed some errors in calculating AUC. Update the 3D PCK AUC Diffenence.

Usage

  • Retrieve the code
git clone https://github.com/MengHao666/Minimal-Hand-pytorch
cd Minimal-Hand-pytorch
  • Create and activate the virtual environment with python dependencies
conda env create --file=environment.yml
conda activate minimal-hand-torch

Prepare MANO hand model

  1. Download MANO model from here and unzip it.

  2. Create an account by clicking Sign Up and provide your information

  3. Download Models and Code (the downloaded file should have the format mano_v*_*.zip). Note that all code and data from this download falls under the MANO license.

  4. unzip and copy the content of the models folder into the mano folder

  5. Your structure should look like this:

Minimal-Hand-pytorch/
   mano/
      models/
      webuser/

Download and Prepare datasets

Training dataset

Evaluation dataset

Processing

  • Create a data directory, extract all above datasets or additional materials in it

Now your data folder structure should like this:

data/

    CMU/
        hand143_panopticdb/
            datasets/
            ...
        hand_labels/
            datasets/
            ...

    RHD/
        RHD_published_v2/
            evaluation/
            training/
            view_sample.py
            ...

    GANeratedHands_Release/
        data/
        ...

    STB/
        images/
            B1Counting/
                SK_color_0.png
                SK_depth_0.png
                SK_depth_seg_0.png  <-- merged from STB_supp
                ...
            ...
        labels/
            B1Counting_BB.mat
            ...

    dexter+object/
        calibration/
        bbox_dexter+object.csv
        DO_pred_2d.npy
        data/
            Grasp1/
                annotations/
                    Grasp13D.txt
                    my_Grasp13D.txt
                    ...
                ...
            Grasp2/
                annotations/
                    Grasp23D.txt
                    my_Grasp23D.txt
                    ...
                ...
            Occlusion/
                annotations/
                    Occlusion3D.txt
                    my_Occlusion3D.txt
                    ...
                ...
            Pinch/
                annotations/
                    Pinch3D.txt
                    my_Pinch3D.txt
                    ...
                ...
            Rigid/
                annotations/
                    Rigid3D.txt
                    my_Rigid3D.txt
                    ...
                ...
            Rotate/
                                annotations/
                    Rotate3D.txt
                    my_Rotate3D.txt
                    ...
                ...
        

    EgoDexter/
        preview/
        data/
            Desk/
                annotation.txt_3D.txt
                my_annotation.txt_3D.txt
                ...
            Fruits/
                annotation.txt_3D.txt
                my_annotation.txt_3D.txt
                ...
            Kitchen/
                annotation.txt_3D.txt
                my_annotation.txt_3D.txt
                ...
            Rotunda/
                annotation.txt_3D.txt
                my_annotation.txt_3D.txt
                ...
        

Note

  • All code and data from these download falls under their own licenses.
  • DO represents "dexter+object" dataset; EO represents "EgoDexter" dataset
  • DO_supp and EO_supp are modified from original ones.
  • DO_pred_2d.npy are 2D predictions from 2D part of DetNet.
  • some labels of DO and EO is obviously wrong (u could find some examples with original labels from dexter_object.py or egodexter.py), when projected into image plane, thus should be omitted. Here come my_{}3D.txt and my_annotation.txt_3D.txt.

Download my Results

realtime demo

python demo.py

DetNet Training and Evaluation

Run the training code

python train_detnet.py --data_root data/

Run the evaluation code

python train_detnet.py --data_root data/  --datasets_test testset_name_to_test   --evaluate  --evaluate_id checkpoints_id_to_load 

or use my results

python train_detnet.py --checkpoint my_results/checkpoints  --datasets_test "rhd" --evaluate  --evaluate_id 106

python train_detnet.py --checkpoint my_results/checkpoints  --datasets_test "stb" --evaluate  --evaluate_id 71

python train_detnet.py --checkpoint my_results/checkpoints  --datasets_test "do" --evaluate  --evaluate_id 68

python train_detnet.py --checkpoint my_results/checkpoints  --datasets_test "eo" --evaluate  --evaluate_id 101

Shape Estimation

Run the shape optimization code. This can be very time consuming when the weight parameter is quite small.

python optimize_shape.py --weight 1e-5

or use my results

python optimize_shape.py --path my_results/out_testset/

Pose Estimation

Run the following code which uses a analytical inverse kinematics method.

python aik_pose.py

or use my results

python aik_pose.py --path my_results/out_testset/

Detnet training and evaluation curve

Run the following code to see my results

python plot.py --path my_results/out_loss_auc

(AUC means 3D PCK, and ACC_HM means 2D PCK) teaser

3D PCK AUC Diffenence

* means this project

Dataset DetNet(paper) DetNet(*) DetNet+IKNet(paper) DetNet+LM+AIK(*) DetNet+PSO+AIK(*)
RHD - 0.9339 0.856 0.9301 0.9310
STB 0.891 0.8744 0.898 0.8647 0.8671
DO 0.923 0.9378 0.948 0.9392 0.9342
EO 0.804 0.9270 0.811 0.9288 0.9277

Note

  • Adjusting training parameters carefully, longer training time, more complicated network or Biomechanical Constraint Losses could further boost accuracy.
  • As there is no official open source of original paper, above comparison is a little rough.

Citation

This is the unofficial pytorch reimplementation of the paper "Monocular Real-time Hand Shape and Motion Capture using Multi-modal Data" (CVPR 2020).

If you find the project helpful, please star this project and cite them:

@inproceedings{zhou2020monocular,
  title={Monocular Real-time Hand Shape and Motion Capture using Multi-modal Data},
  author={Zhou, Yuxiao and Habermann, Marc and Xu, Weipeng and Habibie, Ikhsanul and Theobalt, Christian and Xu, Feng},
  booktitle={Proceedings of the IEEE International Conference on Computer Vision},
  pages={0--0},
  year={2020}
}

Acknowledgement

  • Code of Mano Pytorch Layer was adapted from manopth.

  • Code for evaluating the hand PCK and AUC in utils/eval/zimeval.py was adapted from hand3d.

  • Part code of data augmentation, dataset parsing and utils were adapted from bihand and 3D-Hand-Pose-Estimation.

  • Code of network model was adapted from Minimal-Hand.

  • @Mrsirovo for the starter code of the utils/LM.py, @maitetsu update it later.

  • @maitetsu for the starter code of the utils/AIK.py

Owner
Hao Meng
Master student at Beihang University , mainly interested in hand pose estimation. (LOOKING FOR RESEARCH INTERNSHIP NOW.)
Hao Meng
Code for Transformers Solve Limited Receptive Field for Monocular Depth Prediction

Official PyTorch code for Transformers Solve Limited Receptive Field for Monocular Depth Prediction. Guanglei Yang, Hao Tang, Mingli Ding, Nicu Sebe,

stanley 152 Dec 16, 2022
naked is a Python tool which allows you to strip a model and only keep what matters for making predictions.

naked is a Python tool which allows you to strip a model and only keep what matters for making predictions. The result is a pure Python function with no third-party dependencies that you can simply c

Max Halford 24 Dec 20, 2022
Official PyTorch Implementation for "Recurrent Video Deblurring with Blur-Invariant Motion Estimation and Pixel Volumes"

PVDNet: Recurrent Video Deblurring with Blur-Invariant Motion Estimation and Pixel Volumes This repository contains the official PyTorch implementatio

Junyong Lee 98 Nov 06, 2022
All supplementary material used by me while TA-ing CS3244: Machine Learning

CS3244-Tutorial-Material All supplementary material used by me while TA-ing CS3244: Machine Learning at NUS School of Computing. What is this? I teach

Rishabh Anand 18 Sep 23, 2022
Tensorflow implementation of soft-attention mechanism for video caption generation.

SA-tensorflow Tensorflow implementation of soft-attention mechanism for video caption generation. An example of soft-attention mechanism. The attentio

Paul Chen 153 Nov 14, 2022
Occlusion robust 3D face reconstruction model in CFR-GAN (WACV 2022)

Occlusion Robust 3D face Reconstruction Yeong-Joon Ju, Gun-Hee Lee, Jung-Ho Hong, and Seong-Whan Lee Code for Occlusion Robust 3D Face Reconstruction

Yeongjoon 31 Dec 19, 2022
[CVPR 21] Vectorization and Rasterization: Self-Supervised Learning for Sketch and Handwriting, IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2021.

Vectorization and Rasterization: Self-Supervised Learning for Sketch and Handwriting, CVPR 2021. Ayan Kumar Bhunia, Pinaki nath Chowdhury, Yongxin Yan

Ayan Kumar Bhunia 44 Dec 12, 2022
SAS output to EXCEL converter for Cornell/MIT Language and acquisition lab

CORNELLSASLAB SAS output to EXCEL converter for Cornell/MIT Language and acquisition lab Instructions: This python code can be used to convert SAS out

2 Jan 26, 2022
The code for our CVPR paper PISE: Person Image Synthesis and Editing with Decoupled GAN, Project Page, supp.

PISE The code for our CVPR paper PISE: Person Image Synthesis and Editing with Decoupled GAN, Project Page, supp. Requirement conda create -n pise pyt

jinszhang 110 Nov 21, 2022
Pytorch Implementation for Dilated Continuous Random Field

DilatedCRF Pytorch implementation for fully-learnable DilatedCRF. If you find my work helpful, please consider our paper: @article{Mo2022dilatedcrf,

DunnoCoding_Plus 3 Nov 13, 2022
Self-Supervised Vision Transformers Learn Visual Concepts in Histopathology (LMRL Workshop, NeurIPS 2021)

Self-Supervised Vision Transformers Learn Visual Concepts in Histopathology Self-Supervised Vision Transformers Learn Visual Concepts in Histopatholog

Richard Chen 95 Dec 24, 2022
Code for our paper 'Generalized Category Discovery'

Generalized Category Discovery This repo is a placeholder for code for our paper: Generalized Category Discovery Abstract: In this paper, we consider

107 Dec 28, 2022
"Learning Free Gait Transition for Quadruped Robots vis Phase-Guided Controller"

PhaseGuidedControl The current version is developed based on the old version of RaiSim series, and possibly requires further modification. It will be

X-Mechanics 12 Oct 21, 2022
The world's largest toxicity dataset.

The Toxicity Dataset by Surge AI Saving the internet is fun. Combing through thousands of online comments to build a toxicity dataset isn't. That's wh

Surge AI 134 Dec 19, 2022
A simple log parser and summariser for IIS web server logs

IISLogFileParser A basic parser tool for IIS Logs which summarises findings from the log file. Inspired by the Gist https://gist.github.com/wh13371/e7

2 Mar 26, 2022
City Surfaces: City-scale Semantic Segmentation of Sidewalk Surfaces

City Surfaces: City-scale Semantic Segmentation of Sidewalk Surfaces Paper Temporary GitHub page for City Surfaces paper. More soon! While designing s

14 Nov 10, 2022
Official PyTorch implementation of "Improving Face Recognition with Large AgeGaps by Learning to Distinguish Children" (BMVC 2021)

Inter-Prototype (BMVC 2021): Official Project Webpage This repository provides the official PyTorch implementation of the following paper: Improving F

Jungsoo Lee 16 Jun 30, 2022
A library that allows for inference on probabilistic models

Bean Machine Overview Bean Machine is a probabilistic programming language for inference over statistical models written in the Python language using

Meta Research 234 Dec 29, 2022
Sign Language Transformers (CVPR'20)

Sign Language Transformers (CVPR'20) This repo contains the training and evaluation code for the paper Sign Language Transformers: Sign Language Trans

Necati Cihan Camgoz 164 Dec 30, 2022
Implementation of U-Net and SegNet for building segmentation

Specialized project Created by Katrine Nguyen and Martin Wangen-Eriksen as a part of our specialized project at Norwegian University of Science and Te

Martin.w-e 3 Dec 07, 2022