Code in conjunction with the publication 'Contrastive Representation Learning for Hand Shape Estimation'

Last update: Dec 13, 2022

HanCo Dataset & Contrastive Representation Learning for Hand Shape Estimation

Code in conjunction with the publication: Contrastive Representation Learning for Hand Shape Estimation.

This repository contains inference code for both networks: the one obtained from self-supervised contrastive pre-training and the one trained with supervision for hand pose estimation. We also provide examples of how to work with the HanCo dataset and release the PyTorch Dataset used in our pre-training experiments. HanCo is an extension of the FreiHAND dataset.

Visit our project page for additional information.

Requirements

Python environment

conda create -n contra-hand python=3.6
conda activate contra-hand
conda install -c pytorch pytorch=1.6.0 torchvision cudatoolkit=10.2
conda install -c conda-forge -c fvcore fvcore transforms3d
pip install pytorch3d transforms3d tqdm pytorch-lightning imgaug open3d matplotlib
pip install git+https://github.com/hassony2/chumpy.git

Hand Pose Dataset

You need either the full HanCo dataset or the small tester data sample (recommended).

Random Background Images

Because the hand pose dataset contains green-screen images, their backgrounds can be replaced with random images. For our dataset we used 2195 images from Flickr. As not all of these were licensed permissively, we provide a set of background images to use with the dataset. These can be found here.

MANO model

Our supervised training code uses the MANO hand model, which you need to acquire separately due to licensing regulations: https://mano.is.tue.mpg.de

For our code to work, copy MANO_RIGHT.pkl from the MANO website to contra-hand/mano_models/MANO_RIGHT.pkl.
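A missing model file is easy to overlook until a script fails. The following minimal sketch checks for the file at the location named above before anything else runs. The helper name `find_mano_model` and the `base_dir` argument are hypothetical and not part of this repository.

```python
from pathlib import Path

# Location named in this README, relative to the directory that holds the checkout.
MANO_RELATIVE_PATH = Path("contra-hand") / "mano_models" / "MANO_RIGHT.pkl"


def find_mano_model(base_dir="."):
    """Return the path to MANO_RIGHT.pkl, or raise with a hint if it is missing.

    Hypothetical helper for illustration; the repository scripts may locate
    the file differently.
    """
    candidate = Path(base_dir) / MANO_RELATIVE_PATH
    if not candidate.is_file():
        raise FileNotFoundError(
            f"{candidate} not found. Download MANO_RIGHT.pkl from "
            "https://mano.is.tue.mpg.de and copy it to this location."
        )
    return candidate
```

Calling `find_mano_model()` from the directory that contains `contra-hand` either returns the path or raises an error that says where the file belongs.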

We also build on top of the great PyTorch implementation of MANO provided by Yana Hasson et al., which we modified and which is already contained in this repository.

Trained models

We release both the MoCo pretrained model and the shape estimation network that was derived from it.

To get the trained models, download and unpack them locally:

curl https://lmb.informatik.uni-freiburg.de/data/HanCo/contra-hand-ckpt.zip -o contra-hand-ckpt.zip && unzip contra-hand-ckpt.zip
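On systems without curl or unzip, the same two steps can be done from Python with the standard library. This is only a sketch: the function names `download` and `unpack` are hypothetical, and the layout of the archive contents is not assumed.

```python
import urllib.request
import zipfile
from pathlib import Path

# URL taken from the command in this README.
CKPT_URL = "https://lmb.informatik.uni-freiburg.de/data/HanCo/contra-hand-ckpt.zip"


def download(url, target):
    """Download url to target unless a file already exists there."""
    target = Path(target)
    if not target.exists():
        urllib.request.urlretrieve(url, str(target))
    return target


def unpack(zip_path, dest="."):
    """Extract the archive into dest and return the names of its members."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        return archive.namelist()
```

Calling `unpack(download(CKPT_URL, "contra-hand-ckpt.zip"))` mirrors the shell command: the archive is only unpacked after the download has finished.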

Code

This repository contains scripts that facilitate using the HanCo dataset and building on the results from our publication.

Show dataset

You will need to download the HanCo dataset (or at least the tester). The following script shows some examples of how to work with the dataset:

python show_dataset.py <Path-To-Your-Local-HanCo-Directory>

Use our MoCo trained model

A simple script calculates the cosine similarity score for two hard-coded examples:

python run_moco_fw.py
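For reference, the cosine similarity of two embedding vectors is their dot product divided by the product of their norms. The sketch below computes it in plain Python on placeholder lists; in the script the vectors would come from the MoCo network. This is an illustration of the score, not the code of run_moco_fw.py.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equally long vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)
```

The score lies in [-1, 1]. Values near 1 mean the two embeddings point in nearly the same direction.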

The following script was used to create the corresponding figure in our paper:

python run_moco_qualitative_embedding.py

Self-Supervised Training with MoCo

We provide a torch data loader that can be used as a drop-in replacement for MoCo training. The data loader can be found in DatasetUnsupervisedMV.py. Three boolean options control how the data is provided: cross_bg, cross_camera, and cross_time. The get_dataset function also shows the pre-processing that we use, which is slightly different from the standard MoCo pre-processing.
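One plausible reading of the three flags is that each one allows the second view of a positive pair to differ from the first along one axis: background, camera, or time step. The sketch below illustrates that reading only. The names `ViewId` and `sample_positive`, the index layout, and the sampling rule are assumptions; the actual behavior is defined in DatasetUnsupervisedMV.py and may differ.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ViewId:
    """Hypothetical identifier of one image: sequence, frame, camera, background."""
    sequence: int
    frame: int
    camera: int
    background: int


def sample_positive(anchor, n_frames, n_cameras, n_backgrounds,
                    cross_bg=False, cross_camera=False, cross_time=False,
                    rng=random):
    """Pick a second view from the same sequence as the anchor.

    Only the axes whose flag is enabled are resampled; all others are
    copied from the anchor. Assumed semantics, for illustration only.
    """
    frame = rng.randrange(n_frames) if cross_time else anchor.frame
    camera = rng.randrange(n_cameras) if cross_camera else anchor.camera
    background = rng.randrange(n_backgrounds) if cross_bg else anchor.background
    return ViewId(anchor.sequence, frame, camera, background)
```

With all flags off, the second view is the same image as the anchor, so any difference between the two views comes from augmentation alone. Turning a flag on adds that axis as a source of variation.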

Use our MANO prediction model

The following script runs inference on an example image:

python run_hand_shape_fw.py <Path-To-Your-Local-HanCo-Directory>

Owner

Computer Vision Group, Albert-Ludwigs-Universität Freiburg
