Code in conjunction with the publication 'Contrastive Representation Learning for Hand Shape Estimation'

Overview

HanCo Dataset & Contrastive Representation Learning for Hand Shape Estimation

Code in conjunction with the publication: Contrastive Representation Learning for Hand Shape Estimation.

This repository contains code for inference of both networks: The one obtained from self-supervised contrastive pre-training and the network trained supervisedly for hand pose estimation. Additionally, we provide examples how to work with the HanCo dataset and release the pytorch Dataset that was used during our pre-training experiments. This dataset is an extension of the FreiHand dataset.

Visit our project page for additional information.

Requirements

Python environment

conda create -n contra-hand python=3.6
conda activate contra-hand
conda install -c pytorch pytorch=1.6.0 torchvision cudatoolkit=10.2
conda install -c conda-forge -c fvcore fvcore transforms3d
pip install pytorch3d transforms3d tqdm pytorch-lightning imgaug open3d matplotlib
pip install git+https://github.com/hassony2/chumpy.git

Hand Pose Dataset

You either need the full HanCo dataset or the small tester data sample (recommended).

Random Background Images

As the hand pose dataset contains green screen images, randomized backgrounds can be used. For our dataset we used 2195 images from Flickr. As these were not all licensed in a permissive manner, we provide a set of background images to use with the dataset. These can be found here.

MANO model

Our supervised training code uses the MANO Hand model, which you need to aquire seperately due to licensing regulations: https://mano.is.tue.mpg.de

In order for our code to work fine copy MANO_RIGHT.pkl from the MANO website to contra-hand/mano_models/MANO_RIGHT.pkl.

We also build on to of the great PyTorch implementation of MANO provided by Yana Hasson et al., which was modified by us and is already contained in this repository.

Trained models

We release both the MoCo pretrained model and the shape estimation network that was derived from it.

In order to get the trained models download and unpack them locally:

curl https://lmb.informatik.uni-freiburg.de/data/HanCo/contra-hand-ckpt.zip -o contra-hand-ckpt.zip & unzip contra-hand-ckpt.zip 

Code

This repository contains scripts that facilitate using the HanCo dataset and building on the results from our publication.

Show dataset

You will need to download the HanCo dataset (or at least the tester). This script gives you some examples on how to work with the dataset.

python show_dataset.py <Path-To-Your-Local-HanCo-Directory>

Use our MoCo trained model

There is a simple script that calculates the cosine similarity score for two hard coded examples:

python run_moco_fw.py

There is the script we used to create the respective figure in our paper.

python run_moco_qualitative_embedding.py

Self-Supervised Training with MoCo

We provide a torch data loader that can be used as a drop-in replacement for MoCo training. The data loader can be found here DatasetUnsupervisedMV.py. It has boolean options that control how the data is provided, these are cross_bg, cross_camera, and cross_time. The get_dataset function also shows the pre-processing that we use, which is slightly different from the standard MoCo pre-processing.

Use our MANO prediction model

The following script allows to run inference on an example image:

run_hand_shape_fw.py <Path-To-Your-Local-HanCo-Directory>
Owner
Computer Vision Group, Albert-Ludwigs-Universität Freiburg
Pattern Recognition and Image Processing
Computer Vision Group, Albert-Ludwigs-Universität Freiburg
An e-commerce company wants to segment its customers and determine marketing strategies according to these segments.

customer_segmentation_with_rfm Business Problem : An e-commerce company wants to

Buse Yıldırım 3 Jan 06, 2022
Unofficial implementation of MLP-Mixer: An all-MLP Architecture for Vision

MLP-Mixer: An all-MLP Architecture for Vision This repo contains PyTorch implementation of MLP-Mixer: An all-MLP Architecture for Vision. Usage : impo

Rishikesh (ऋषिकेश) 175 Dec 23, 2022
A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)

MMF is a modular framework for vision and language multimodal research from Facebook AI Research. MMF contains reference implementations of state-of-t

Facebook Research 5.1k Jan 04, 2023
Discord bot-CTFD-Thread-Parser - Discord bot CTFD-Thread-Parser

Discord bot CTFD-Thread-Parser Description: This tools is used to create automat

15 Mar 22, 2022
"3D Human Texture Estimation from a Single Image with Transformers", ICCV 2021

Texformer: 3D Human Texture Estimation from a Single Image with Transformers This is the official implementation of "3D Human Texture Estimation from

XiangyuXu 193 Dec 05, 2022
Pytorch implementation for Semantic Segmentation/Scene Parsing on MIT ADE20K dataset

Semantic Segmentation on MIT ADE20K dataset in PyTorch This is a PyTorch implementation of semantic segmentation models on MIT ADE20K scene parsing da

MIT CSAIL Computer Vision 4.5k Jan 08, 2023
Deep Ensemble Learning with Jet-Like architecture

Ransomware analysis using DEL with jet-like architecture comprising two CNN wings, a sparse AE tail, a non-linear PCA to produce a diverse feature space, and an MLP nose

Ahsen Nazir 2 Feb 06, 2022
Transformers provides thousands of pretrained models to perform tasks on different modalities such as text, vision, and audio.

English | 简体中文 | 繁體中文 | 한국어 State-of-the-art Machine Learning for JAX, PyTorch and TensorFlow 🤗 Transformers provides thousands of pretrained models

Clara Meister 50 Nov 12, 2022
PiRapGenerator - Make anyone rap the digits of pi

PiRapGenerator Make anyone rap the digits of pi (sample files are of Ted Nivison

7 Oct 02, 2022
This is a package for LiDARTag, described in paper: LiDARTag: A Real-Time Fiducial Tag System for Point Clouds

LiDARTag Overview This is a package for LiDARTag, described in paper: LiDARTag: A Real-Time Fiducial Tag System for Point Clouds (PDF)(arXiv). This wo

University of Michigan Dynamic Legged Locomotion Robotics Lab 159 Dec 21, 2022
Stacked Recurrent Hourglass Network for Stereo Matching

SRH-Net: Stacked Recurrent Hourglass Introduction This repository is supplementary material of our RA-L submission, which helps reviewers to understan

28 Jan 03, 2023
Python implementation of Project Fluent

Project Fluent This is a collection of Python packages to use the Fluent localization system. python-fluent consists of these packages: fluent.syntax

Project Fluent 155 Dec 28, 2022
Curriculum Domain Adaptation for Semantic Segmentation of Urban Scenes, ICCV 2017

AdaptationSeg This is the Python reference implementation of AdaptionSeg proposed in "Curriculum Domain Adaptation for Semantic Segmentation of Urban

Yang Zhang 128 Oct 19, 2022
MemStream: Memory-Based Anomaly Detection in Multi-Aspect Streams with Concept Drift

MemStream Implementation of MemStream: Memory-Based Anomaly Detection in Multi-Aspect Streams with Concept Drift . Siddharth Bhatia, Arjit Jain, Shivi

Stream-AD 61 Dec 02, 2022
Lite-HRNet: A Lightweight High-Resolution Network

LiteHRNet Benchmark 🔥 🔥 Based on MMsegmentation 🔥 🔥 Cityscapes FCN resize concat config mIoU last mAcc last eval last mIoU best mAcc best eval bes

16 Dec 12, 2022
Simple Text-Generator with OpenAI gpt-2 Pytorch Implementation

GPT2-Pytorch with Text-Generator Better Language Models and Their Implications Our model, called GPT-2 (a successor to GPT), was trained simply to pre

Tae-Hwan Jung 775 Jan 08, 2023
Abstractive opinion summarization system (SelSum) and the largest dataset of Amazon product summaries (AmaSum). EMNLP 2021 conference paper.

Learning Opinion Summarizers by Selecting Informative Reviews This repository contains the codebase and the dataset for the corresponding EMNLP 2021

Arthur Bražinskas 39 Jan 01, 2023
Implementation of UNet on the Joey ML framework

Independent Research Project - Code Joey can be cloned from here https://github.com/devitocodes/joey/. Devito and other dependencies such as PyTorch a

Navjot Kukreja 1 Oct 21, 2021
A machine learning benchmark of in-the-wild distribution shifts, with data loaders, evaluators, and default models.

WILDS is a benchmark of in-the-wild distribution shifts spanning diverse data modalities and applications, from tumor identification to wildlife monitoring to poverty mapping.

P-Lambda 437 Dec 30, 2022
K Closest Points and Maximum Clique Pruning for Efficient and Effective 3D Laser Scan Matching (To appear in RA-L 2022)

KCP The official implementation of KCP: k Closest Points and Maximum Clique Pruning for Efficient and Effective 3D Laser Scan Matching, accepted for p

Yu-Kai Lin 109 Dec 14, 2022