Code repository for the paper: Hierarchical Kinematic Probability Distributions for 3D Human Shape and Pose Estimation from Images in the Wild (ICCV 2021)

Overview

Hierarchical Kinematic Probability Distributions for 3D Human Shape and Pose Estimation from Images in the Wild

Akash Sengupta, Ignas Budvytis, Roberto Cipolla
ICCV 2021
[paper+supplementary][poster][results video]

This is the official code repository of the above paper, which takes a probabilistic approach to 3D human shape and pose estimation and predicts multiple plausible 3D reconstruction samples given an input image.

teaser

This repository contains inference, training (TODO) and evaluation (TODO) code. A few weaknesses of this approach, and future research directions, are listed below (TODO). If you find this code useful in your research, please cite the following publication:

@InProceedings{sengupta2021hierprobhuman,
               author = {Sengupta, Akash and Budvytis, Ignas and Cipolla, Roberto},
               title = {{Hierarchical Kinematic Probability Distributions for 3D Human Shape and Pose Estimation from Images in the Wild}},
               booktitle = {International Conference on Computer Vision},
               month = {October},
               year = {2021}                         
}

Installation

Requirements

  • Linux or macOS
  • Python ≥ 3.6

Instructions

We recommend using a virtual environment to install relevant dependencies:

python3 -m venv HierProbHuman
source HierProbHuman/bin/activate

Install torch and torchvision (the code has been tested with v1.6.0 of torch), as well as other dependencies:

pip install torch==1.6.0 torchvision==0.7.0
pip install -r requirements.txt

Finally, install pytorch3d, which we use for data generation during training and visualisation during inference. To do so, you will need to first install the CUB library following the instructions here. Then you may install pytorch3d - note that the code has been tested with v0.3.0 of pytorch3d, and we recommend installing this version using:

pip install "git+https://github.com/facebookresearch/[email protected]"

Model files

You will need to download the SMPL model. The neutral model is required for training and running the demo code. If you want to evaluate the model on datasets with gendered SMPL labels (such as 3DPW and SSP-3D), the male and female models are available here. You will need to convert the SMPL model files to be compatible with python3 by removing any chumpy objects. To do so, please follow the instructions here.

Download pre-trained model checkpoints for our 3D Shape/Pose network, as well as for 2D Pose HRNet-W48 from here.

Place the SMPL model files and network checkpoints in the model_files directory, which should have the following structure. If the files are placed elsewhere, you will need to update configs/paths.py accordingly.

HierarchicalProbabilistic3DHuman
├── model_files                                  # Folder with model files
│   ├── smpl
│   │   ├── SMPL_NEUTRAL.pkl                     # Gender-neutral SMPL model
│   │   ├── SMPL_MALE.pkl                        # Male SMPL model
│   │   ├── SMPL_FEMALE.pkl                      # Female SMPL model
│   ├── poseMF_shapeGaussian_net_weights.tar     # Pose/Shape distribution predictor checkpoint
│   ├── pose_hrnet_w48_384x288.pth               # Pose2D HRNet checkpoint
│   ├── cocoplus_regressor.npy                   # Cocoplus joints regressor
│   ├── J_regressor_h36m.npy                     # Human3.6M joints regressor
│   ├── J_regressor_extra.npy                    # Extra joints regressor
│   └── UV_Processed.mat                         # DensePose UV coordinates for SMPL mesh             
└── ...

Inference

run_predict.py is used to run inference on a given folder of input images. For example, to run inference on the demo folder, do:

python run_predict.py --image_dir ./demo/ --save_dir ./output/ --visualise_samples --visualise_uncropped

This will first detect human bounding boxes in the input images using Mask-RCNN. If your input images are already cropped and centred around the subject of interest, you may skip this step using --cropped_images as an option. The 3D Shape/Pose network is somewhat sensitive to cropping and centering - this is a good place to start troubleshooting in case of poor results.

Inference can be slow due to the rejection sampling procedure used to estimate per-vertex 3D uncertainty. If you are not interested in per-vertex uncertainty, you may modify predict/predict_poseMF_shapeGaussian_net.py by commenting out code related to sampling, and use a plain texture to render meshes for visualisation (this will be cleaned up and added as an option to in the run_predict.py future).

TODO

  • Training Code
  • Evaluation Code for 3DPW and SSP-3D
  • Gendered pre-trained models for improved shape estimation
  • Weaknesses and future research

Acknowledgments

Code was adapted from/influenced by the following repos - thanks to the authors!

Owner
Akash Sengupta
Akash Sengupta
Unofficial PyTorch implementation of the Adaptive Convolution architecture for image style transfer

AdaConv Unofficial PyTorch implementation of the Adaptive Convolution architecture for image style transfer from "Adaptive Convolutions for Structure-

65 Dec 22, 2022
This is a official repository of SimViT.

SimViT This is a official repository of SimViT. We will open our models and codes about object detection and semantic segmentation soon. Our code refe

ligang 57 Dec 15, 2022
Official repository for "On Improving Adversarial Transferability of Vision Transformers" (2021)

Improving-Adversarial-Transferability-of-Vision-Transformers Muzammal Naseer, Kanchana Ranasinghe, Salman Khan, Fahad Khan, Fatih Porikli arxiv link A

Muzammal Naseer 47 Dec 02, 2022
Revisting Open World Object Detection

Revisting Open World Object Detection Installation See INSTALL.md. Dataset Our new data division is based on COCO2017. We divide the training set into

58 Dec 23, 2022
DeepLab is a state-of-art deep learning system for semantic image segmentation built on top of Caffe.

DeepLab Introduction DeepLab is a state-of-art deep learning system for semantic image segmentation built on top of Caffe. It combines densely-compute

Ali 234 Nov 14, 2022
Python Library for Signal/Image Data Analysis with Transport Methods

PyTransKit Python Transport Based Signal Processing Toolkit Website and documentation: https://pytranskit.readthedocs.io/ Installation The library cou

24 Dec 23, 2022
Implementation of CaiT models in TensorFlow and ImageNet-1k checkpoints. Includes code for inference and fine-tuning.

CaiT-TF (Going deeper with Image Transformers) This repository provides TensorFlow / Keras implementations of different CaiT [1] variants from Touvron

Sayak Paul 9 Jun 26, 2022
Pytorch implementation of FlowNet by Dosovitskiy et al.

FlowNetPytorch Pytorch implementation of FlowNet by Dosovitskiy et al. This repository is a torch implementation of FlowNet, by Alexey Dosovitskiy et

Clément Pinard 762 Jan 02, 2023
(CVPR 2021) PAConv: Position Adaptive Convolution with Dynamic Kernel Assembling on Point Clouds

PAConv: Position Adaptive Convolution with Dynamic Kernel Assembling on Point Clouds by Mutian Xu*, Runyu Ding*, Hengshuang Zhao, and Xiaojuan Qi. Int

CVMI Lab 228 Dec 25, 2022
Scalable and Elastic Deep Reinforcement Learning Using PyTorch. Please star. 🔥

ElegantRL “小雅”: Scalable and Elastic Deep Reinforcement Learning ElegantRL is developed for researchers and practitioners with the following advantage

AI4Finance Foundation 2.5k Jan 05, 2023
Yolov3 pytorch implementation

YOLOV3 Pytorch实现 在bubbliiing大佬代码的基础上进行了修改,添加了部分注释。 预训练模型 预训练模型来源于bubbliiing。 链接:https://pan.baidu.com/s/1ncREw6Na9ycZptdxiVMApw 提取码:appk 训练自己的数据集 按照VO

4 Aug 27, 2022
Hide screen when boss is approaching.

BossSensor Hide your screen when your boss is approaching. Demo The boss stands up. He is approaching. When he is approaching, the program fetches fac

Hiroki Nakayama 6.2k Jan 07, 2023
Repo 4 basic seminar §How to make human machine readable"

WORK IN PROGRESS... Notebooks from the Seminar: Human Machine Readable WS21/22 Introduction into programming Georg Trogemann, Christian Heck, Mattis

experimental-informatics 3 May 29, 2022
Food Drinks and groceries Images Multi Lingual (FooDI-ML) dataset.

Food Drinks and groceries Images Multi Lingual (FooDI-ML) dataset.

41 Jan 04, 2023
Automatic Number Plate Recognition using Contours and Convolution Neural Networks (CNN)

Cite our paper if you find this project useful https://www.ijariit.com/manuscripts/v7i4/V7I4-1139.pdf Abstract Image processing technology is used in

Adithya M 2 Jun 28, 2022
Repository for code and dataset for our EMNLP 2021 paper - “So You Think You’re Funny?”: Rating the Humour Quotient in Standup Comedy.

AI-OpenMic Dataset The dataset is available for download via the follwing link. Repository for code and dataset for our EMNLP 2021 paper - “So You Thi

6 Oct 26, 2022
Official PyTorch implementation of "Contrastive Learning from Extremely Augmented Skeleton Sequences for Self-supervised Action Recognition" in AAAI2022.

AimCLR This is an official PyTorch implementation of "Contrastive Learning from Extremely Augmented Skeleton Sequences for Self-supervised Action Reco

Gty 44 Dec 17, 2022
LoveDA: A Remote Sensing Land-Cover Dataset for Domain Adaptive Semantic Segmentation (NeurIPS2021 Benchmark and Dataset Track)

LoveDA: A Remote Sensing Land-Cover Dataset for Domain Adaptive Semantic Segmentation by Junjue Wang, Zhuo Zheng, Ailong Ma, Xiaoyan Lu, and Yanfei Zh

Kingdrone 174 Dec 22, 2022
A keras-based real-time model for medical image segmentation (CFPNet-M)

CFPNet-M: A Light-Weight Encoder-Decoder Based Network for Multimodal Biomedical Image Real-Time Segmentation This repository contains the implementat

268 Nov 27, 2022
Residual Dense Net De-Interlace Filter (RDNDIF)

Residual Dense Net De-Interlace Filter (RDNDIF) Work in progress deep de-interlacer filter. It is based on the architecture proposed by Bernasconi et

Louis 7 Feb 15, 2022