[ICCV 2021] Encoder-decoder with Multi-level Attention for 3D Human Shape and Pose Estimation

Overview

MAED: Encoder-decoder with Multi-level Attention for 3D Human Shape and Pose Estimation

Getting Started

Our codes are implemented and tested with python 3.6 and pytorch 1.5.

Install Pytorch following the official guide on Pytorch website.

And install the requirements using virtualenv or conda:

pip install -r requirements.txt

Data Preparation

Refer to data.md for instructions.

Training

Stage 1 training

Generally, you can use the distributed launch script of pytorch to start training.

For example, for a training on 2 nodes, 4 gpus each (2x4=8 gpus total): On node 0, run:

python -u -m torch.distributed.launch \
    --nnodes=2 \
    --node_rank=0 \
    --nproc_per_node=4 \
    --master_port=<MASTER_PORT> \
    --master_addr=<MASTER_NODE_ID> \
    --use_env \
    train.py --cfg configs/config_stage1.yaml

On node 1, run:

python -u -m torch.distributed.launch \
    --nnodes=2 \
    --node_rank=1 \
    --nproc_per_node=4 \
    --master_port=<MASTER_PORT> \
    --master_addr=<MASTER_NODE_ID> \
    --use_env \
    train.py --cfg configs/config_stage1.yaml

Otherwise, if you are using task scheduling system such as Slurm to submit your training tasks, you can refer to this script to start your training:

# training on 2 nodes, 4 gpus each (2x4=8 gpus total)
sh scripts/run.sh 2 4 configs/config_stage1.yaml

The checkpoint of training will be saved in [results/] by default. You are free to modify it in the config file.

Stage 2 training

Use the last checkpoint of stage 1 to initialize the model and starts training stage 2.

# On Node 0.
python -u -m torch.distributed.launch \
    --nnodes=2 \
    --node_rank=0 \
    --nproc_per_node=4 \
    --master_port=<MASTER_PORT> \
    --master_addr=<MASTER_NODE_ID> \
    --use_env \
    train.py --cfg configs/config_stage2.yaml --pretrained <PATH_TO_CHECKPOINT_FILE>

Similar on node 1.

Evaluation

To evaluate model on 3dpw test set:

python eval.py --cfg <PATH_TO_EXPERIMENT>/config.yaml --checkpoint <PATH_TO_EXPERIMENT>/model_best.pth.tar --eval_set 3dpw

Evaluation metric is Procrustes Aligned Mean Per Joint Position Error (PA-MPJPE) in mm.

Models PA-MPJPE ↓ MPJPE ↓ PVE ↓ ACCEL ↓
HMR (w/o 3DPW) 81.3 130.0 - 37.4
SPIN (w/o 3DPW) 59.2 96.9 116.4 29.8
MEVA (w/ 3DPW) 54.7 86.9 - 11.6
VIBE (w/o 3DPW) 56.5 93.5 113.4 27.1
VIBE (w/ 3DPW) 51.9 82.9 99.1 23.4
ours (w/o 3DPW) 50.7 88.8 104.5 18.0
ours (w/ 3DPW) 45.7 79.1 92.6 17.6

Citation

@inproceedings{wan2021,
  title={Encoder-decoder with Multi-level Attention for 3D Human Shape and Pose Estimation},
  author={Ziniu Wan, Zhengjia Li, Maoqing Tian, Jianbo Liu, Shuai Yi, Hongsheng Li},
  booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
  year = {2021}
}
Owner
TensorFlow-based implementation of "ICNet for Real-Time Semantic Segmentation on High-Resolution Images".

ICNet_tensorflow This repo provides a TensorFlow-based implementation of paper "ICNet for Real-Time Semantic Segmentation on High-Resolution Images,"

HsuanKung Yang 406 Nov 27, 2022
A PyTorch implementation for PyramidNets (Deep Pyramidal Residual Networks)

A PyTorch implementation for PyramidNets (Deep Pyramidal Residual Networks) This repository contains a PyTorch implementation for the paper: Deep Pyra

Greg Dongyoon Han 262 Jan 03, 2023
SMD-Nets: Stereo Mixture Density Networks

SMD-Nets: Stereo Mixture Density Networks This repository contains a Pytorch implementation of "SMD-Nets: Stereo Mixture Density Networks" (CVPR 2021)

Fabio Tosi 115 Dec 26, 2022
Multi-Scale Progressive Fusion Network for Single Image Deraining

Multi-Scale Progressive Fusion Network for Single Image Deraining (MSPFN) This is an implementation of the MSPFN model proposed in the paper (Multi-Sc

Kuijiang 128 Nov 21, 2022
DECAF: Deep Extreme Classification with Label Features

DECAF DECAF: Deep Extreme Classification with Label Features @InProceedings{Mittal21, author = "Mittal, A. and Dahiya, K. and Agrawal, S. and Sain

46 Nov 06, 2022
Self-Supervised Document-to-Document Similarity Ranking via Contextualized Language Models and Hierarchical Inference

Self-Supervised Document Similarity Ranking (SDR) via Contextualized Language Models and Hierarchical Inference This repo is the implementation for SD

Microsoft 36 Nov 28, 2022
Mmdet benchmark with python

mmdet_benchmark 本项目是为了研究 mmdet 推断性能瓶颈,并且对其进行优化。 配置与环境 机器配置 CPU:Intel(R) Core(TM) i9-10900K CPU @ 3.70GHz GPU:NVIDIA GeForce RTX 3080 10GB 内存:64G 硬盘:1T

杨培文 (Yang Peiwen) 24 May 21, 2022
TDN: Temporal Difference Networks for Efficient Action Recognition

TDN: Temporal Difference Networks for Efficient Action Recognition Overview We release the PyTorch code of the TDN(Temporal Difference Networks).

Multimedia Computing Group, Nanjing University 326 Dec 13, 2022
A library for Deep Learning Implementations and utils

deeply A Deep Learning library Table of Contents Features Quick Start Usage License Features Python 2.7+ and Python 3.4+ compatible. Quick Start $ pip

Achilles Rasquinha 1 Dec 12, 2022
Adversarial examples to the new ConvNeXt architecture

Adversarial examples to the new ConvNeXt architecture To get adversarial examples to the ConvNeXt architecture, run the Colab: https://github.com/stan

Stanislav Fort 19 Sep 18, 2022
A generalist algorithm for cell and nucleus segmentation.

Cellpose | A generalist algorithm for cell and nucleus segmentation. Cellpose was written by Carsen Stringer and Marius Pachitariu. To learn about Cel

MouseLand 733 Dec 29, 2022
🔮 Execution time predictions for deep neural network training iterations across different GPUs.

Habitat: A Runtime-Based Computational Performance Predictor for Deep Neural Network Training Habitat is a tool that predicts a deep neural network's

Geoffrey Yu 44 Dec 27, 2022
Code for the paper titled "Prabhupadavani: A Code-mixed Speech Translation Data for 25 languages"

Prabhupadavani: A Code-mixed Speech Translation Data for 25 languages Code for the paper titled "Prabhupadavani: A Code-mixed Speech Translation Data

Ayush Daksh 12 Dec 01, 2022
Synthesizing and manipulating 2048x1024 images with conditional GANs

pix2pixHD Project | Youtube | Paper Pytorch implementation of our method for high-resolution (e.g. 2048x1024) photorealistic image-to-image translatio

NVIDIA Corporation 6k Dec 27, 2022
KAPAO is an efficient multi-person human pose estimation model that detects keypoints and poses as objects and fuses the detections to predict human poses.

KAPAO (Keypoints and Poses as Objects) KAPAO is an efficient single-stage multi-person human pose estimation model that models keypoints and poses as

Will McNally 664 Dec 30, 2022
World Models with TensorFlow 2

World Models This repo reproduces the original implementation of World Models. This implementation uses TensorFlow 2.2. Docker The easiest way to hand

Zac Wellmer 234 Nov 30, 2022
Unofficial Implementation of MLP-Mixer, gMLP, resMLP, Vision Permutator, S2MLPv2, RaftMLP, ConvMLP, ConvMixer in Jittor and PyTorch.

Unofficial Implementation of MLP-Mixer, gMLP, resMLP, Vision Permutator, S2MLPv2, RaftMLP, ConvMLP, ConvMixer in Jittor and PyTorch! Now, Rearrange and Reduce in einops.layers.jittor are support!!

130 Jan 08, 2023
A Library for Modelling Probabilistic Hierarchical Graphical Models in PyTorch

A Library for Modelling Probabilistic Hierarchical Graphical Models in PyTorch

Korbinian Pöppel 47 Nov 28, 2022
Learning to Draw: Emergent Communication through Sketching

Learning to Draw: Emergent Communication through Sketching This is the official code for the paper "Learning to Draw: Emergent Communication through S

19 Jul 22, 2022
vit for few-shot classification

Few-Shot ViT Requirements PyTorch (= 1.9) TorchVision timm (latest) einops tqdm numpy scikit-learn scipy argparse tensorboardx Pretrained Checkpoints

Martin Dong 26 Nov 30, 2022