Code for the paper Progressive Pose Attention for Person Image Generation in CVPR19 (Oral).

Overview

Pose-Transfer

Code for the paper Progressive Pose Attention for Person Image Generation in CVPR19(Oral). The paper is available here.

Video generation with a single image as input. More details can be found in the supplementary materials in our paper.

News

  • We have released a new branch PATN_Fine. We introduce a segment-based skip-connection and a novel segment-based style loss, achieving even better results on DeepFashion.
  • Video demo is available now. We further improve the performance of our model by introducing a segment-based skip-connection. We will release the code soon. Refer to our supplementary materials for more details.
  • Codes for pytorch 1.0 is available now under the branch pytorch_v1.0. The same results on both datasets can be reproduced with the pretrained model.

Notes:

In pytorch 1.0, running_mean and running_var are not saved for the Instance Normalization layer by default. To reproduce our result in the paper, launch python tool/rm_insnorm_running_vars.py to remove corresponding keys in the pretrained model. (Only for the DeepFashion dataset.)

This is Pytorch implementation for pose transfer on both Market1501 and DeepFashion dataset. The code is written by Tengteng Huang and Zhen Zhu.

Requirement

  • pytorch(0.3.1)
  • torchvision(0.2.0)
  • numpy
  • scipy
  • scikit-image
  • pillow
  • pandas
  • tqdm
  • dominate

Getting Started

Installation

  • Clone this repo:
git clone https://github.com/tengteng95/Pose-Transfer.git
cd Pose-Transfer

Data Preperation

We provide our dataset split files and extracted keypoints files for convience.

Market1501

  • Download the Market-1501 dataset from here. Rename bounding_box_train and bounding_box_test to train and test, and put them under the market_data directory.
  • Download train/test splits and train/test key points annotations from Google Drive or Baidu Disk, including market-pairs-train.csv, market-pairs-test.csv, market-annotation-train.csv, market-annotation-train.csv. Put these four files under the market_data directory.
  • Generate the pose heatmaps. Launch
python tool/generate_pose_map_market.py

DeepFashion

Note: In our settings, we crop the images of DeepFashion into the resolution of 176x256 in a center-crop manner.

python tool/generate_fashion_datasets.py
  • Download train/test pairs and train/test key points annotations from Google Drive or Baidu Disk, including fasion-resize-pairs-train.csv, fasion-resize-pairs-test.csv, fasion-resize-annotation-train.csv, fasion-resize-annotation-train.csv. Put these four files under the fashion_data directory.
  • Generate the pose heatmaps. Launch
python tool/generate_pose_map_fashion.py

Notes:

Optionally, you can also generate these files by yourself.

  1. Keypoints files

We use OpenPose to generate keypoints.

  • Download pose estimator from Google Drive or Baidu Disk. Put it under the root folder Pose-Transfer.
  • Change the paths input_folder and output_path in tool/compute_coordinates.py. And then launch
python2 compute_coordinates.py
  1. Dataset split files
python2 tool/create_pairs_dataset.py

Train a model

Market-1501

python train.py --dataroot ./market_data/ --name market_PATN --model PATN --lambda_GAN 5 --lambda_A 10  --lambda_B 10 --dataset_mode keypoint --no_lsgan --n_layers 3 --norm batch --batchSize 32 --resize_or_crop no --gpu_ids 0 --BP_input_nc 18 --no_flip --which_model_netG PATN --niter 500 --niter_decay 200 --checkpoints_dir ./checkpoints --pairLst ./market_data/market-pairs-train.csv --L1_type l1_plus_perL1 --n_layers_D 3 --with_D_PP 1 --with_D_PB 1  --display_id 0

DeepFashion

python train.py --dataroot ./fashion_data/ --name fashion_PATN --model PATN --lambda_GAN 5 --lambda_A 1 --lambda_B 1 --dataset_mode keypoint --n_layers 3 --norm instance --batchSize 7 --pool_size 0 --resize_or_crop no --gpu_ids 0 --BP_input_nc 18 --no_flip --which_model_netG PATN --niter 500 --niter_decay 200 --checkpoints_dir ./checkpoints --pairLst ./fashion_data/fasion-resize-pairs-train.csv --L1_type l1_plus_perL1 --n_layers_D 3 --with_D_PP 1 --with_D_PB 1  --display_id 0

Test the model

Market1501

python test.py --dataroot ./market_data/ --name market_PATN --model PATN --phase test --dataset_mode keypoint --norm batch --batchSize 1 --resize_or_crop no --gpu_ids 2 --BP_input_nc 18 --no_flip --which_model_netG PATN --checkpoints_dir ./checkpoints --pairLst ./market_data/market-pairs-test.csv --which_epoch latest --results_dir ./results --display_id 0

DeepFashion

python test.py --dataroot ./fashion_data/ --name fashion_PATN --model PATN --phase test --dataset_mode keypoint --norm instance --batchSize 1 --resize_or_crop no --gpu_ids 0 --BP_input_nc 18 --no_flip --which_model_netG PATN --checkpoints_dir ./checkpoints --pairLst ./fashion_data/fasion-resize-pairs-test.csv --which_epoch latest --results_dir ./results --display_id 0

Evaluation

We adopt SSIM, mask-SSIM, IS, mask-IS, DS, and PCKh for evaluation of Market-1501. SSIM, IS, DS, PCKh for DeepFashion.

1) SSIM and mask-SSIM, IS and mask-IS, mask-SSIM

For evaluation, Tensorflow 1.4.1(python3) is required. Please see requirements_tf.txt for details.

For Market-1501:

python tool/getMetrics_market.py

For DeepFashion:

python tool/getMetrics_market.py

If you still have problems for evaluation, please consider using docker.

docker run -v <Pose-Transfer path>:/tmp -w /tmp --runtime=nvidia -it --rm tensorflow/tensorflow:1.4.1-gpu-py3 bash
# now in docker:
$ pip install scikit-image tqdm 
$ python tool/getMetrics_market.py

Refer to this Issue.

2) DS Score

Download pretrained on VOC 300x300 model and install propper caffe version SSD. Put it in the ssd_score forlder.

For Market-1501:

python compute_ssd_score_market.py --input_dir path/to/generated/images

For DeepFashion:

python compute_ssd_score_fashion.py --input_dir path/to/generated/images

3) PCKh

  • First, run tool/crop_market.py or tool/crop_fashion.py.
  • Download pose estimator from Google Drive or Baidu Disk. Put it under the root folder Pose-Transfer.
  • Change the paths input_folder and output_path in tool/compute_coordinates.py. And then launch
python2 compute_coordinates.py
  • run tool/calPCKH_fashion.py or tool/calPCKH_market.py

Pre-trained model

Our pre-trained model can be downloaded Google Drive or Baidu Disk.

Citation

If you use this code for your research, please cite our paper.

@inproceedings{zhu2019progressive,
  title={Progressive Pose Attention Transfer for Person Image Generation},
  author={Zhu, Zhen and Huang, Tengteng and Shi, Baoguang and Yu, Miao and Wang, Bofei and Bai, Xiang},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  pages={2347--2356},
  year={2019}
}

Acknowledgments

Our code is based on the popular pytorch-CycleGAN-and-pix2pix.

Owner
Tengteng Huang
Tengteng Huang
Repository for Driving Style Recognition algorithms for Autonomous Vehicles

Driving Style Recognition Using Interval Type-2 Fuzzy Inference System and Multiple Experts Decision Making Created by Iago Pachêco Gomes at USP - ICM

Iago Gomes 9 Nov 28, 2022
这是一个facenet-pytorch的库,可以用于训练自己的人脸识别模型。

Facenet:人脸识别模型在Pytorch当中的实现 目录 性能情况 Performance 所需环境 Environment 注意事项 Attention 文件下载 Download 预测步骤 How2predict 训练步骤 How2train 参考资料 Reference 性能情况 训练数据

Bubbliiiing 210 Jan 06, 2023
[NeurIPS 2021] Towards Better Understanding of Training Certifiably Robust Models against Adversarial Examples | ⛰️⚠️

Towards Better Understanding of Training Certifiably Robust Models against Adversarial Examples This repository is the official implementation of "Tow

Sungyoon Lee 4 Jul 12, 2022
This repo is to be freely used by ML devs to check the GAN performances without coding from scratch.

GANs for Fun Created because I can! GOAL The goal of this repo is to be freely used by ML devs to check the GAN performances without coding from scrat

Sagnik Roy 13 Jan 26, 2022
Introduction to CPM

CPM CPM is an open-source program on large-scale pre-trained models, which is conducted by Beijing Academy of Artificial Intelligence and Tsinghua Uni

Tsinghua AI 136 Dec 23, 2022
Unrolled Variational Bayesian Algorithm for Image Blind Deconvolution

unfoldedVBA Unrolled Variational Bayesian Algorithm for Image Blind Deconvolution This repository contains the Pytorch implementation of the unrolled

Yunshi HUANG 2 Jul 10, 2022
Differential fuzzing for the masses!

NEZHA NEZHA is an efficient and domain-independent differential fuzzer developed at Columbia University. NEZHA exploits the behavioral asymmetries bet

147 Dec 05, 2022
Code for "Reconstructing 3D Human Pose by Watching Humans in the Mirror", CVPR 2021 oral

Reconstructing 3D Human Pose by Watching Humans in the Mirror Qi Fang*, Qing Shuai*, Junting Dong, Hujun Bao, Xiaowei Zhou CVPR 2021 Oral The videos a

ZJU3DV 178 Dec 13, 2022
GitHub repository for the ICLR Computational Geometry & Topology Challenge 2021

ICLR Computational Geometry & Topology Challenge 2022 Welcome to the ICLR 2022 Computational Geometry & Topology challenge 2022 --- by the ICLR 2022 W

42 Dec 13, 2022
PySlowFast: video understanding codebase from FAIR for reproducing state-of-the-art video models.

PySlowFast PySlowFast is an open source video understanding codebase from FAIR that provides state-of-the-art video classification models with efficie

Meta Research 5.3k Jan 03, 2023
Confident Semantic Ranking Loss for Part Parsing

Confident Semantic Ranking Loss for Part Parsing

Jiachen Xu 5 Oct 22, 2022
Low Complexity Channel estimation with Neural Network Solutions

Interpolation-ResNet Invited paper for WSA 2021, called 'Low Complexity Channel estimation with Neural Network Solutions'. Low complexity residual con

Dianxin 10 Dec 10, 2022
WORD: Revisiting Organs Segmentation in the Whole Abdominal Region

WORD: Revisiting Organs Segmentation in the Whole Abdominal Region. This repository provides the codebase and dataset for our work WORD: Revisiting Or

Healthcare Intelligence Laboratory 71 Jan 07, 2023
Code of Classification Saliency-Based Rule for Visible and Infrared Image Fusion

CSF Code of Classification Saliency-Based Rule for Visible and Infrared Image Fusion Tips: For testing: CUDA_VISIBLE_DEVICES=0 python main.py For trai

Han Xu 14 Oct 31, 2022
Supplementary code for TISMIR paper "Sliding-Window Pitch-Class Histograms as a Means of Modeling Musical Form"

Sliding-Window Pitch-Class Histograms as a Means of Modeling Musical Form This is supplementary code for the TISMIR paper Sliding-Window Pitch-Class H

1 Nov 27, 2021
This is the code repository implementing the paper "TreePartNet: Neural Decomposition of Point Clouds for 3D Tree Reconstruction".

TreePartNet This is the code repository implementing the paper "TreePartNet: Neural Decomposition of Point Clouds for 3D Tree Reconstruction". Depende

刘彦超 34 Nov 30, 2022
A2LP for short, ECCV2020 spotlight, Investigating SSL principles for UDA problems

Label-Propagation-with-Augmented-Anchors (A2LP) Official codes of the ECCV2020 spotlight (label propagation with augmented anchors: a simple semi-supe

20 Oct 27, 2022
Dual Attention Network for Scene Segmentation (CVPR2019)

Dual Attention Network for Scene Segmentation(CVPR2019) Jun Fu, Jing Liu, Haijie Tian, Yong Li, Yongjun Bao, Zhiwei Fang,and Hanqing Lu Introduction W

Jun Fu 2.2k Dec 28, 2022
Face Detection and Alignment using Multi-task Cascaded Convolutional Networks (MTCNN)

Face-Detection-with-MTCNN Face detection is a computer vision problem that involves finding faces in photos. It is a trivial problem for humans to sol

Chetan Hirapara 3 Oct 07, 2022