[SIGGRAPH Asia 2021] Pose with Style: Detail-Preserving Pose-Guided Image Synthesis with Conditional StyleGAN

Overview

Pose with Style: Detail-Preserving Pose-Guided Image Synthesis with Conditional StyleGAN

[Paper] [Project Website] [Output resutls]

Official Pytorch implementation for Pose with Style: Detail-Preserving Pose-Guided Image Synthesis with Conditional StyleGAN. Please contact Badour AlBahar ([email protected]) if you have any questions.

Requirements

conda create -n posewithstyle python=3.6
conda activate posewithstyle
conda install pytorch==1.9.0 torchvision==0.10.0 torchaudio==0.9.0 cudatoolkit=10.2 -c pytorch
pip install -r requirements.txt

Intall openCV using conda install -c conda-forge opencv or pip install opencv-python. If you would like to use wandb, install it using pip install wandb.

Download pretrained models

You can download the pretrained model here, and the pretrained coordinate completion model here.

Note: we also provide the pretrained model trained on StylePoseGAN [Sarkar et al. 2021] DeepFashion train/test split here. We also provide this split's pretrained coordinate completion model here.

Reposing

Download the UV space - 2D look up map and save it in util folder.

We provide sample data in data directory. The output will be saved in data/output directory.

python inference.py --input_path ./data --CCM_pretrained_model path/to/CCM_epoch50.pt --pretrained_model path/to/posewithstyle.pt

To repose your own images you need to put the input image (input_name+'.png'), dense pose (input_name+'_iuv.png'), and silhouette (input_name+'_sil.png'), as well as the target dense pose (target_name+'_iuv.png') in data directory.

python inference.py --input_path ./data --input_name fashionWOMENDressesid0000262902_3back --target_name fashionWOMENDressesid0000262902_1front --CCM_pretrained_model path/to/CCM_epoch50.pt --pretrained_model path/to/posewithstyle.pt

Garment transfer

Download the UV space - 2D look up map and the UV space body part segmentation. Save both in util folder. The UV space body part segmentation will provide a generic segmentation of the human body. Alternatively, you can specify your own mask of the region you want to transfer.

We provide sample data in data directory. The output will be saved in data/output directory.

python garment_transfer.py --input_path ./data --CCM_pretrained_model path/to/CCM_epoch50.pt --pretrained_model path/to/posewithstyle.pt --part upper_body

To use your own images you need to put the input image (input_name+'.png'), dense pose (input_name+'_iuv.png'), and silhouette (input_name+'_sil.png'), as well as the garment source target image (target_name+'.png'), dense pose (target_name+'_iuv.png'), and silhouette (target_name+'_sil.png') in data directory. You can specify the part to be transferred using --part as upper_body, lower_body, or face. The output as well as the part transferred (shown in red) will be saved in data/output directory.

python garment_transfer.py --input_path ./data --input_name fashionWOMENSkirtsid0000177102_1front --target_name fashionWOMENBlouses_Shirtsid0000635004_1front --CCM_pretrained_model path/to/CCM_epoch50.pt --pretrained_model path/to/posewithstyle.pt --part upper_body

DeepFashion Dataset

To train or test, you must download and process the dataset. Please follow instructions in Dataset and Downloads.

You should have the following downloaded in your DATASET folder:

DATASET/DeepFashion_highres
 - train
 - test
 - tools
   - train.lst
   - test.lst
   - fashion-pairs-train.csv
   - fashion-pairs-test.csv

DATASET/densepose
 - train
 - test

DATASET/silhouette
 - train
 - test

DATASET/partial_coordinates
 - train
 - test

DATASET/complete_coordinates
 - train
 - test

DATASET/resources
 - train_face_T.pickle
 - sphere20a_20171020.pth

Training

Step 1: First, train the reposing model by focusing on generating the foreground. We set the batch size to 1 and train for 50 epochs. This training process takes around 7 days on 8 NVIDIA 2080 Ti GPUs.

python -m torch.distributed.launch --nproc_per_node=8 --master_port XXXX train.py --batch 1 /path/to/DATASET --name exp_name_step1 --size 512 --faceloss --epoch 50

The checkpoints will be saved in checkpoint/exp_name.

Step 2: Then, finetune the model by training on the entire image (only masking the padded boundary). We set the batch size to 8 and train for 10 epochs. This training process takes less than 2 days on 2 A100 GPUs.

python -m torch.distributed.launch --nproc_per_node=2 --master_port XXXX train.py --batch 8 /path/to/DATASET --name exp_name_step2 --size 512 --faceloss --epoch 10 --ckpt /path/to/step1/pretrained/model --finetune

Testing

To test the reposing model and generate the reposing results:

python test.py /path/to/DATASET --pretrained_model /path/to/step2/pretrained/model --size 512 --save_path /path/to/save/output

Output images will be saved in --save_path.

You can find our reposing output images here.

Evaluation

We follow the same evaluation code as Global-Flow-Local-Attention.

Bibtex

Please consider citing our work if you find it useful for your research:

@article{albahar2021pose,
    title   = {Pose with {S}tyle: {D}etail-Preserving Pose-Guided Image Synthesis with Conditional StyleGAN},
  author  = {AlBahar, Badour and Lu, Jingwan and Yang, Jimei and Shu, Zhixin and Shechtman, Eli and Huang, Jia-Bin},
    journal = {ACM Transactions on Graphics},
  year    = {2021}
}

Acknowledgments

This code is heavily borrowed from Rosinality: StyleGAN 2 in PyTorch.

Improving Object Detection by Estimating Bounding Box Quality Accurately

Improving Object Detection by Estimating Bounding Box Quality Accurately Abstrac

2 Apr 14, 2022
Code for Domain Adaptive Video Segmentation via Temporal Consistency Regularization in ICCV 2021

Domain Adaptive Video Segmentation via Temporal Consistency Regularization Updates 08/2021: check out our domain adaptation for sematic segmentation p

36 Dec 12, 2022
An unofficial personal implementation of UM-Adapt, specifically to tackle joint estimation of panoptic segmentation and depth prediction for autonomous driving datasets.

Semisupervised Multitask Learning This repository is an unofficial and slightly modified implementation of UM-Adapt[1] using PyTorch. This code primar

Abhinav Atrishi 11 Nov 25, 2022
PyTorch Implementation of Vector Quantized Variational AutoEncoders.

Pytorch implementation of VQVAE. This paper combines 2 tricks: Vector Quantization (check out this amazing blog for better understanding.) Straight-Th

Vrushank Changawala 2 Oct 06, 2021
Parameter-ensemble-differential-evolution - Shows how to do parameter ensembling using differential evolution.

Ensembling parameters with differential evolution This repository shows how to ensemble parameters of two trained neural networks using differential e

Sayak Paul 9 May 04, 2022
FIRM-AFL is the first high-throughput greybox fuzzer for IoT firmware.

FIRM-AFL FIRM-AFL is the first high-throughput greybox fuzzer for IoT firmware. FIRM-AFL addresses two fundamental problems in IoT fuzzing. First, it

356 Dec 23, 2022
Multi-Scale Geometric Consistency Guided Multi-View Stereo

ACMM [News] The code for ACMH is released!!! [News] The code for ACMP is released!!! About ACMM is a multi-scale geometric consistency guided multi-vi

Qingshan Xu 118 Jan 04, 2023
multimodal transformer

This repo holds the code to perform experiments with the multimodal autoregressive probabilistic model Transflower. Overview of the repo It is structu

Guillermo Valle 68 Dec 13, 2022
Per-Pixel Classification is Not All You Need for Semantic Segmentation

MaskFormer: Per-Pixel Classification is Not All You Need for Semantic Segmentation Bowen Cheng, Alexander G. Schwing, Alexander Kirillov [arXiv] [Proj

Facebook Research 1k Jan 08, 2023
A curated list of automated deep learning (including neural architecture search and hyper-parameter optimization) resources.

Awesome AutoDL A curated list of automated deep learning related resources. Inspired by awesome-deep-vision, awesome-adversarial-machine-learning, awe

D-X-Y 2k Dec 30, 2022
Official PyTorch Implementation of paper "Deep 3D Mask Volume for View Synthesis of Dynamic Scenes", ICCV 2021.

Deep 3D Mask Volume for View Synthesis of Dynamic Scenes Official PyTorch Implementation of paper "Deep 3D Mask Volume for View Synthesis of Dynamic S

Ken Lin 17 Oct 12, 2022
PyTorch source code for Distilling Knowledge by Mimicking Features

LSHFM.detection This is the PyTorch source code for Distilling Knowledge by Mimicking Features. And this project contains code for object detection wi

Guo-Hua Wang 4 Dec 17, 2022
EncT5: Fine-tuning T5 Encoder for Non-autoregressive Tasks

EncT5 (Unofficial) Pytorch Implementation of EncT5: Fine-tuning T5 Encoder for Non-autoregressive Tasks About Finetune T5 model for classification & r

Jangwon Park 34 Jan 01, 2023
The code release of paper Low-Light Image Enhancement with Normalizing Flow

[AAAI 2022] Low-Light Image Enhancement with Normalizing Flow Paper | Project Page Low-Light Image Enhancement with Normalizing Flow Yufei Wang, Renji

Yufei Wang 176 Jan 06, 2023
A Python library for Deep Graph Networks

PyDGN Wiki Description This is a Python library to easily experiment with Deep Graph Networks (DGNs). It provides automatic management of data splitti

Federico Errica 194 Dec 22, 2022
PyTorch reimplementation of the Smooth ReLU activation function proposed in the paper "Real World Large Scale Recommendation Systems Reproducibility and Smooth Activations" [arXiv 2022].

Smooth ReLU in PyTorch Unofficial PyTorch reimplementation of the Smooth ReLU (SmeLU) activation function proposed in the paper Real World Large Scale

Christoph Reich 10 Jan 02, 2023
Official repository for "Restormer: Efficient Transformer for High-Resolution Image Restoration". SOTA for motion deblurring, image deraining, denoising (Gaussian/real data), and defocus deblurring.

Restormer: Efficient Transformer for High-Resolution Image Restoration Syed Waqas Zamir, Aditya Arora, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan,

Syed Waqas Zamir 906 Dec 30, 2022
Official code of our work, Unified Pre-training for Program Understanding and Generation [NAACL 2021].

PLBART Code pre-release of our work, Unified Pre-training for Program Understanding and Generation accepted at NAACL 2021. Note. A detailed documentat

Wasi Ahmad 138 Dec 30, 2022
RuDOLPH: One Hyper-Modal Transformer can be creative as DALL-E and smart as CLIP

[Paper] [Хабр] [Model Card] [Colab] [Kaggle] RuDOLPH 🦌 🎄 ☃️ One Hyper-Modal Tr

Sber AI 230 Dec 31, 2022
Fuzzing the Kernel Using Unicornafl and AFL++

Unicorefuzz Fuzzing the Kernel using UnicornAFL and AFL++. For details, skim through the WOOT paper or watch this talk at CCCamp19. Is it any good? ye

Security in Telecommunications 283 Dec 26, 2022