PyTorch implementation of few-shot semantic image synthesis

Last update: Sep 26, 2022

Overview

Few-shot Semantic Image Synthesis Using StyleGAN Prior

Our method can synthesize photorealistic images from dense or sparse semantic annotations using a few training pairs and a pre-trained StyleGAN.

Prerequisites

Python 3
PyTorch

Preparation

Download and decompress the file containing StyleGAN pre-trained models and put the "pretrained_models" directory in the parent directory.
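As a quick sanity check after unpacking, the sketch below lists which StyleGAN weight files are missing from "pretrained_models". The file names are copied from the --stylegan_weights flags in the training commands of this README; the helper name missing_weights and the assumption that the downloaded archive contains all six files are illustrative, not part of the repository.

```python
from pathlib import Path

# File names copied from the --stylegan_weights flags used in this README.
# Whether the downloaded archive ships all of them is an assumption.
EXPECTED_WEIGHTS = (
    "stylegan2-ffhq-config-f.pt",
    "stylegan2-church-config-f.pt",
    "stylegan2-car-config-f.pt",
    "stylegan2-cat-config-f.pt",
    "ukiyoe-256-slim-diffAug-002789.pt",
    "2020-01-11-skylion-stylegan2-animeportraits-networksnapshot-024664.pt",
)


def missing_weights(model_dir="pretrained_models", expected=EXPECTED_WEIGHTS):
    """Return the expected weight files that are not present in model_dir.

    Hypothetical helper, not part of the repository.
    """
    root = Path(model_dir)
    return [name for name in expected if not (root / name).is_file()]
```

Calling missing_weights() from the directory that holds "pretrained_models" returns an empty list when every file is in place. If you only train on one dataset, pass just that weight file via the expected argument.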

Inference with our pre-trained models

Download and decompress the file containing our pretrained encoders and put the "results" directory in the parent directory.
For example, our results for celebaMaskHQ in a one-shot setting can be generated as follows:

python scripts/inference.py --exp_dir=results/celebaMaskHQ_oneshot --checkpoint_path=results/celebaMaskHQ_oneshot/checkpoints/iteration_100000.pt --data_path=./data/CelebAMask-HQ/test/labels/ --couple_outputs --latent_mask=8,9,10,11,12,13,14,15,16,17

Inference results are generated in results/celebaMaskHQ_oneshot. If you use other datasets, please specify --exp_dir, --checkpoint_path, and --data_path appropriately.
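To adapt the command to another experiment, the three paths can be derived from the experiment directory. The following is a minimal sketch that rebuilds the one-shot example above. The helper inference_command is hypothetical, and it assumes that other experiments also store their checkpoints as checkpoints/iteration_<N>.pt and that --latent_mask takes a comma-separated list of indices, as the example suggests.

```python
def inference_command(exp_dir, data_path, iteration=100000,
                      latent_layers=range(8, 18)):
    """Build the argument list for scripts/inference.py.

    Hypothetical helper. Assumes checkpoints are stored as
    <exp_dir>/checkpoints/iteration_<N>.pt, as in the one-shot example.
    """
    checkpoint = f"{exp_dir}/checkpoints/iteration_{iteration}.pt"
    # Join the indices into the comma-separated form used by --latent_mask.
    latent_mask = ",".join(str(i) for i in latent_layers)
    return [
        "python", "scripts/inference.py",
        f"--exp_dir={exp_dir}",
        f"--checkpoint_path={checkpoint}",
        f"--data_path={data_path}",
        "--couple_outputs",
        f"--latent_mask={latent_mask}",
    ]


cmd = inference_command("results/celebaMaskHQ_oneshot",
                        "./data/CelebAMask-HQ/test/labels/")
print(" ".join(cmd))
```

The returned list can be passed directly to subprocess.run(cmd, check=True) from the repository root.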

Training

For each dataset, you can train an encoder as follows:

CelebAMask

python scripts/train.py --exp_dir=[result_dir] --dataset_type=celebs_seg_to_face --stylegan_weights pretrained_models/stylegan2-ffhq-config-f.pt --start_from_latent_avg --label_nc 19 --input_nc 19

CelebALandmark

python scripts/train.py --exp_dir=[result_dir] --dataset_type=celebs_landmark_to_face --stylegan_weights pretrained_models/stylegan2-ffhq-config-f.pt --start_from_latent_avg --label_nc 71 --input_nc 71 --sparse_labeling

Figure: intermediate training outputs using the StyleGAN pre-trained on the CelebA-HQ dataset. The top row shows StyleGAN samples, the middle row shows pseudo semantic masks, and the bottom row shows images reconstructed from those masks. As training progresses, the layouts of the reconstructed images gradually approach those of the StyleGAN samples.

LSUN church

python scripts/train.py --exp_dir=[result_dir] --dataset_type=lsunchurch_seg_to_img --stylegan_weights pretrained_models/stylegan2-church-config-f.pt --style_num 14 --start_from_latent_avg --label_nc 151 --input_nc 151

LSUN car

python scripts/train.py --exp_dir=[result_dir] --dataset_type=lsuncar_seg_to_img --stylegan_weights pretrained_models/stylegan2-car-config-f.pt --style_num 16 --start_from_latent_avg --label_nc 5 --input_nc 5

LSUN cat

python scripts/train.py --exp_dir=[result_dir] --dataset_type=lsuncat_scribble_to_img --stylegan_weights pretrained_models/stylegan2-cat-config-f.pt --style_num 14 --start_from_latent_avg --label_nc 9 --input_nc 9 --sparse_labeling

Ukiyo-e

python scripts/train.py --exp_dir=[result_dir] --dataset_type=ukiyo-e_scribble_to_img --stylegan_weights pretrained_models/ukiyoe-256-slim-diffAug-002789.pt --style_num 14 --channel_multiplier 1 --start_from_latent_avg --label_nc 8 --input_nc 8 --sparse_labeling

Anime

python scripts/train.py --exp_dir=[result_dir] --dataset_type=anime_cross_to_img --stylegan_weights pretrained_models/2020-01-11-skylion-stylegan2-animeportraits-networksnapshot-024664.pt --style_num 16 --start_from_latent_avg --label_nc 2 --input_nc 2 --sparse_labeling
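The training commands above differ only in a few flags, so they can be summarized as a table. The sketch below transcribes those flags into a dictionary and rebuilds each command from it. The names TRAIN_CONFIGS and train_command are hypothetical. Where the README omits --style_num or --channel_multiplier, the sketch omits the flag too instead of guessing the script's default.

```python
# Per-dataset options transcribed from the training commands in this README:
# (dataset_type, weights file, style_num, channel_multiplier, label_nc, sparse)
# None means the README does not pass that flag for the dataset.
TRAIN_CONFIGS = {
    "CelebAMask": ("celebs_seg_to_face", "stylegan2-ffhq-config-f.pt",
                   None, None, 19, False),
    "CelebALandmark": ("celebs_landmark_to_face", "stylegan2-ffhq-config-f.pt",
                       None, None, 71, True),
    "LSUN church": ("lsunchurch_seg_to_img", "stylegan2-church-config-f.pt",
                    14, None, 151, False),
    "LSUN car": ("lsuncar_seg_to_img", "stylegan2-car-config-f.pt",
                 16, None, 5, False),
    "LSUN cat": ("lsuncat_scribble_to_img", "stylegan2-cat-config-f.pt",
                 14, None, 9, True),
    "Ukiyo-e": ("ukiyo-e_scribble_to_img", "ukiyoe-256-slim-diffAug-002789.pt",
                14, 1, 8, True),
    "Anime": ("anime_cross_to_img",
              "2020-01-11-skylion-stylegan2-animeportraits-networksnapshot-024664.pt",
              16, None, 2, True),
}


def train_command(name, exp_dir):
    """Rebuild the scripts/train.py command for one dataset (hypothetical helper)."""
    dataset_type, weights, style_num, channel_mult, label_nc, sparse = TRAIN_CONFIGS[name]
    cmd = ["python", "scripts/train.py",
           f"--exp_dir={exp_dir}",
           f"--dataset_type={dataset_type}",
           "--stylegan_weights", f"pretrained_models/{weights}"]
    if style_num is not None:
        cmd += ["--style_num", str(style_num)]
    if channel_mult is not None:
        cmd += ["--channel_multiplier", str(channel_mult)]
    cmd.append("--start_from_latent_avg")
    # Every command in the README sets --input_nc to the same value as --label_nc.
    cmd += ["--label_nc", str(label_nc), "--input_nc", str(label_nc)]
    if sparse:
        cmd.append("--sparse_labeling")
    return cmd
```

For example, " ".join(train_command("LSUN car", "[result_dir]")) reproduces the LSUN car command listed above. Note that --sparse_labeling appears only for the landmark, scribble, and cross annotation types.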

Using StyleGAN samples as few-shot training data

Run the following script:

python scripts/generate_stylegan_samples.py --exp_dir=[result_dir] --stylegan_weights ./pretrained_models/stylegan2-ffhq-config-f.pt --style_num 18 --channel_multiplier 2

This produces a StyleGAN image (*.png) in [result_dir]/data/images and its corresponding latent code (*.pt) in [result_dir]/checkpoints.

Manually annotate the generated image in [result_dir]/data/images and save the annotated mask in [result_dir]/data/labels.
Edit ./config/data_configs.py and ./config/paths_config.py appropriately to use the annotated pairs as a training set.
Run a training command above with appropriate options.
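Before training on the annotated pairs, it may help to confirm that every generated image has a mask. The sketch below is a hypothetical check, not part of the repository. It assumes that each mask in [result_dir]/data/labels has the same file name stem as its image in [result_dir]/data/images; adjust it if your annotation tool names files differently.

```python
from pathlib import Path


def unpaired_samples(result_dir, image_ext=".png"):
    """Return stems of images in data/images with no same-named file in data/labels.

    Hypothetical helper. Assumes a mask and its image share the same file
    name stem (e.g. 0001.png in both directories).
    """
    images = Path(result_dir) / "data" / "images"
    labels = Path(result_dir) / "data" / "labels"
    label_stems = {p.stem for p in labels.glob("*")} if labels.is_dir() else set()
    return sorted(p.stem for p in images.glob(f"*{image_ext}")
                  if p.stem not in label_stems)
```

An empty list means every image has a corresponding mask. Any returned stems identify images that still need annotation.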

Citation

Please cite our paper if you find the code useful:

@article{endo2021fewshotsmis,
  title = {Few-shot Semantic Image Synthesis Using StyleGAN Prior},
  author = {Yuki Endo and Yoshihiro Kanamori},
  journal   = {CoRR},
  volume    = {abs/2103.14877},
  year      = {2021}
}

Acknowledgements

This code heavily borrows from the pixel2style2pixel repository.
