Official pytorch implementation of paper "Image-to-image Translation via Hierarchical Style Disentanglement".

Overview

License CC BY-NC-SA 4.0

HiSD: Image-to-image Translation via Hierarchical Style Disentanglement

Official pytorch implementation of paper "Image-to-image Translation via Hierarchical Style Disentanglement".

fig

HiSD is the SOTA image-to-image translation method for both Scalability for multiple labels and Controllable Diversity with impressive disentanglement.

The styles to manipolate each tag in our method can be not only generated by random noise but also extracted from images!

Also, the styles can be smoothly interpolated like:

reference

All tranlsations are producted be a unified HiSD model and trained end-to-end.

Easy Use (for Both Jupyter Notebook and Python Script)

Download the pretrained checkpoint in Baidu Drive (Password:ihxf) or Google Drive. Then put it into the root of this repo.

Open "easy_use.ipynb" and you can manipolate the facial attributes by yourself!

If you haven't installed Jupyter, use "easy_use.py".

The script will translate "examples/input_0.jpg" to be with bangs generated by a random noise and glasses extracted from "examples/reference_glasses_0.jpg"

Quick Start

Clone this repo:

git clone https://github.com/imlixinyang/HiSD.git
cd HiSD/

Install the dependencies: (Anaconda is recommended.)

conda create -n HiSD python=3.6.6
conda activate HiSD
conda install -y pytorch=1.0.1 torchvision=0.2.2  cudatoolkit=10.1 -c pytorch
pip install pillow tqdm tensorboardx pyyaml

Download the dataset.

We recommend you to download CelebA-HQ from CelebAMask-HQ. Anyway you shound get the dataset folder like:

celeba_or_celebahq
 - img_dir
   - img0
   - img1
   - ...
 - train_label.txt

Preprocess the dataset.

In our paper, we use fisrt 3000 as test set and remaining 27000 for training. Carefully check the fisrt few (always two) lines in the label file which is not like the others.

python proprecessors/celeba-hq.py --img_path $your_image_path --label_path $your_label_path --target_path datasets --start 3002 --end 30002

Then you will get several ".txt" files in the "datasets/", each of them consists of lines of the absolute path of image and its tag-irrelevant conditions (Age and Gender by default).

Almost all custom datasets can be converted into special cases of HiSD. We provide a script for custom datasets. You need to organize the folder like:

your_training_set
 - Tag0
   - attribute0
     - img0
     - img1
     - ...
   - attribute1
     - ...
 - Tag1
 - ...

For example, the AFHQ (one tag and three attributes, remember to split the training and test set first):

AFHQ_training
  - Category
    - cat
      - img0
      - img1
      - ...
    - dog
      - ...
    - wild
      - ...

You can Run

python proprecessors/custom.py --imgs $your_training_set --target_path datasets/custom.txt

For other datasets, please code the preprocessor by yourself.

Here, we provide some links for you to download other available datasets:

Dataset in Bold means we have tested the generalization of HiSD for this dataset.

Train.

Following "configs/celeba-hq.yaml" to make the config file fit your machine and dataset.

For a single 1080Ti and CelebA-HQ, you can directly run:

python core/train.py --config configs/celeba-hq.yaml --gpus 0

The samples and checkpoints are in the "outputs/" dir. For Celeba-hq dataset, the samples during first 200k iterations will be like: (tag 'Glasses' to attribute 'with')

training

Test.

Modify the 'steps' dict in the first few lines in 'core/test.py' and run:

python core/test.py --config configs/celeba-hq.yaml --checkpoint $your_checkpoint --input_path $your_input_path --output_path results

$your_input_path can be either a image file or a folder of images. Default 'steps' make every image to be with bangs and glasses using random latent-guided styles.

Evaluation metrics.

We use FID for quantitative comparison. For more details, please refer to the paper.

License

Licensed under the CC BY-NC-SA 4.0 (Attribution-NonCommercial-ShareAlike 4.0 International)

The code is released for academic research use only. For other use, please contact me at [email protected].

Citation

If our paper helps your research, please cite it in your publications:

@misc{li2021imagetoimage,
      title={Image-to-image Translation via Hierarchical Style Disentanglement}, 
      author={Xinyang Li and Shengchuan Zhang and Jie Hu and Liujuan Cao and Xiaopeng Hong and Xudong Mao and Feiyue Huang and Yongjian Wu and Rongrong Ji},
      year={2021},
      eprint={2103.01456},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

I try my best to make the code easy to understand or further modified because I feel very lucky to start with the clear and readily comprehensible code of MUNIT when I'm a beginner.

If you have any problem, please feel free to contact me at [email protected] or raise an issue.

Related Work

This is an official implementation for "SimMIM: A Simple Framework for Masked Image Modeling".

Project This repo has been populated by an initial template to help get you started. Please make sure to update the content to build a great experienc

Microsoft 674 Dec 26, 2022
A PyTorch implementation of Learning to learn by gradient descent by gradient descent

Intro PyTorch implementation of Learning to learn by gradient descent by gradient descent. Run python main.py TODO Initial implementation Toy data LST

Ilya Kostrikov 300 Dec 11, 2022
On the adaptation of recurrent neural networks for system identification

On the adaptation of recurrent neural networks for system identification This repository contains the Python code to reproduce the results of the pape

Marco Forgione 3 Jan 13, 2022
This repository is for DSA and CP scripts for reference.

dsa-script-collections This Repo is the collection of DSA and CP scripts for reference. Contents Python Bubble Sort Insertion Sort Merge Sort Quick So

Aditya Kumar Pandey 9 Nov 22, 2022
Code reproduce for paper "Vehicle Re-identification with Viewpoint-aware Metric Learning"

VANET Code reproduce for paper "Vehicle Re-identification with Viewpoint-aware Metric Learning" Introduction This is the implementation of article VAN

EMDATA-AILAB 23 Dec 26, 2022
Next-gen Rowhammer fuzzer that uses non-uniform, frequency-based patterns.

Blacksmith Rowhammer Fuzzer This repository provides the code accompanying the paper Blacksmith: Scalable Rowhammering in the Frequency Domain that is

Computer Security Group @ ETH Zurich 173 Nov 16, 2022
The repository offers the official implementation of our BMVC 2021 paper in PyTorch.

CrossMLP Cascaded Cross MLP-Mixer GANs for Cross-View Image Translation Bin Ren1, Hao Tang2, Nicu Sebe1. 1University of Trento, Italy, 2ETH, Switzerla

Bingoren 16 Jul 27, 2022
PushForKiCad - AISLER Push for KiCad EDA

AISLER Push for KiCad Push your layout to AISLER with just one click for instant

AISLER 31 Dec 29, 2022
VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech Jaehyeon Kim, Jungil Kong, and Juhee Son In our rece

Jaehyeon Kim 1.7k Jan 08, 2023
Collection of tasks for fast prototyping, baselining, finetuning and solving problems with deep learning.

Collection of tasks for fast prototyping, baselining, finetuning and solving problems with deep learning Installation

Pytorch Lightning 1.6k Jan 08, 2023
This repository contains code and data for "On the Multimodal Person Verification Using Audio-Visual-Thermal Data"

trimodal_person_verification This repository contains the code, and preprocessed dataset featured in "A Study of Multimodal Person Verification Using

ISSAI 7 Aug 31, 2022
Improving adversarial robustness by a coupling rejection strategy

Adversarial Training with Rectified Rejection The code for the paper Adversarial Training with Rectified Rejection. Environment settings and libraries

Tianyu Pang 29 Jan 06, 2023
CTRL-C: Camera calibration TRansformer with Line-Classification

CTRL-C: Camera calibration TRansformer with Line-Classification This repository contains the official code and pretrained models for CTRL-C (Camera ca

57 Nov 14, 2022
The trained model and denoising example for paper : Cardiopulmonary Auscultation Enhancement with a Two-Stage Noise Cancellation Approach

The trained model and denoising example for paper : Cardiopulmonary Auscultation Enhancement with a Two-Stage Noise Cancellation Approach

ycj_project 1 Jan 18, 2022
这是一个unet-pytorch的源码,可以训练自己的模型

Unet:U-Net: Convolutional Networks for Biomedical Image Segmentation目标检测模型在Pytorch当中的实现 目录 性能情况 Performance 所需环境 Environment 注意事项 Attention 文件下载 Downl

Bubbliiiing 567 Jan 05, 2023
Pytorch implementation of XRD spectral identification from COD database

XRDidentifier Pytorch implementation of XRD spectral identification from COD database. Details will be explained in the paper to be submitted to NeurI

Masaki Adachi 4 Jan 07, 2023
Source code for "Understanding Knowledge Integration in Language Models with Graph Convolutions"

Graph Convolution Simulator (GCS) Source code for "Understanding Knowledge Integration in Language Models with Graph Convolutions" Requirements: PyTor

yifan 10 Oct 18, 2022
Framework for evaluating ANNS algorithms on billion scale datasets.

Billion-Scale ANN http://big-ann-benchmarks.com/ Install The only prerequisite is Python (tested with 3.6) and Docker. Works with newer versions of Py

Harsha Vardhan Simhadri 132 Dec 24, 2022
Supervised Classification from Text (P)

MSc-Thesis Module: Masters Research Thesis Language: Python Grade: 75 Title: An investigation of supervised classification of therapeutic process from

Matthew Laws 1 Nov 22, 2021
ICCV2021 Expert-Goal Trajectory Prediction

ICCV 2021: Where are you heading? Dynamic Trajectory Prediction with Expert Goal Examples This repository contains the code for the paper Where are yo

hz 21 Dec 12, 2022