Official PyTorch implementation of the paper "Image-to-image Translation via Hierarchical Style Disentanglement".

License CC BY-NC-SA 4.0

HiSD: Image-to-image Translation via Hierarchical Style Disentanglement

HiSD is a state-of-the-art image-to-image translation method that offers both scalability to multiple labels and controllable diversity, with impressive disentanglement.

The styles used to manipulate each tag can be either generated from random noise or extracted from images.

The styles can also be smoothly interpolated.
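As a rough illustration of what interpolating between two styles means, the sketch below linearly blends two style codes. It assumes each style code is a plain vector of floats and that the blend is linear; the function name `interpolate_styles` is hypothetical and this is not the repository's implementation, which works on tensors.

```python
def interpolate_styles(style_a, style_b, num_steps):
    """Linearly interpolate between two equal-length style codes.

    Hypothetical helper for illustration only: style codes are assumed
    to be plain lists of floats.
    """
    if len(style_a) != len(style_b):
        raise ValueError("style codes must have the same length")
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2")
    steps = []
    for i in range(num_steps):
        t = i / (num_steps - 1)  # t runs from 0.0 (style_a) to 1.0 (style_b)
        steps.append([(1 - t) * a + t * b for a, b in zip(style_a, style_b)])
    return steps
```

Each intermediate code could then be passed to the translator in place of a single style to obtain a gradual transition between the two endpoints.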

All translations are produced by a single unified HiSD model that is trained end-to-end.

Easy Use (for Both Jupyter Notebook and Python Script)

Download the pretrained checkpoint from Baidu Drive (password: ihxf) or Google Drive, then place it in the root of this repo.

Open "easy_use.ipynb" and you can manipulate the facial attributes yourself!

If you haven't installed Jupyter, use "easy_use.py".

The script translates "examples/input_0.jpg" so that it has bangs generated from random noise and glasses extracted from "examples/reference_glasses_0.jpg".

Quick Start

Clone this repo:

git clone https://github.com/imlixinyang/HiSD.git
cd HiSD/

Install the dependencies: (Anaconda is recommended.)

conda create -n HiSD python=3.6.6
conda activate HiSD
conda install -y pytorch=1.0.1 torchvision=0.2.2  cudatoolkit=10.1 -c pytorch
pip install pillow tqdm tensorboardx pyyaml
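To confirm that the environment is complete before training, a small check like the one below can help. It is a sketch that only looks up the import names of the packages listed above (for example, pillow is imported as `PIL` and pyyaml as `yaml`); the helper name `missing_packages` is hypothetical, and the check does not verify versions or CUDA support.

```python
import importlib.util

# Install name -> import name for the packages installed above.
REQUIRED = {
    "pytorch": "torch",
    "torchvision": "torchvision",
    "pillow": "PIL",
    "tqdm": "tqdm",
    "tensorboardx": "tensorboardX",
    "pyyaml": "yaml",
}


def missing_packages(required=REQUIRED):
    """Return the install names whose import module cannot be found."""
    return [
        name
        for name, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing: " + ", ".join(missing))
    else:
        print("All dependencies found.")
```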

Download the dataset.

We recommend downloading CelebA-HQ from CelebAMask-HQ. In any case, the dataset folder should look like this:

celeba_or_celebahq
 - img_dir
   - img0
   - img1
   - ...
 - train_label.txt

Preprocess the dataset.

In our paper, we use the first 3000 images as the test set and the remaining 27000 for training. Carefully check the first few (always two) lines of the label file, which are formatted differently from the others.

python preprocessors/celeba-hq.py --img_path $your_image_path --label_path $your_label_path --target_path datasets --start 3002 --end 30002
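The values 3002 and 30002 follow from the split described above. The sketch below reproduces the arithmetic, assuming that `--start` and `--end` index lines of the label file as a half-open range (end excluded) and that the file begins with two header lines; the helper `split_bounds` is hypothetical and not part of the repository.

```python
def split_bounds(header_lines, num_test, num_total):
    """Return (start, end) line indices for the training part of the label file.

    Assumes the test images come first, right after the header lines, and
    that the end index is exclusive.
    """
    start = header_lines + num_test   # skip the headers and the test images
    end = header_lines + num_total    # one past the last labelled image
    return start, end


print(split_bounds(header_lines=2, num_test=3000, num_total=30000))
# → (3002, 30002)
```

Under the same assumptions, the test portion would correspond to the range from 2 to 3002.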

You will then get several ".txt" files in "datasets/". Each consists of lines containing the absolute path of an image and its tag-irrelevant conditions (Age and Gender by default).

Almost all custom datasets can be converted into special cases of HiSD. We provide a script for custom datasets. You need to organize the folder like:

your_training_set
 - Tag0
   - attribute0
     - img0
     - img1
     - ...
   - attribute1
     - ...
 - Tag1
 - ...

For example, for AFHQ (one tag with three attributes; remember to split the training and test sets first):

AFHQ_training
  - Category
    - cat
      - img0
      - img1
      - ...
    - dog
      - ...
    - wild
      - ...

You can then run:

python preprocessors/custom.py --imgs $your_training_set --target_path datasets/custom.txt
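Before running the preprocessor, it can be useful to check that a folder really follows the Tag/attribute/image layout shown above. The sketch below walks such a folder and assigns an index to each tag and attribute in sorted order. The function `scan_custom_dataset` and the index assignment are assumptions for illustration; this is not the repository's script and does not reproduce its output format.

```python
import os


def scan_custom_dataset(root):
    """Walk a root/Tag/attribute/image layout.

    Returns a list of (absolute_image_path, tag_index, attribute_index)
    tuples, with indices assigned in sorted directory-name order.
    """
    samples = []
    tags = sorted(
        d for d in os.listdir(root) if os.path.isdir(os.path.join(root, d))
    )
    for tag_idx, tag in enumerate(tags):
        tag_dir = os.path.join(root, tag)
        attrs = sorted(
            d for d in os.listdir(tag_dir)
            if os.path.isdir(os.path.join(tag_dir, d))
        )
        for attr_idx, attr in enumerate(attrs):
            attr_dir = os.path.join(tag_dir, attr)
            for name in sorted(os.listdir(attr_dir)):
                path = os.path.join(attr_dir, name)
                if os.path.isfile(path):
                    samples.append((os.path.abspath(path), tag_idx, attr_idx))
    return samples
```

For the AFHQ layout above, every image would get tag index 0, with attribute indices 0, 1, and 2 for cat, dog, and wild respectively.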

For other datasets, please write your own preprocessor.

Here, we provide some links for you to download other available datasets:

Datasets in bold are those on which we have tested the generalization of HiSD.

Train.

Follow "configs/celeba-hq.yaml" to adapt the config file to your machine and dataset.

For a single 1080Ti and CelebA-HQ, you can directly run:

python core/train.py --config configs/celeba-hq.yaml --gpus 0

The samples and checkpoints are saved in the "outputs/" directory.

[Figure: samples during the first 200k iterations of training on CelebA-HQ, tag 'Glasses' to attribute 'with']

Test.

Modify the 'steps' dict near the top of 'core/test.py' and run:

python core/test.py --config configs/celeba-hq.yaml --checkpoint $your_checkpoint --input_path $your_input_path --output_path results

$your_input_path can be either an image file or a folder of images. The default 'steps' add bangs and glasses to every image using random latent-guided styles.
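The file-or-folder behaviour of the input path can be pictured with a helper like the one below. This is only a sketch of the idea: the name `collect_inputs` and the extension filter are assumptions, and the actual handling in 'core/test.py' may differ.

```python
import os

# Assumed set of image extensions; adjust to match your data.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def collect_inputs(input_path):
    """Return a list of image paths for a single file or a folder of images."""
    if os.path.isdir(input_path):
        return sorted(
            os.path.join(input_path, name)
            for name in os.listdir(input_path)
            if name.lower().endswith(IMAGE_EXTENSIONS)
        )
    if os.path.isfile(input_path):
        return [input_path]
    raise FileNotFoundError(input_path)
```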

Evaluation metrics.

We use FID for quantitative comparison. For more details, please refer to the paper.

License

Licensed under the CC BY-NC-SA 4.0 (Attribution-NonCommercial-ShareAlike 4.0 International)

The code is released for academic research use only. For other uses, please contact me at [email protected].

Citation

If our paper helps your research, please cite it in your publications:

@misc{li2021imagetoimage,
      title={Image-to-image Translation via Hierarchical Style Disentanglement}, 
      author={Xinyang Li and Shengchuan Zhang and Jie Hu and Liujuan Cao and Xiaopeng Hong and Xudong Mao and Feiyue Huang and Yongjian Wu and Rongrong Ji},
      year={2021},
      eprint={2103.01456},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

I have tried my best to make the code easy to understand and modify, because I felt very lucky to start with the clear and readily comprehensible code of MUNIT when I was a beginner.

If you run into any problems, please feel free to contact me at [email protected] or raise an issue.
