Disentangled Face Attribute Editing via Instance-Aware Latent Space Search, accepted by IJCAI 2021.

Related tags

Deep LearningIALS
Overview

Instance-Aware Latent-Space Search

This is a PyTorch implementation of the following paper:

Disentangled Face Attribute Editing via Instance-Aware Latent Space Search, IJCAI 2021.

Yuxuan Han, Jiaolong Yang and Ying Fu

Paper: https://arxiv.org/abs/2105.12660.

Abstract: Recent works have shown that a rich set of semantic directions exist in the latent space of Generative Adversarial Networks (GANs), which enables various facial attribute editing applications. However, existing methods may suffer poor attribute variation disentanglement, leading to unwanted change of other attributes when altering the desired one. The semantic directions used by existing methods are at attribute level, which are difficult to model complex attribute correlations, especially in the presence of attribute distribution bias in GAN’s training set. In this paper, we propose a novel framework (IALS) that performs Instance-Aware Latent-Space Search to find semantic directions for disentangled attribute editing. The instance information is injected by leveraging the supervision from a set of attribute classifiers evaluated on the input images. We further propose a Disentanglement-Transformation (DT) metric to quantify the attribute transformation and disentanglement efficacy and find the optimal control factor between attribute-level and instance-specific directions based on it. Experimental results on both GAN-generated and real-world images collectively show that our method outperforms state-of-the-art methods proposed recently by a wide margin.

Requirements

It's quite easy to create the environment for our model, you only need:

  • Python 3.7 and the basic Anaconda3 environment.
  • PyTorch 1.x with GPU support (a single NVIDIA GTX 1060 is enough).
  • The tqdm library to visualize the progress bar.

Reproduce Results

Download the pretrain directory from here and put it on the root directory of this repository. If your environment meets our requirements, you will see an editing result in test_env.jpg using the following command.

python edit_single_attr.py --seed 0 --step 0.5 --n_steps 4 --dataset ffhq --base interfacegan --attr male --save_path test_env.jpg
  • Edit a random image generated by StyleGAN. You can specify the primal and condition attributes and the seed. Here we set gender as the primal attribute and expression as the condition attribute.
# reproduce our results:
python condition_manipulation.py --seed 0 --step 0.1 --n_steps 30 --dataset ffhq --base interfacegan --attr1 male --attr2 smiling --lambda1 0.75 --lambda2 0 --real_image 0 --save_path rand-ours.jpg

# reproduce interfacegan results:
python condition_manipulation.py --seed 0 --step 0.1 --n_steps 30 --dataset ffhq --base interfacegan --attr1 male --attr2 smiling --lambda1 1 --lambda2 1 --real_image 0 --save_path rand-inter.jpg
  • Edit a real face image via our instance-aware direction. In the pretrain\real_latent_code folder we put lots of pretrained latent code provided by seeprettyface. If you want to edit customized face images, please refer to the next section. Note: If lambda1=lambda2=1, our method degrades to the attribute-level semantic direction based methods like InterfaceGAN and GANSpace.
# reproduce our results:
python condition_manipulation.py --seed 0 --step -0.1 --n_steps 30 --dataset ffhq --base interfacegan --attr1 young --attr2 eyeglasses --lambda1 0.75 --lambda2 0 --real_image 1 --latent_code_path pretrain\real_latent_code\real1.npy --save_path real-ours.jpg

# reproduce interfacegan results: 
python condition_manipulation.py --seed 0 --step -0.1 --n_steps 30 --dataset ffhq --base interfacegan --attr1 young --attr2 eyeglasses --lambda1 1 --lambda2 1 --real_image 1 --latent_code_path pretrain\real_latent_code\real1.npy --save_path real-inter.jpg
  • Compute the attribute-level direction by average the instance-specific direction.
python train_attr_level_direction.py --n_images 500 --attr pose

Editing Your Own Image

Typically you need to follow the steps below:

  1. Obtain the latent code of the real image via GAN Inversion. Here we provide a simple baseline GAN-Inversion method in gan_inversion.py.
python gan_inversion.py --n_iters 500 --img_path image\real_face_sample.jpg
  1. Editing the real face image's latent code with our method.
python condition_manipulation.py --seed 0 --step -0.1 --n_steps 10 --dataset ffhq --base interfacegan --attr1 male --attr2 smiling --lambda1 0.75 --lambda2 0 --real_image 1 --latent_code_path rec.npy --save_path real-ours.jpg

You will see the result like that:

To improve the editing quality, we highly recommand you to use the state-of-the-art GAN inversion method like Id-Invert or pixel2image2pixel. Note: You need to make sure that these GAN inversion methods use the same pretrained StyleGAN weights as us.

Contact

If you have any questions, please contact Yuxuan Han ([email protected]).

Citation

Please cite the following paper if this model helps your research:

@inproceedings{han2021IALS,
    title={Disentangled Face Attribute Editing via Instance-Aware Latent Space Search},
    author={Yuxuan Han, Jiaolong Yang and Ying Fu},
    booktitle={International Joint Conference on Artificial Intelligence},
    year={2021}
}

Acknowledgments

This code borrows the StyleGAN generator implementation from https://github.com/lernapparat/lernapparat and uses the pretrained real image's latent code provided by http://www.seeprettyface.com/index_page6.html. We thank for their great effort!

Owner
Currently a junior student at BIT, interested in computer vision and graphics.
Seeing if I can put together an interactive version of 3b1b's Manim in Streamlit

streamlit-manim Seeing if I can put together an interactive version of 3b1b's Manim in Streamlit Installation I had to install pango with sudo apt-get

Adrien Treuille 6 Aug 03, 2022
docTR by Mindee (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.

docTR by Mindee (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.

Mindee 1.5k Jan 01, 2023
Baseline and template code for node21 detection track

Nodule Detection Algorithm This codebase implements a baseline model, Faster R-CNN, for the nodule detection track in NODE21. It contains all necessar

node21challenge 11 Jan 15, 2022
Key information extraction from invoice document with Graph Convolution Network

Key Information Extraction from Scanned Invoices Key information extraction from invoice document with Graph Convolution Network Related blog post fro

Phan Hoang 39 Dec 16, 2022
Code for the paper "Training GANs with Stronger Augmentations via Contrastive Discriminator" (ICLR 2021)

Training GANs with Stronger Augmentations via Contrastive Discriminator (ICLR 2021) This repository contains the code for reproducing the paper: Train

Jongheon Jeong 174 Dec 29, 2022
Denoising Diffusion Implicit Models

Denoising Diffusion Implicit Models (DDIM) Jiaming Song, Chenlin Meng and Stefano Ermon, Stanford Implements sampling from an implicit model that is t

465 Jan 05, 2023
Unofficial implementation of Google "CutPaste: Self-Supervised Learning for Anomaly Detection and Localization" in PyTorch

CutPaste CutPaste: image from paper Unofficial implementation of Google's "CutPaste: Self-Supervised Learning for Anomaly Detection and Localization"

Lilit Yolyan 59 Nov 27, 2022
Prometheus exporter for Cisco Unified Computing System (UCS) Manager

prometheus-ucs-exporter Overview Use metrics from the UCS API to export relevant metrics to Prometheus This repository is a fork of Drew Stinnett's or

Marshall Wace 6 Nov 07, 2022
Improving Query Representations for DenseRetrieval with Pseudo Relevance Feedback:A Reproducibility Study.

APR The repo for the paper Improving Query Representations for DenseRetrieval with Pseudo Relevance Feedback:A Reproducibility Study. Environment setu

ielab 8 Nov 26, 2022
On the Analysis of French Phonetic Idiosyncrasies for Accent Recognition

On the Analysis of French Phonetic Idiosyncrasies for Accent Recognition With the spirit of reproducible research, this repository contains codes requ

0 Feb 24, 2022
CSKG is a commonsense knowledge graph that combines seven popular sources into a consolidated representation

CSKG: The CommonSense Knowledge Graph CSKG is a commonsense knowledge graph that combines seven popular sources into a consolidated representation: AT

USC ISI I2 85 Dec 12, 2022
Non-Vacuous Generalisation Bounds for Shallow Neural Networks

This package requires jax, tensorflow, and numpy. Either tensorflow or scikit-learn can be used for loading data. To run in a nix-shell with required

Felix Biggs 0 Feb 04, 2022
Unity Propagation in Bayesian Networks Handling Inconsistency via Unity Smoothing

This repository contains the scripts needed to generate the results from the paper Unity Propagation in Bayesian Networks Handling Inconsistency via U

0 Jan 19, 2022
Simple SN-GAN to generate CryptoPunks

CryptoPunks GAN Simple SN-GAN to generate CryptoPunks. Neural network architecture and training code has been modified from the PyTorch DCGAN example.

Teddy Koker 66 Dec 15, 2022
这是一个deeplabv3-plus-pytorch的源码,可以用于训练自己的模型。

DeepLabv3+:Encoder-Decoder with Atrous Separable Convolution语义分割模型在Pytorch当中的实现 目录 性能情况 Performance 所需环境 Environment 注意事项 Attention 文件下载 Download 训练步骤

Bubbliiiing 350 Dec 28, 2022
GPOEO is a micro-intrusive GPU online energy optimization framework for iterative applications

GPOEO GPOEO is a micro-intrusive GPU online energy optimization framework for iterative applications. We also implement ODPP [1] as a comparison. [1]

瑞雪轻飏 8 Sep 10, 2022
Fuse radar and camera for detection

SAF-FCOS: Spatial Attention Fusion for Obstacle Detection using MmWave Radar and Vision Sensor This project hosts the code for implementing the SAF-FC

ChangShuo 18 Jan 01, 2023
Implementation of various Vision Transformers I found interesting

Implementation of various Vision Transformers I found interesting

Kim Seonghyeon 78 Dec 06, 2022
Memory Defense: More Robust Classificationvia a Memory-Masking Autoencoder

Memory Defense: More Robust Classificationvia a Memory-Masking Autoencoder Authors: - Eashan Adhikarla - Dan Luo - Dr. Brian D. Davison Abstract Many

Eashan Adhikarla 4 Dec 25, 2022
Official git for "CTAB-GAN: Effective Table Data Synthesizing"

CTAB-GAN This is the official git paper CTAB-GAN: Effective Table Data Synthesizing. The paper is published on Asian Conference on Machine Learning (A

30 Dec 26, 2022