Discovering Interpretable GAN Controls [NeurIPS 2020]

Overview

GANSpace: Discovering Interpretable GAN Controls

Python 3.7 PyTorch 1.3 Open In Colab teaser

Figure 1: Sequences of image edits performed using control discovered with our method, applied to three different GANs. The white insets specify the particular edits using notation explained in Section 3.4 ('Layer-wise Edits').

GANSpace: Discovering Interpretable GAN Controls
Erik Härkönen1,2, Aaron Hertzmann2, Jaakko Lehtinen1,3, Sylvain Paris2
1Aalto University, 2Adobe Research, 3NVIDIA
https://arxiv.org/abs/2004.02546

Abstract: This paper describes a simple technique to analyze Generative Adversarial Networks (GANs) and create interpretable controls for image synthesis, such as change of viewpoint, aging, lighting, and time of day. We identify important latent directions based on Principal Components Analysis (PCA) applied in activation space. Then, we show that interpretable edits can be defined based on layer-wise application of these edit directions. Moreover, we show that BigGAN can be controlled with layer-wise inputs in a StyleGAN-like manner. A user may identify a large number of interpretable controls with these mechanisms. We demonstrate results on GANs from various datasets.

Video: https://youtu.be/jdTICDa_eAI

Setup

See the setup instructions.

Usage

This repository includes versions of BigGAN, StyleGAN, and StyleGAN2 modified to support per-layer latent vectors.

Interactive model exploration

# Explore BigGAN-deep husky
python interactive.py --model=BigGAN-512 --class=husky --layer=generator.gen_z -n=1_000_000

# Explore StyleGAN2 ffhq in W space
python interactive.py --model=StyleGAN2 --class=ffhq --layer=style --use_w -n=1_000_000 -b=10_000

# Explore StyleGAN2 cars in Z space
python interactive.py --model=StyleGAN2 --class=car --layer=style -n=1_000_000 -b=10_000
# Apply previously saved edits interactively
python interactive.py --model=StyleGAN2 --class=ffhq --layer=style --use_w --inputs=out/directions

Visualize principal components

# Visualize StyleGAN2 ffhq W principal components
python visualize.py --model=StyleGAN2 --class=ffhq --use_w --layer=style -b=10_000

# Create videos of StyleGAN wikiart components (saved to ./out)
python visualize.py --model=StyleGAN --class=wikiart --use_w --layer=g_mapping -b=10_000 --batch --video

Options

Command line paramaters:
  --model      one of [ProGAN, BigGAN-512, BigGAN-256, BigGAN-128, StyleGAN, StyleGAN2]
  --class      class name; leave empty to list options
  --layer      layer at which to perform PCA; leave empty to list options
  --use_w      treat W as the main latent space (StyleGAN / StyleGAN2)
  --inputs     load previously exported edits from directory
  --sigma      number of stdevs to use in visualize.py
  -n           number of PCA samples
  -b           override automatic minibatch size detection
  -c           number of components to keep

Reproducibility

All figures presented in the main paper can be recreated using the included Jupyter notebooks:

  • Figure 1: figure_teaser.ipynb
  • Figure 2: figure_pca_illustration.ipynb
  • Figure 3: figure_pca_cleanup.ipynb
  • Figure 4: figure_style_content_sep.ipynb
  • Figure 5: figure_supervised_comp.ipynb
  • Figure 6: figure_biggan_style_resampling.ipynb
  • Figure 7: figure_edit_zoo.ipynb

Known issues

  • The interactive viewer sometimes freezes on startup on Ubuntu 18.04. The freeze is resolved by clicking on the terminal window and pressing the control key. Any insight into the issue would be greatly appreciated!

Integrating a new model

  1. Create a wrapper for the model in models/wrappers.py using the BaseModel interface.
  2. Add the model to get_model() in models/wrappers.py.

Importing StyleGAN checkpoints from TensorFlow

It is possible to import trained StyleGAN and StyleGAN2 weights from TensorFlow into GANSpace.

StyleGAN

  1. Install TensorFlow: conda install tensorflow-gpu=1.*.
  2. Modify methods __init__(), load_model() in models/wrappers.py under class StyleGAN.

StyleGAN2

  1. Follow the instructions in models/stylegan2/stylegan2-pytorch/README.md. Make sure to use the fork in this specific folder when converting the weights for compatibility reasons.
  2. Save the converted checkpoint as checkpoints/stylegan2/<dataset>_<resolution>.pt.
  3. Modify methods __init__(), download_checkpoint() in models/wrappers.py under class StyleGAN2.

Acknowledgements

We would like to thank:

  • The authors of the PyTorch implementations of BigGAN, StyleGAN, and StyleGAN2:
    Thomas Wolf, Piotr Bialecki, Thomas Viehmann, and Kim Seonghyeon.
  • Joel Simon from ArtBreeder for providing us with the landscape model for StyleGAN.
    (unfortunately we cannot distribute this model)
  • David Bau and colleagues for the excellent GAN Dissection project.
  • Justin Pinkney for the Awesome Pretrained StyleGAN collection.
  • Tuomas Kynkäänniemi for giving us a helping hand with the experiments.
  • The Aalto Science-IT project for providing computational resources for this project.

Citation

@inproceedings{härkönen2020ganspace,
  title     = {GANSpace: Discovering Interpretable GAN Controls},
  author    = {Erik Härkönen and Aaron Hertzmann and Jaakko Lehtinen and Sylvain Paris},
  booktitle = {Proc. NeurIPS},
  year      = {2020}
}

License

The code of this repository is released under the Apache 2.0 license.
The directory netdissect is a derivative of the GAN Dissection project, and is provided under the MIT license.
The directories models/biggan and models/stylegan2 are provided under the MIT license.

Owner
Erik Härkönen
PhD student at Aalto University
Erik Härkönen
Code release for NeurIPS 2020 paper "Co-Tuning for Transfer Learning"

CoTuning Official implementation for NeurIPS 2020 paper Co-Tuning for Transfer Learning. [News] 2021/01/13 The COCO 70 dataset used in the paper is av

THUML @ Tsinghua University 35 Sep 23, 2022
This repository contains the official implementation code of the paper Transformer-based Feature Reconstruction Network for Robust Multimodal Sentiment Analysis

This repository contains the official implementation code of the paper Transformer-based Feature Reconstruction Network for Robust Multimodal Sentiment Analysis, accepted at ACMMM 2021.

Ziqi Yuan 10 Sep 30, 2022
Viewmaker Networks: Learning Views for Unsupervised Representation Learning

Viewmaker Networks: Learning Views for Unsupervised Representation Learning Alex Tamkin, Mike Wu, and Noah Goodman Paper link: https://arxiv.org/abs/2

Alex Tamkin 31 Dec 01, 2022
Azua - build AI algorithms to aid efficient decision-making with minimum data requirements.

Project Azua 0. Overview Many modern AI algorithms are known to be data-hungry, whereas human decision-making is much more efficient. The human can re

Microsoft 197 Jan 06, 2023
PyTorch implementation of our ICCV paper DeFRCN: Decoupled Faster R-CNN for Few-Shot Object Detection.

Introduction This repo contains the official PyTorch implementation of our ICCV paper DeFRCN: Decoupled Faster R-CNN for Few-Shot Object Detection. Up

133 Dec 29, 2022
Weakly-supervised semantic image segmentation with CNNs using point supervision

Code for our ECCV paper What's the Point: Semantic Segmentation with Point Supervision. Summary This library is a custom build of Caffe for semantic i

27 Sep 14, 2022
Code for Parameter Prediction for Unseen Deep Architectures (NeurIPS 2021)

Parameter Prediction for Unseen Deep Architectures (NeurIPS 2021) authors: Boris Knyazev, Michal Drozdzal, Graham Taylor, Adriana Romero-Soriano Overv

Facebook Research 462 Jan 03, 2023
NEO: Non Equilibrium Sampling on the orbit of a deterministic transform

NEO: Non Equilibrium Sampling on the orbit of a deterministic transform Description of the code This repo describes the NEO estimator described in the

0 Dec 01, 2021
[CVPR 2021] Released code for Counterfactual Zero-Shot and Open-Set Visual Recognition

Counterfactual Zero-Shot and Open-Set Visual Recognition This project provides implementations for our CVPR 2021 paper Counterfactual Zero-S

144 Dec 24, 2022
Code for the paper "On the Power of Edge Independent Graph Models"

Edge Independent Graph Models Code for the paper: "On the Power of Edge Independent Graph Models" Sudhanshu Chanpuriya, Cameron Musco, Konstantinos So

Konstantinos Sotiropoulos 0 Oct 26, 2021
PiCIE: Unsupervised Semantic Segmentation using Invariance and Equivariance in clustering (CVPR2021)

PiCIE: Unsupervised Semantic Segmentation using Invariance and Equivariance in Clustering Jang Hyun Cho1, Utkarsh Mall2, Kavita Bala2, Bharath Harihar

Jang Hyun Cho 164 Dec 30, 2022
[NeurIPS 2021] "G-PATE: Scalable Differentially Private Data Generator via Private Aggregation of Teacher Discriminators"

G-PATE This is the official code base for our NeurIPS 2021 paper: "G-PATE: Scalable Differentially Private Data Generator via Private Aggregation of T

AI Secure 14 Oct 12, 2022
Image-to-Image Translation with Conditional Adversarial Networks (Pix2pix) implementation in keras

pix2pix-keras Pix2pix implementation in keras. Original paper: Image-to-Image Translation with Conditional Adversarial Networks (pix2pix) Paper Author

William Falcon 141 Dec 30, 2022
Repository of Jupyter notebook tutorials for teaching the Deep Learning Course at the University of Amsterdam (MSc AI), Fall 2020

Repository of Jupyter notebook tutorials for teaching the Deep Learning Course at the University of Amsterdam (MSc AI), Fall 2020

Phillip Lippe 1.1k Jan 07, 2023
A heterogeneous entity-augmented academic language model based on Open Academic Graph (OAG)

Library | Paper | Slack We released two versions of OAG-BERT in CogDL package. OAG-BERT is a heterogeneous entity-augmented academic language model wh

THUDM 58 Dec 17, 2022
Official Code Implementation of the paper : XAI for Transformers: Better Explanations through Conservative Propagation

Official Code Implementation of The Paper : XAI for Transformers: Better Explanations through Conservative Propagation For the SST-2 and IMDB expermin

Ameen Ali 23 Dec 30, 2022
Code Release for the paper "TriBERT: Full-body Human-centric Audio-visual Representation Learning for Visual Sound Separation"

TriBERT This repository contains the code for the NeurIPS 2021 paper titled "TriBERT: Full-body Human-centric Audio-visual Representation Learning for

UBC Computer Vision Group 8 Aug 31, 2022
The repository for our EMNLP 2021 paper "Finnish Dialect Identification: The Effect of Audio and Text"

Finnish Dialect Identification The repository for our EMNLP 2021 paper "Finnish Dialect Identification: The Effect of Audio and Text". We present a te

Rootroo Ltd 2 Dec 25, 2021
Using CNN to mimic the driver based on training data from Torcs

Behavioural-Cloning-in-autonomous-driving Using CNN to mimic the driver based on training data from Torcs. Approach First, the data was collected from

Sudharshan 2 Jan 05, 2022
Examples of how to create colorful, annotated equations in Latex using Tikz.

The file "eqn_annotate.tex" is the main latex file. This repository provides four examples of annotated equations: [example_prob.tex] A simple one ins

SyNeRCyS Research Lab 3.2k Jan 05, 2023