[CVPR 2021] Anycost GANs for Interactive Image Synthesis and Editing

Overview

Anycost GAN

video | paper | website

Anycost GANs for Interactive Image Synthesis and Editing

Ji Lin, Richard Zhang, Frieder Ganz, Song Han, Jun-Yan Zhu

MIT, Adobe Research, CMU

In CVPR 2021

flexible

Anycost GAN generates consistent outputs under various computational budgets.

Demo

Here, we can use the Anycost generator for interactive image editing. A full generator takes ~3s to render an image, which is too slow for editing. While with Anycost generator, we can provide a visually similar preview at 5x faster speed. After adjustment, we hit the "Finalize" button to synthesize the high-quality final output. Check here for the full demo.

Overview

Anycost generators can be run at diverse computation costs by using different channel and resolution configurations. Sub-generators achieve high output consistency compared to the full generator, providing a fast preview.

overview

With (1) Sampling-based multi-resolution training, (2) adaptive-channel training, and (3) generator-conditioned discriminator, we achieve high image quality and consistency at different resolutions and channels.

method

Results

Anycost GAN (uniform channel version) supports 4 resolutions and 4 channel ratios, producing visually consistent images with different image fidelity.

uniform

The consistency retains during image projection and editing:

Usage

Getting Started

  • Clone this repo:
git clone https://github.com/mit-han-lab/anycost-gan.git
cd anycost-gan
  • Install PyTorch 1.7 and other dependeinces.

We recommend setting up the environment using Anaconda: conda env create -f environment.yml

Introduction Notebook

We provide a jupyter notebook example to show how to use the anycost generator for image synthesis at diverse costs: notebooks/intro.ipynb.

We also provide a colab version of the notebook: . Be sure to select the GPU as the accelerator in runtime options.

Interactive Demo

We provide an interactive demo showing how we can use anycost GAN to enable interactive image editing. To run the demo:

python demo.py

You can find a video recording of the demo here.

Using Pre-trained Models

To get the pre-trained generator, encoder, and editing directions, run:

import model

pretrained_type = 'generator'  # choosing from ['generator', 'encoder', 'boundary']
config_name = 'anycost-ffhq-config-f'  # replace the config name for other models
model.get_pretrained(pretrained_type, config=config_name)

We also provide the face attribute classifier (which is general for different generators) for computing the editing directions. You can get it by running:

model.get_pretrained('attribute-predictor')

The attribute classifier takes in the face images in FFHQ format.

After loading the Anycost generator, we can run it at a wide range of computational costs. For example:

from model.dynamic_channel import set_uniform_channel_ratio, reset_generator

g = model.get_pretrained('generator', config='anycost-ffhq-config-f')  # anycost uniform
set_uniform_channel_ratio(g, 0.5)  # set channel
g.target_res = 512  # set resolution
out, _ = g(...)  # generate image
reset_generator(g)  # restore the generator

For detailed usage and flexible-channel anycost generator, please refer to notebooks/intro.ipynb.

Model Zoo

Currently, we provide the following pre-trained generators, encoders, and editing directions. We will add more in the future.

For Anycost generators, by default, we refer to the uniform setting.

config name generator encoder edit direction
anycost-ffhq-config-f ✔️ ✔️ ✔️
anycost-ffhq-config-f-flexible ✔️ ✔️ ✔️
anycost-car-config-f ✔️
stylegan2-ffhq-config-f ✔️ ✔️ ✔️

stylegan2-ffhq-config-f refers to the official StyleGAN2 generator converted from the repo.

Datasets

We prepare the FFHQ, CelebA-HQ, and LSUN Car datasets into a directory of images, so that it can be easily used with ImageFolder from torchvision. The dataset layout looks like:

├── PATH_TO_DATASET
│   ├── images
│   │   ├── 00000.png
│   │   ├── 00001.png
│   │   ├── ...

Due to the copyright issue, you need to download the dataset from official site and process them accordingly.

Evaluation

We provide the code to evaluate some metrics presented in the paper. Some of the code is written with horovod to support distributed evaluation and reduce the cost of inter-GPU communication, which greatly improves the speed. Check its website for a proper installation.

Fre ́chet Inception Distance (FID)

Before evaluating the FIDs, you need to compute the inception features of the real images using scripts like:

python tools/calc_inception.py \
    --resolution 1024 --batch_size 64 -j 16 --n_sample 50000 \
    --save_name assets/inceptions/inception_ffhq_res1024_50k.pkl \
    PATH_TO_FFHQ

or you can download the pre-computed inceptions from here and put it under assets/inceptions.

Then, you can evaluate the FIDs by running:

horovodrun -np N_GPU \
    python metrics/fid.py \
    --config anycost-ffhq-config-f \
    --batch_size 16 --n_sample 50000 \
    --inception assets/inceptions/inception_ffhq_res1024_50k.pkl
    # --channel_ratio 0.5 --target_res 512  # optionally using a smaller resolution/channel

Perceptual Path Lenght (PPL)

Similary, evaluting the PPL with:

horovodrun -np N_GPU \
    python metrics/ppl.py \
    --config anycost-ffhq-config-f

Attribute Consistency

Evaluating the attribute consistency by running:

horovodrun -np N_GPU \
    python metrics/attribute_consistency.py \
    --config anycost-ffhq-config-f \
    --channel_ratio 0.5 --target_res 512  # config for the sub-generator; necessary

Encoder Evaluation

To evaluate the performance of the encoder, run:

python metrics/eval_encoder.py \
    --config anycost-ffhq-config-f \
    --data_path PATH_TO_CELEBA_HQ

Training

The training code will be updated shortly.

Citation

If you use this code for your research, please cite our paper.

@inproceedings{lin2021anycost,
  author    = {Lin, Ji and Zhang, Richard and Ganz, Frieder and Han, Song and Zhu, Jun-Yan},
  title     = {Anycost GANs for Interactive Image Synthesis and Editing},
  booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year      = {2021},
}

Related Projects

GAN Compression | Once for All | iGAN | StyleGAN2

Acknowledgement

We thank Taesung Park, Zhixin Shu, Muyang Li, and Han Cai for the helpful discussion. Part of the work is supported by NSF CAREER Award #1943349, Adobe, Naver Corporation, and MIT-IBM Watson AI Lab.

The codebase is build upon a PyTorch implementation of StyleGAN2: rosinality/stylegan2-pytorch. For editing direction extraction, we refer to InterFaceGAN.

Owner
MIT HAN Lab
Accelerating Deep Learning Computing
MIT HAN Lab
Digan - Official PyTorch implementation of Generating Videos with Dynamics-aware Implicit Generative Adversarial Networks

DIGAN (ICLR 2022) Official PyTorch implementation of "Generating Videos with Dyn

Sihyun Yu 147 Dec 31, 2022
Scalable implementation of Lee / Mykland (2012) and Ait-Sahalia / Jacod (2012) Jump tests for noisy high frequency data

JumpDetectR Name of QuantLet : JumpDetectR Published in : 'To be published as "Jump dynamics in high frequency crypto markets"' Description : 'Scala

LvB 12 Jan 01, 2023
Official implementation of our CVPR2021 paper "OTA: Optimal Transport Assignment for Object Detection" in Pytorch.

OTA: Optimal Transport Assignment for Object Detection This project provides an implementation for our CVPR2021 paper "OTA: Optimal Transport Assignme

217 Jan 03, 2023
FaceVerse: a Fine-grained and Detail-controllable 3D Face Morphable Model from a Hybrid Dataset (CVPR2022)

FaceVerse FaceVerse: a Fine-grained and Detail-controllable 3D Face Morphable Model from a Hybrid Dataset Lizhen Wang, Zhiyuan Chen, Tao Yu, Chenguang

Lizhen Wang 219 Dec 28, 2022
GUI for TOAD-GAN, a PCG-ML algorithm for Token-based Super Mario Bros. Levels.

If you are using this code in your own project, please cite our paper: @inproceedings{awiszus2020toadgan, title={TOAD-GAN: Coherent Style Level Gene

Maren A. 13 Dec 14, 2022
Codes for NAACL 2021 Paper "Unsupervised Multi-hop Question Answering by Question Generation"

Unsupervised-Multi-hop-QA This repository contains code and models for the paper: Unsupervised Multi-hop Question Answering by Question Generation (NA

Liangming Pan 70 Nov 27, 2022
Code repository for paper `Skeleton Merger: an Unsupervised Aligned Keypoint Detector`.

Skeleton Merger Skeleton Merger, an Unsupervised Aligned Keypoint Detector. The paper is available at https://arxiv.org/abs/2103.10814. A map of the r

北海若 48 Nov 14, 2022
NAACL2021 - COIL Contextualized Lexical Retriever

COIL Repo for our NAACL paper, COIL: Revisit Exact Lexical Match in Information Retrieval with Contextualized Inverted List. The code covers learning

Luyu Gao 108 Dec 31, 2022
This is an official implementation for "SimMIM: A Simple Framework for Masked Image Modeling".

SimMIM By Zhenda Xie*, Zheng Zhang*, Yue Cao*, Yutong Lin, Jianmin Bao, Zhuliang Yao, Qi Dai and Han Hu*. This repo is the official implementation of

Microsoft 674 Dec 26, 2022
GraphRNN: Generating Realistic Graphs with Deep Auto-regressive Models

GraphRNN: Generating Realistic Graphs with Deep Auto-regressive Model This repository is the official PyTorch implementation of GraphRNN, a graph gene

Jiaxuan 568 Dec 29, 2022
A curated list of neural network pruning resources.

A curated list of neural network pruning and related resources. Inspired by awesome-deep-vision, awesome-adversarial-machine-learning, awesome-deep-learning-papers and Awesome-NAS.

Yang He 1.7k Jan 09, 2023
Codebase of deep learning models for inferring stability of mRNA molecules

Kaggle OpenVaccine Models Codebase of deep learning models for inferring stability of mRNA molecules, corresponding to the Kaggle Open Vaccine Challen

Eternagame 40 Dec 29, 2022
Picasso: A CUDA-based Library for Deep Learning over 3D Meshes

The Picasso Library is intended for complex real-world applications with large-scale surfaces, while it also performs impressively on the small-scale applications over synthetic shape manifolds. We h

97 Dec 01, 2022
CCP dataset from Clothing Co-Parsing by Joint Image Segmentation and Labeling

Clothing Co-Parsing (CCP) Dataset Clothing Co-Parsing (CCP) dataset is a new clothing database including elaborately annotated clothing items. 2, 098

Wei Yang 434 Dec 24, 2022
A Python library for Deep Probabilistic Modeling

Abstract DeeProb-kit is a Python library that implements deep probabilistic models such as various kinds of Sum-Product Networks, Normalizing Flows an

DeeProb-org 46 Dec 26, 2022
Dataset used in "PlantDoc: A Dataset for Visual Plant Disease Detection" accepted in CODS-COMAD 2020

PlantDoc: A Dataset for Visual Plant Disease Detection This repository contains the Cropped-PlantDoc dataset used for benchmarking classification mode

Pratik Kayal 109 Dec 29, 2022
Pacman-AI - AI project designed by UC Berkeley. Designed reflex and minimax agents for the game Pacman.

Pacman AI Jussi Doherty CAP 4601 - Introduction to Artificial Intelligence - Fall 2020 Python version 3.0+ Source of this project This repo contains a

Jussi Doherty 1 Jan 03, 2022
Multi-View Radar Semantic Segmentation

Multi-View Radar Semantic Segmentation Paper Multi-View Radar Semantic Segmentation, ICCV 2021. Arthur Ouaknine, Alasdair Newson, Patrick Pérez, Flore

valeo.ai 37 Oct 25, 2022
[CVPR 2021] Pytorch implementation of Hijack-GAN: Unintended-Use of Pretrained, Black-Box GANs

Hijack-GAN: Unintended-Use of Pretrained, Black-Box GANs In this work, we propose a framework HijackGAN, which enables non-linear latent space travers

Hui-Po Wang 46 Sep 05, 2022
A PyTorch implementation of "Predict then Propagate: Graph Neural Networks meet Personalized PageRank" (ICLR 2019).

APPNP ⠀ A PyTorch implementation of Predict then Propagate: Graph Neural Networks meet Personalized PageRank (ICLR 2019). Abstract Neural message pass

Benedek Rozemberczki 329 Dec 30, 2022