On Nonlinear Latent Transformations for GAN-based Image Editing - PyTorch implementation

On Nonlinear Latent Transformations for GAN-based Image Editing
Valentin Khrulkov, Leyla Mirvakhabova, Ivan Oseledets, Artem Babenko

Overview

We replace linear shifts commonly used for image editing with a flow of a trainable Neural ODE in the latent space.

w'(t) = NN(w(t); \theta)
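Editing a latent code then amounts to integrating this ODE up to some time t (t = 1 by default in the examples below):

w(t) = w(0) + \int_0^t NN(w(s); \theta) ds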

The RHS of this Neural ODE is trained end-to-end using pre-trained attribute regressors by enforcing

  • change of the desired attribute;
  • invariance of remaining attributes.
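As a rough illustration, such an objective could be sketched in PyTorch as follows. The names flow, generator, regressor, and target_idx, and the role of alpha (mirroring the --alpha flag), are assumptions made for this example; the actual loss is implemented in train_ode.py.

import torch
import torch.nn.functional as F

def rectification_loss(flow, generator, regressor, w, target_idx, alpha=8.0):
    # Attributes of the original images (kept fixed as the reference).
    with torch.no_grad():
        base = regressor(generator(w))

    # Attributes after moving the latents along the learned flow.
    shifted = regressor(generator(flow(w)))

    # Push the target attribute towards the desired change ...
    change = -shifted[:, target_idx].mean()

    # ... while penalizing drift in all remaining attributes.
    keep = torch.ones(base.shape[1], dtype=torch.bool)
    keep[target_idx] = False
    invariance = F.mse_loss(shifted[:, keep], base[:, keep])

    return change + alpha * invariance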

Installation and usage

Data

Data required to use the code is available at this dropbox link (2.5Gb).

Path                 Description
data                 data hosted on Dropbox
  ├ models           pretrained GAN models and attribute regressors
  ├ log              pretrained nonlinear edits (Neural ODEs of depth 1) for a variety of attributes on CUB, FFHQ, Places2
  ├ data_to_rectify  100,000 precomputed pairs (w, R[G[w]]), i.e., style vectors and the corresponding semantic attributes
  ├ configs          parameters of the StyleGAN 2 generators for each dataset (n_mlp, channel_width, etc.)
  └ inverses         precomputed inverses (elements of W-plus) for sample FFHQ images

To download and unpack the data run get_data.sh.

Training

We used torch 1.7 for training; however, the code should work for lower versions as well. An example training script to rectify all the attributes:

CUDA_VISIBLE_DEVICES=0 python train_ode.py --dataset ffhq \
--nb-iter 5000 \
--alpha 8 \
--depth 1

For selected attributes:

CUDA_VISIBLE_DEVICES=0 python train_ode.py --dataset ffhq \
--nb-iter 5000 \
--alpha 8 \
--dir 4 8 15 16 23 32 \
--depth 1

Custom dataset

For training on a custom dataset, you have to provide

  • Generator and attribute regressor weights
  • a dictionary {dataset}_all.pt (stored in data_to_rectify). It has the form {"ws": ws, "labels": labels}, where ws is a torch.Tensor of size N x 512 and labels is a torch.Tensor of size N x D, with D being the number of semantic factors. labels should be constructed by evaluating the corresponding attribute regressor on the synthetic images generator(ws[i]); it is used to sample batches for training. A sketch of assembling this file is given below.
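A minimal sketch of how such a file could be assembled. Here generator, regressor, and the file name mydataset_all.pt are placeholders you must supply; the interfaces follow the generator.style(z) and regressor(generator(w)) conventions used elsewhere in this README.

import torch

device = torch.device("cuda")
generator, regressor = ..., ...       # your pretrained GAN and attribute regressor
N, batch = 100_000, 100

ws, labels = [], []
with torch.no_grad():
    for _ in range(N // batch):
        z = torch.randn(batch, 512, device=device)
        w = generator.style(z)             # style vectors, batch x 512
        attrs = regressor(generator(w))    # semantic attributes, batch x D
        ws.append(w.cpu())
        labels.append(attrs.cpu())

torch.save({"ws": torch.cat(ws), "labels": torch.cat(labels)},
           "data_to_rectify/mydataset_all.pt")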

Visualization

Please see explore.ipynb for example visualizations. lib.utils.py contains a utility wrapper useful for building and loading the Neural ODE models (FlowFactory).

Restoring from checkpoint

import torch
from lib.utils import FlowFactory, LatentFlow
from torchdiffeq import odeint_adjoint as odeint
device = torch.device("cuda")
flow_factory = FlowFactory(dataset="ffhq", device=device)
odeblock = flow_factory._build_odeblock(depth=1)
# depth = -1 corresponds to a constant right hand side (w' = c)
# depth >= 1 corresponds to an MLP with depth layers
odeblock.load_state_dict(...)

# some style vector (generator.style(z))
w0 = ...

# You can directly call odeint
with torch.no_grad():
    odeint(odeblock.odefunc, w0, torch.FloatTensor([0, 1]).to(device))

# Or utilize the wrapper 
flow = LatentFlow(odefunc=odeblock.odefunc, device=device, name="Bald")
flow.flow(w=w0, t=1)

# To flow real images:
w = torch.load("inverses/actors.pt").to(device)
flow.flow(w, t=6, truncate_real=6)
# truncate_real specifies which portion of a W-plus vector to modify
# (e.g., the first 6 out of 14 vectors)

A sample command to generate a movie:

CUDA_VISIBLE_DEVICES=0 python make_movie.py --attribute Bald --dataset ffhq

Examples

FFHQ

Example edits: Bald, Goatee, Wavy_Hair, Arched_Eyebrows, Bangs, Young, Blond_Hair, Chubby

Places2

Example edits: lush, rugged, fog

Citation

Coming soon.

Credits

Owner: Valentin Khrulkov (PhD student)