On Nonlinear Latent Transformations for GAN-based Image Editing - PyTorch implementation

Valentin Khrulkov, Leyla Mirvakhabova, Ivan Oseledets, Artem Babenko

Overview

We replace linear shifts commonly used for image editing with a flow of a trainable Neural ODE in the latent space.

w' = NN(w; \theta)

The RHS of this Neural ODE is trained end-to-end using pre-trained attribute regressors by enforcing

  • change of the desired attribute;
  • invariance of remaining attributes.
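
To make the difference between a linear shift and a latent flow concrete, here is a minimal pure-Python sketch. It integrates w' = f(w) with a fixed-step Euler scheme. The `euler_flow` helper, the toy right-hand sides, and the step count are illustrative assumptions and not the repository's code, which relies on torchdiffeq and a trained network. With a constant right-hand side the flow reduces to the usual linear shift w + t * c, while a state-dependent right-hand side moves each point along its own curve.

```python
import math

def euler_flow(rhs, w0, t=1.0, steps=100):
    """Integrate w' = rhs(w) from time 0 to t with fixed-step Euler."""
    w = list(w0)
    h = t / steps
    for _ in range(steps):
        dw = rhs(w)
        w = [wi + h * di for wi, di in zip(w, dw)]
    return w

# Constant right-hand side: the flow is a plain linear shift w0 + t * c.
c = [0.5, -1.0]

def constant_rhs(w):
    return c

# Toy nonlinear right-hand side (a stand-in for a trained MLP):
# the velocity depends on the current position.
def nonlinear_rhs(w):
    return [math.tanh(w[1]), -math.tanh(w[0])]

w0 = [1.0, 2.0]
shifted = euler_flow(constant_rhs, w0, t=1.0)
curved = euler_flow(nonlinear_rhs, w0, t=1.0)
print("linear shift:  ", shifted)
print("nonlinear flow:", curved)
```

In the constant case every starting point is displaced by the same vector; in the nonlinear case the displacement depends on where the point starts, which is what lets a flow adapt the edit to each latent code.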

Installation and usage

Data

The data required to run the code is available at this Dropbox link (2.5 GB).

Path                  Description
data                  data hosted on Dropbox
  ├ models            pretrained GAN models and attribute regressors
  ├ log               pretrained nonlinear edits (Neural ODEs of depth 1) for a variety of attributes on CUB, FFHQ, Places2
  ├ data_to_rectify   100,000 precomputed pairs (w, R[G[w]]), i.e., style vectors and corresponding semantic attributes
  ├ configs           parameters of StyleGAN2 generators for each dataset (n_mlp, channel_width, etc.)
  └ inverses          precomputed inverses (elements of W-plus) for sample FFHQ images

To download and unpack the data run get_data.sh.

Training

We used torch 1.7 for training; however, the code should also work with earlier versions. An example training command that rectifies all attributes:

CUDA_VISIBLE_DEVICES=0 python train_ode.py --dataset ffhq \
--nb-iter 5000 \
--alpha 8 \
--depth 1

For selected attributes:

CUDA_VISIBLE_DEVICES=0 python train_ode.py --dataset ffhq \
--nb-iter 5000 \
--alpha 8 \
--dir 4 8 15 16 23 32 \
--depth 1

Custom dataset

To train on a custom dataset, you have to provide

  • generator and attribute regressor weights;
  • a dictionary {dataset}_all.pt (stored in data_to_rectify). It has the form {"ws": ws, "labels": labels}, where ws is a torch.Tensor of size N x 512 and labels is a torch.Tensor of size N x D, with D being the number of semantic factors. labels should be constructed by evaluating the corresponding attribute regressor on the synthetic images generator(ws[i]). The dictionary is used to sample batches for training.
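
A minimal sketch of how such a dictionary could be assembled is shown below. The function `build_rectify_dict`, the three callables `mapping`, `synthesis`, and `regressor`, and the file name `mydataset_all.pt` are hypothetical stand-ins chosen for illustration; the actual generator and regressor interfaces in this repository may differ. The toy `nn.Linear` models only serve to make the sketch run without the pretrained weights.

```python
import os
import tempfile

import torch
from torch import nn

@torch.no_grad()
def build_rectify_dict(mapping, synthesis, regressor, n_samples, z_dim=512, batch_size=64):
    """Collect style vectors ws (N x 512) and regressor outputs labels (N x D)."""
    ws, labels = [], []
    for start in range(0, n_samples, batch_size):
        size = min(batch_size, n_samples - start)
        z = torch.randn(size, z_dim)
        w = mapping(z)                   # style vectors, shape (size, 512)
        images = synthesis(w)            # synthetic images generated from w
        labels.append(regressor(images).cpu())
        ws.append(w.cpu())
    return {"ws": torch.cat(ws), "labels": torch.cat(labels)}

# Toy stand-ins (assumptions) so the sketch runs without pretrained models.
torch.manual_seed(0)
mapping = nn.Linear(512, 512)
synthesis = nn.Sequential(nn.Linear(512, 3 * 8 * 8), nn.Unflatten(1, (3, 8, 8)))
regressor = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 5))  # D = 5 attributes

data = build_rectify_dict(mapping, synthesis, regressor, n_samples=100)

# Saved to a temporary directory here; the README expects data_to_rectify/{dataset}_all.pt.
out_path = os.path.join(tempfile.mkdtemp(), "mydataset_all.pt")
torch.save(data, out_path)
print(data["ws"].shape, data["labels"].shape)
```

Row i of labels must correspond to row i of ws, so both lists are appended in the same loop iteration.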

Visualization

Please see explore.ipynb for example visualizations. lib/utils.py contains FlowFactory, a utility wrapper for building and loading the Neural ODE models.

Restoring from checkpoint

import torch
from lib.utils import FlowFactory, LatentFlow
from torchdiffeq import odeint_adjoint as odeint
device = torch.device("cuda")
flow_factory = FlowFactory(dataset="ffhq", device=device)
odeblock = flow_factory._build_odeblock(depth=1)
# depth = -1 corresponds to a constant right-hand side (w' = c)
# depth >= 1 corresponds to an MLP with `depth` layers
odeblock.load_state_dict(...)

# some style vector (generator.style(z))
w0 = ...

# You can directly call odeint
with torch.no_grad():
    odeint(odeblock.odefunc, w0, torch.FloatTensor([0, 1]).to(device))

# Or utilize the wrapper 
flow = LatentFlow(odefunc=odeblock.odefunc, device=device, name="Bald")
flow.flow(w=w0, t=1)

# To flow real images:
w = torch.load("inverses/actors.pt").to(device)
flow.flow(w, t=6, truncate_real=6)
# truncate_real specifies which portion of a W-plus vector to modify
# (e.g., the first 6 out of 14 vectors)
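
The effect of modifying only part of a W-plus code can be illustrated with a small standalone sketch. It is one plausible reading of the truncate_real comment above and is not taken from the repository: the helper `edit_first_layers` is hypothetical, and the constant offset stands in for the output of the trained flow.

```python
import torch

def edit_first_layers(w_plus, edited, truncate_real):
    """Take the first `truncate_real` style vectors from `edited`, keep the rest from `w_plus`."""
    out = w_plus.clone()
    out[:, :truncate_real] = edited[:, :truncate_real]
    return out

torch.manual_seed(0)
w_plus = torch.randn(2, 14, 512)   # two inverted images, 14 style vectors each
edited = w_plus + 0.1              # stand-in for the flowed style vectors (assumption)
mixed = edit_first_layers(w_plus, edited, truncate_real=6)
print(mixed.shape)
```

Under this reading, the first 6 style vectors carry the edit while the remaining 8 stay equal to the inverted code.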

A sample command to generate a movie:

CUDA_VISIBLE_DEVICES=0 python make_movie.py --attribute Bald --dataset ffhq

Examples

FFHQ

Edited attributes shown: Bald, Goatee, Wavy_Hair, Arched_Eyebrows, Bangs, Young, Blond_Hair, Chubby

Places2

Edited attributes shown: lush, rugged, fog

Citation

Coming soon.

Credits
