Pytorch Implementation of "Diagonal Attention and Style-based GAN for Content-Style disentanglement in image generation and translation" (ICCV 2021)

Overview

DiagonalGAN

Official Pytorch Implementation of "Diagonal Attention and Style-based GAN for Content-Style Disentanglement in Image Generation and Translation" (ICCV 2021)

Arxiv : link CVF : link

Contact

If you have any question,

e-mail : [email protected]

Abstract

One of the important research topics in image generative models is to disentangle the spatial contents and styles for their separate control. Although StyleGAN can generate content feature vectors from random noises, the resulting spatial content control is primarily intended for minor spatial variations, and the disentanglement of global content and styles is by no means complete. Inspired by a mathematical understanding of normalization and attention, here we present a novel hierarchical adaptive Diagonal spatial ATtention (DAT) layers to separately manipulate the spatial contents from styles in a hierarchical manner. Using DAT and AdaIN, our method enables coarse-to-fine level disentanglement of spatial contents and styles. In addition, our generator can be easily integrated into the GAN inversion framework so that the content and style of translated images from multi-domain image translation tasks can be flexibly controlled. By using various datasets, we confirm that the proposed method not only outperforms the existing models in disentanglement scores, but also provides more flexible control over spatial features in the generated images.

Models9

Environment Settings

Python 3.6.7 +

Pytorch 1.5.0 +

Dataset

For faster training, we recommend .jpg file format.

Download Link: CelebA-HQ / AFHQ

Unzip the files and put the folder into the data directory (./data/Celeb/data1024 , ./data/afhq)

To process the data for multidomain Diagonal GAN, run

./data/Celeb/Celeb_proc.py 

After download the CelebA-HQ dataset to save males / females images in different folders.

We randomly selected 1000 images as validation set for each domain (1000 males / 1000 females).

Save validation files into ./data/Celeb/val/males , ./data/Celeb/val/females

Train

Train Basic Diagonal GAN

For full-resolution CelebA-HQ training,

python train.py --datapath ./data/Celeb/data1024 --sched --max_size 1024 --loss r1

For full-resolution AFHQ training,

python train.py --datapath ./data/afhq --sched --max_size 512 --loss r1

Train Multidomain Diagonal GAN

For training multidomain (Males/ Females) models, run

python train_multidomain.py --datapath ./data/Celeb/mult --sched --max_size 256

Train IDInvert Encoders on pre-trained Multidomain Diagonal GAN

For training IDInvert on pre-trained model,

python train_idinvert.py --ckpt $MODEL_PATH$ 

or you can download the pre-trained Multidomain model.

Save the model in ./checkpoint/train_mult/CelebAHQ_mult.model

and set $MODEL_PATH$ as above.

Additional latent code optimization ( for inference )

To further optimize the latent codes,

python train_idinvert_opt.py --ckpt $MODEL_PATH$ --enc_ckpt $ENC_MODEL_PATH$

MODEL_PATH is pre-trained multidomain model directory, and

ENC_MODEL_PATH is IDInvert encoder model directory.

You can download the pre-trained IDInvert encoder models.

We also provide optimized latent codes.

Pre-trained model Download

Pre-trained Diagonal GAN on 1024x1024 CelebA-HQ : Link save to ./checkpoint/train_basic

Pre-trained Diagonal GAN on 512x512 AFHQ : Link save to ./checkpoint/train_basic

Pre-trained Multidomain Diagonal GAN on 256x256 CelebA-HQ : Link save to ./checkpoint/train_mult

Pre-trained IDInvert Encoders on 256x256 CelebA-HQ : Link save to ./checkpoint/train_idinvert

Optimized latent codes : Link save to ./codes

Generate Images

To generate the images from the pre-trained model,

python generate.py --mode $MODE$ --domain $DOM$ --target_layer $TARGET$

for $MODE$, there is three choices (sample , mixing, interpolation).

using 'sample' just sample random samples,

for 'mixing', generate images with random code on target layer $TARGET$

for 'interpolate', generate with random interpolation on target layer $TARGET$

also, we can choose style or content with setting $DOM$ with 'style' or 'content'

Generate Images on Inverted model

To generate the images from the pre-trained IDInvert,

python generate_idinvert.py --mode $MODE$ --domain $DOM$ --target_layer $TARGET$

for $MODE$, there is three choices (sample , mixing, encode).

using 'sample' just sample random samples,

for 'mixing', generate images with random code on target layer $TARGET$

for 'encode', generate auto-encoder reconstructions

we can choose style or content with setting $DOM$ with 'style' or 'content'

To use additional optimized latent codes, activate --use_code

Examples

python generate.py --mode sample 

03_content_sample

8x8 resolution content

python generate.py --mode mixing --domain content --target_layer 2 3

03_content_mixing

High resolution style

python generate.py --mode mixing --domain style --target_layer 14 15 16 17

02_style_mixing

Use CLIP to represent video for Retrieval Task

A Straightforward Framework For Video Retrieval Using CLIP This repository contains the basic code for feature extraction and replication of results.

Jesus Andres Portillo Quintero 54 Dec 22, 2022
End-to-end machine learning project for rices detection

Basmatinet Welcome to this project folks ! Whether you like it or not this project is all about riiiiice or riz in french. It is also about Deep Learn

Béranger 47 Jun 18, 2022
The official implementation for "FQ-ViT: Fully Quantized Vision Transformer without Retraining".

FQ-ViT [arXiv] This repo contains the official implementation of "FQ-ViT: Fully Quantized Vision Transformer without Retraining". Table of Contents In

132 Jan 08, 2023
DeepMoCap: Deep Optical Motion Capture using multiple Depth Sensors and Retro-reflectors

DeepMoCap: Deep Optical Motion Capture using multiple Depth Sensors and Retro-reflectors By Anargyros Chatzitofis, Dimitris Zarpalas, Stefanos Kollias

tofis 24 Oct 08, 2022
QA-GNN: Question Answering using Language Models and Knowledge Graphs

QA-GNN: Question Answering using Language Models and Knowledge Graphs This repo provides the source code & data of our paper: QA-GNN: Reasoning with L

Michihiro Yasunaga 434 Jan 04, 2023
Assessing the Influence of Models on the Performance of Reinforcement Learning Algorithms applied on Continuous Control Tasks

Assessing the Influence of Models on the Performance of Reinforcement Learning Algorithms applied on Continuous Control Tasks This is the master thesi

Giacomo Arcieri 1 Mar 21, 2022
Graph WaveNet apdapted for brain connectivity analysis.

Graph WaveNet for brain network analysis This is the implementation of the Graph WaveNet model used in our manuscript: S. Wein , A. Schüller, A. M. To

4 Dec 17, 2022
ML-based medical imaging using Azure

Disclaimer This code is provided for research and development use only. This code is not intended for use in clinical decision-making or for any other

Microsoft Azure 68 Dec 23, 2022
A Streamlit demo demonstrating the Deep Dream technique. Adapted from the TensorFlow Deep Dream tutorial.

Streamlit Demo: Deep Dream A Streamlit demo demonstrating the Deep Dream technique. Adapted from the TensorFlow Deep Dream tutorial How to run this de

Streamlit 11 Dec 12, 2022
The official repository for paper ''Domain Generalization for Vision-based Driving Trajectory Generation'' submitted to ICRA 2022

DG-TrajGen The official repository for paper ''Domain Generalization for Vision-based Driving Trajectory Generation'' submitted to ICRA 2022. Our Meth

Wang 25 Sep 26, 2022
Code for weakly supervised segmentation of a single class

SingleClassRL Implementation of weak single object segmentation from paper "Regularized Loss for Weakly Supervised Single Class Semantic Segmentation"

16 Nov 14, 2022
Implementation of SSMF: Shifting Seasonal Matrix Factorization

SSMF Implementation of SSMF: Shifting Seasonal Matrix Factorization, Koki Kawabata, Siddharth Bhatia, Rui Liu, Mohit Wadhwa, Bryan Hooi. NeurIPS, 2021

Koki Kawabata 9 Jun 10, 2022
Decentralized Reinforcment Learning: Global Decision-Making via Local Economic Transactions (ICML 2020)

Decentralized Reinforcement Learning This is the code complementing the paper Decentralized Reinforcment Learning: Global Decision-Making via Local Ec

40 Oct 30, 2022
Code used to generate the results appearing in "Train longer, generalize better: closing the generalization gap in large batch training of neural networks"

Train longer, generalize better - Big batch training This is a code repository used to generate the results appearing in "Train longer, generalize bet

Elad Hoffer 145 Sep 16, 2022
A framework for GPU based high-performance medical image processing and visualization

FAST is an open-source cross-platform framework with the main goal of making it easier to do high-performance processing and visualization of medical images on heterogeneous systems utilizing both mu

Erik Smistad 315 Dec 30, 2022
Franka Emika Panda manipulator kinematics&dynamics simulation

pybullet_sim_panda Pybullet simulation environment for Franka Emika Panda Dependency pybullet, numpy, spatial_math_mini Simple example (please check s

0 Jan 20, 2022
gACSON software for visualization, processing and analysis of three-dimensional electron microscopy images

gACSON gACSON software is to visualize, segment, and analyze the morphology of neurons in three-dimensional electron microscopy images. If you use any

Andrea Behanova 2 May 31, 2022
CTF challenges from redpwnCTF 2021

redpwnCTF 2021 Challenges This repository contains challenges from redpwnCTF 2021 in the rCDS format; challenge information is in the challenge.yaml f

redpwn 27 Dec 07, 2022
This repository contains the code for the paper in EMNLP 2021: "HRKD: Hierarchical Relational Knowledge Distillation for Cross-domain Language Model Compression".

HRKD: Hierarchical Relational Knowledge Distillation for Cross-domain Language Model Compression This repository contains the code for the paper in EM

Chenhe Dong 2 Mar 24, 2022
DEMix Layers for Modular Language Modeling

DEMix This repository contains modeling utilities for "DEMix Layers: Disentangling Domains for Modular Language Modeling" (Gururangan et. al, 2021). T

Suchin 43 Nov 11, 2022