Collapse by Conditioning: Training Class-conditional GANs with Limited Data


Mohamad Shahbazi, Martin Danelljan, Danda P. Paudel, Luc Van Gool
Paper: https://openreview.net/forum?id=7TZeCsNOUB_

Teaser image

Abstract

Class-conditioning offers a direct means of controlling a Generative Adversarial Network (GAN) based on a discrete input variable. While necessary in many applications, the additional information provided by the class labels could even be expected to benefit the training of the GAN itself. Contrary to this belief, we observe that class-conditioning causes mode collapse in limited data settings, where unconditional learning leads to satisfactory generative ability. Motivated by this observation, we propose a training strategy for conditional GANs (cGANs) that effectively prevents the observed mode-collapse by leveraging unconditional learning. Our training strategy starts with an unconditional GAN and gradually injects conditional information into the generator and the objective function. The proposed method for training cGANs with limited data results not only in stable training but also in generating high-quality images, thanks to the early-stage exploitation of the shared information across classes. We analyze the aforementioned mode collapse problem in comprehensive experiments on four datasets. Our approach demonstrates outstanding results compared with state-of-the-art methods and established baselines.

Overview

  1. Requirements
  2. Getting Started
  3. Dataset Preparation
  4. Training
  5. Evaluation and Logging
  6. Contact
  7. How to Cite

Requirements

  • Linux and Windows are supported, but Linux is recommended for performance and compatibility reasons.
  • For a batch size of 64, we used 4 NVIDIA GeForce RTX 2080 Ti GPUs (11 GiB of memory each).
  • 64-bit Python 3.7 and PyTorch 1.7.1. See https://pytorch.org/ for PyTorch installation instructions.
  • CUDA toolkit 11.0 or later. Use at least version 11.1 if running on an RTX 3090. (Why is a separate CUDA toolkit installation required? See the comments on this GitHub issue.)
  • Python libraries: pip install wandb click requests tqdm pyspng ninja imageio-ffmpeg==0.4.3.
  • This project uses Weights & Biases (W&B) for visualization and logging. In addition to installing W&B (included in the command above), you need to create a free account on the W&B website. Then, log in to your account from the command line using wandb login (you will be prompted for your login information), as shown in the snippet after this list.
  • Docker users: use the Dockerfile provided by StyleGAN2+ADA (./Dockerfile) to build an image with the required library dependencies.
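
For reference, the W&B setup amounts to two shell commands (a minimal sketch; the login prompt asks for the API key shown in your W&B account settings):

pip install wandb click requests tqdm pyspng ninja imageio-ffmpeg==0.4.3
wandb login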

The code relies heavily on custom PyTorch extensions that are compiled on the fly using NVCC. On Windows, the compilation requires Microsoft Visual Studio. We recommend installing Visual Studio Community Edition and adding it to PATH using "C:\Program Files (x86)\Microsoft Visual Studio\<VERSION>\Community\VC\Auxiliary\Build\vcvars64.bat".

Getting Started

The code for this project is based on the PyTorch implementation of StyleGAN2+ADA. Please first read the instructions provided for StyleGAN2+ADA. Here, we mainly provide the additional details required to use our method.

For a quick start, we have provided example scripts in ./scripts, as well as an example dataset (a tar file containing a subset of the ImageNet Carnivores dataset used in the paper) in ./datasets. Note that the scripts do not include commands for activating Python environments. The dataset and output directory paths in the scripts can be modified to match your own setup.

The following command runs a script that extracts the tar file and creates a ZIP file in the same directory.

bash scripts/prepare_dataset_ImageNetCarnivores_20_100.sh

The ZIP file is later used for training and evaluation. For more details on how to use your custom datasets, see Dataset Preparation.

The following command runs a script that trains the model using our method with the default hyper-parameters:

bash scripts/train_ImageNetCarnivores_20_100.sh

For more details on the training options, see Training.

To calculate the evaluation metrics on a pretrained model, use the following command:

bash scripts/inference_metrics_ImageNetCarnivores_20_100.sh

Outputs from the training and inference commands are placed under out/ by default, controlled by --outdir. Downloaded network pickles are cached under $HOME/.cache/dnnlib, which can be overridden by setting the DNNLIB_CACHE_DIR environment variable. The default PyTorch extension build directory is $HOME/.cache/torch_extensions, which can be overridden by setting TORCH_EXTENSIONS_DIR.
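
For example, both caches can be redirected before launching a run (a sketch; the paths are illustrative):

export DNNLIB_CACHE_DIR=/path/to/cache/dnnlib
export TORCH_EXTENSIONS_DIR=/path/to/cache/torch_extensions
bash scripts/train_ImageNetCarnivores_20_100.sh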

Dataset Preparation

Datasets are stored as uncompressed ZIP archives containing uncompressed PNG files and a metadata file dataset.json for labels.

Custom datasets can be created from a folder containing images (with each sub-directory containing the images of one class, in the case of multi-class datasets) using dataset_tool.py. Here is an example of how to convert a dataset folder to the desired ZIP file:

python dataset_tool.py --source=datasets/ImageNet_Carnivores_20_100 --dest=datasets/ImageNet_Carnivores_20_100.zip --transform=center-crop --width=128 --height=128

The above example reads the images from the folder provided by --source, resizes them to the dimensions given by --width and --height, and applies the center-crop transform. The resulting images, along with the metadata (label information), are stored in the ZIP file specified by --dest. See python dataset_tool.py --help for more information, and the StyleGAN2+ADA instructions for details on specific datasets and legacy TFRecords datasets.

The created ZIP file can be passed to the training and evaluation code using the --data argument.
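
To sanity-check a prepared archive, you can print the beginning of its label metadata directly (a sketch; the path refers to the example dataset above):

unzip -p datasets/ImageNet_Carnivores_20_100.zip dataset.json | head -c 300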

Training

Training new networks can be done using train.py. To train using our method, set the argument --cond to 1 so that training is conditional. In addition, the start and end of the transition from unconditional to conditional training should be specified using the arguments --t_start_kimg and --t_end_kimg. Here is an example training command:

python train.py --outdir=./out/ \
--data=datasets/ImageNet_Carnivores_20_100.zip \
--cond=1 --t_start_kimg=2000  --t_end_kimg=4000  \
--gpus=4 \
--cfg=auto --mirror=1 \
--metrics=fid50k_full,kid50k_full

See the StyleGAN2+ADA instructions for more details on the arguments, configurations, and hyper-parameters. Please refer to python train.py --help for the full list of arguments.

Note: Our code can currently be used only for unconditional or transitional training. For the original conditional training, you can use the original StyleGAN2+ADA implementation.
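
For reference, an unconditional baseline can be launched by disabling conditioning (a sketch; it assumes --cond=0 selects the unconditional mode, matching the StyleGAN2+ADA default, and omits the transition arguments):

python train.py --outdir=./out/ \
--data=datasets/ImageNet_Carnivores_20_100.zip \
--cond=0 --gpus=4 --cfg=auto --mirror=1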

Evaluation and Logging

By default, train.py automatically computes FID for each network pickle exported during training. More metrics can be added via the --metrics argument (as a comma-separated list). To monitor the training, you can inspect the log.txt and JSONL files (e.g., metric-fid50k_full.jsonl for FID) saved in the output directory. Alternatively, you can inspect the WandB or TensorBoard logs (by default, WandB creates the logs under the project name "Transitional-cGAN", which can be accessed in your account on the website).
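
For example, the FID log can be followed as snapshots are evaluated (a sketch; the run directory name is illustrative):

tail -f out/00000-ImageNet_Carnivores_20_100-cond-auto4/metric-fid50k_full.jsonl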

When desired, the automatic computation can be disabled with --metrics=none to speed up the training slightly (3%–9%). Additional metrics can also be computed after the training:

# Previous training run: look up options automatically, save result to JSONL file.
python calc_metrics.py --metrics=pr50k3_full \
    --network=~/training-runs/00000-ffhq10k-res64-auto1/network-snapshot-000000.pkl

# Pre-trained network pickle: specify dataset explicitly, print result to stdout.
python calc_metrics.py --metrics=fid50k_full --data=~/datasets/ffhq.zip --mirror=1 \
    --network=https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/pretrained/ffhq.pkl

The first example looks up the training configuration and performs the same operation as if --metrics=pr50k3_full had been specified during training. The second example downloads a pre-trained network pickle, in which case the values of --mirror and --data must be specified explicitly.

See StyleGAN2+ADA instructions for more details on the available metrics.

Contact

For any questions, suggestions, or issues with the code, please contact Mohamad Shahbazi at [email protected]

How to Cite

@inproceedings{
  shahbazi2022collapse,
  title={Collapse by Conditioning: Training Class-conditional {GAN}s with Limited Data},
  author={Shahbazi, Mohamad and Danelljan, Martin and Pani Paudel, Danda and Van Gool, Luc},
  booktitle={The Tenth International Conference on Learning Representations},
  year={2022},
  url={https://openreview.net/forum?id=7TZeCsNOUB_}
}