NICE-GAN — Official PyTorch Implementation Reusing Discriminators for Encoding: Towards Unsupervised Image-to-Image Translation

Last update: Nov 25, 2022

Overview

Paper

Reusing Discriminators for Encoding: Towards Unsupervised Image-to-Image Translation

Abstract: Unsupervised image-to-image translation is a central task in computer vision. Current translation frameworks will abandon the discriminator once the training process is completed. This paper contends a novel role of the discriminator by reusing it for encoding the images of the target domain. The proposed architecture, termed as NICE-GAN, exhibits two advantageous patterns over previous approaches: First, it is more compact since no independent encoding component is required; Second, this plug-in encoder is directly trained by the adversary loss, making it more informative and trained more effectively if a multi-scale discriminator is applied. The main issue in NICE-GAN is the coupling of translation with discrimination along the encoder, which could incur training inconsistency when we play the min-max game via GAN. To tackle this issue, we develop a decoupled training strategy by which the encoder is only trained when maximizing the adversary loss while keeping frozen otherwise. Extensive experiments on four popular benchmarks demonstrate the superior performance of NICE-GAN over state-of-the-art methods in terms of FID, KID, and also human preference. Comprehensive ablation studies are also carried out to isolate the validity of each proposed component.
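The decoupled training strategy described in the abstract can be illustrated with a toy PyTorch loop. This is only a minimal sketch under stated assumptions, not the code in this repository: the tiny networks, the least-squares GAN loss, the learning rates, and every name (`make_discriminator`, `translate`, `gan_loss`, `train_step`) are hypothetical. It shows the one idea that matters: the encoder lives inside the discriminator, receives updates only in the discriminator step, and is treated as frozen when the generator is updated.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

def make_discriminator():
    """Return (encoder, classifier); the discriminator is classifier(encoder(x))."""
    encoder = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU())
    classifier = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 1))
    return encoder, classifier

# Hypothetical toy modules; real translation networks are far larger.
enc_a, cls_a = make_discriminator()      # discriminator for domain A
enc_b, cls_b = make_discriminator()      # discriminator for domain B
dec_ab = nn.Conv2d(8, 3, 3, padding=1)   # generator A -> B, decodes enc_a features
dec_ba = nn.Conv2d(8, 3, 3, padding=1)   # generator B -> A, decodes enc_b features

# The encoders belong to the discriminator optimizer only.
d_modules = (enc_a, cls_a, enc_b, cls_b)
d_opt = torch.optim.Adam([p for m in d_modules for p in m.parameters()], lr=1e-4)
g_opt = torch.optim.Adam(list(dec_ab.parameters()) + list(dec_ba.parameters()), lr=1e-4)
mse = nn.MSELoss()  # least-squares GAN objective (an assumption of this sketch)

def translate(encoder, decoder, x):
    """Encode with the reused discriminator encoder, kept frozen, then decode."""
    with torch.no_grad():  # no gradient reaches the encoder through translation
        z = encoder(x)
    return decoder(z)

def gan_loss(logits, is_real):
    target = torch.ones_like(logits) if is_real else torch.zeros_like(logits)
    return mse(logits, target)

def train_step(real_a, real_b):
    # Discriminator step: the only place where the encoders are updated.
    fake_b = translate(enc_a, dec_ab, real_a).detach()
    fake_a = translate(enc_b, dec_ba, real_b).detach()
    d_loss = (gan_loss(cls_a(enc_a(real_a)), True) + gan_loss(cls_a(enc_a(fake_a)), False)
              + gan_loss(cls_b(enc_b(real_b)), True) + gan_loss(cls_b(enc_b(fake_b)), False))
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()

    # Generator step: gradients pass through the discriminators to the fakes,
    # but g_opt holds only decoder parameters, so the encoders stay fixed.
    fake_b = translate(enc_a, dec_ab, real_a)
    fake_a = translate(enc_b, dec_ba, real_b)
    g_loss = gan_loss(cls_b(enc_b(fake_b)), True) + gan_loss(cls_a(enc_a(fake_a)), True)
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()
    return d_loss.item(), g_loss.item()
```

Calling `train_step(torch.randn(2, 3, 16, 16), torch.randn(2, 3, 16, 16))` runs one discriminator update followed by one generator update. Gradients left on the discriminators by the generator step are cleared by `d_opt.zero_grad()` at the start of the next call.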

Author

Runfa Chen, Wenbing Huang, Binghui Huang, Fuchun Sun, Bin Fang
Tsinghua Robot Learning Lab

Citation

If you find this code useful for your research, please cite our paper:

@InProceedings{Chen_2020_CVPR,
author = {Chen, Runfa and Huang, Wenbing and Huang, Binghui and Sun, Fuchun and Fang, Bin},
title = {Reusing Discriminators for Encoding: Towards Unsupervised Image-to-Image Translation},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2020}
}

Usage

├── dataset
   └── YOUR_DATASET_NAME
       ├── trainA
           ├── xxx.jpg (name, format doesn't matter)
           ├── yyy.png
           └── ...
       ├── trainB
           ├── zzz.jpg
           ├── www.png
           └── ...
       ├── testA
           ├── aaa.jpg 
           ├── bbb.png
           └── ...
       └── testB
           ├── ccc.jpg 
           ├── ddd.png
           └── ...
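Before training, it can help to confirm that a dataset folder matches the layout above. The following is a minimal sketch and not part of this repository: the helper name `count_images` and the default root `dataset` are assumptions. Because file names and formats do not matter, it counts every regular file in each split folder.

```python
from pathlib import Path

# The four split folders shown in the layout above.
SPLITS = ("trainA", "trainB", "testA", "testB")

def count_images(root, dataset_name):
    """Return {split: file count}; raise if a split folder is missing."""
    base = Path(root) / dataset_name
    counts = {}
    for split in SPLITS:
        folder = base / split
        if not folder.is_dir():
            raise FileNotFoundError(f"missing folder: {folder}")
        # Names and formats are free-form, so count all regular files.
        counts[split] = sum(1 for p in folder.iterdir() if p.is_file())
    return counts

if __name__ == "__main__":
    try:
        print(count_images("dataset", "YOUR_DATASET_NAME"))
    except FileNotFoundError as err:
        print(err)
```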

Prerequisites

Python 3.6.9
PyTorch 1.1.0 and torchvision (https://pytorch.org/)
tensorboardX
TensorFlow (for TensorBoard usage)
CUDA 10.0.130, cuDNN 7.3, and Ubuntu 16.04

Train

> python main.py --dataset cat2dog

If GPU memory is insufficient, set --light to True.

Restoring from the previous checkpoint

> python main.py --dataset cat2dog --resume True

Previous checkpoint: dataset_params_latest.pt
If GPU memory is insufficient, set --light to True.
Trained models (set --light to True): our previous checkpoint on cat2dog can be downloaded from https://drive.google.com/open?id=1gIA5yhkY71zasY_lXheNjYvphqwhN0Os
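The commands above pass boolean options with an explicit value (`--light True`, `--resume True`) rather than as bare switches. A common way to support that style with argparse is a small string-to-bool converter. The sketch below is an illustration only: `str2bool`, `build_parser`, and the default values are assumptions, and main.py may parse its flags differently.

```python
import argparse

def str2bool(value):
    """Convert 'True'/'False'-style strings to bool for argparse."""
    lowered = value.lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean value, got {value!r}")

def build_parser():
    # Flag names follow the commands in this README; defaults are assumed.
    parser = argparse.ArgumentParser(description="flag-parsing sketch")
    parser.add_argument("--dataset", type=str, default="YOUR_DATASET_NAME")
    parser.add_argument("--light", type=str2bool, default=False)
    parser.add_argument("--resume", type=str2bool, default=False)
    return parser

if __name__ == "__main__":
    print(build_parser().parse_args())
```

With a converter like this, `--light True` and `--light False` are both valid, whereas a bare `--light` with no value is rejected by argparse.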

Test

> python main.py --dataset cat2dog --phase test

Metric

> python fid_kid.py testA fakeA --mmd-var

To run on a GPU, set --gpu to the GPU index, for example --gpu 0.
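For readers unfamiliar with KID, it is an unbiased estimate of the squared maximum mean discrepancy (MMD) between real and generated image features under the polynomial kernel k(x, y) = (x·y / d + 1)^3. The sketch below illustrates that estimator in plain Python on small feature vectors. It is not the code in fid_kid.py, it omits feature extraction, and the names `poly_kernel` and `kid_unbiased` are hypothetical.

```python
import random

def poly_kernel(x, y, degree=3, coef0=1.0):
    """Polynomial kernel (x.y / d + coef0) ** degree, with d the feature size."""
    dot = sum(a * b for a, b in zip(x, y))
    return (dot / len(x) + coef0) ** degree

def kid_unbiased(real_feats, fake_feats):
    """Unbiased squared-MMD estimate between two lists of feature vectors."""
    m, n = len(real_feats), len(fake_feats)
    # Within-set averages skip the diagonal (i == j) to keep the estimate unbiased.
    k_rr = sum(poly_kernel(real_feats[i], real_feats[j])
               for i in range(m) for j in range(m) if i != j) / (m * (m - 1))
    k_ff = sum(poly_kernel(fake_feats[i], fake_feats[j])
               for i in range(n) for j in range(n) if i != j) / (n * (n - 1))
    # The cross term averages over every real/fake pair.
    k_rf = sum(poly_kernel(r, f) for r in real_feats for f in fake_feats) / (m * n)
    return k_rr + k_ff - 2.0 * k_rf

if __name__ == "__main__":
    rng = random.Random(0)
    def draw(mean):
        return [[rng.gauss(mean, 1.0) for _ in range(4)] for _ in range(50)]
    real = draw(0.0)
    print("same distribution:   ", kid_unbiased(real, draw(0.0)))
    print("shifted distribution:", kid_unbiased(real, draw(2.0)))
```

Because the estimator is unbiased, it can come out slightly negative when the two feature sets follow the same distribution. Lower values indicate that generated features are closer to real ones.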

Network

Comparison

User study

t-SNE

Heatmaps

Shared latent space

Acknowledgments

Our code is inspired by UGATIT-pytorch.
