Linear image-to-image translation

Linear (Un)supervised Image-to-Image Translation

Teaser image: examples of linear orthogonal transformations in the PCA domain, learned without pairing supervision. Training time is about 1 minute.

This repository contains the official PyTorch implementation of the following paper:

The Surprising Effectiveness of Linear Unsupervised Image-to-Image Translation
Eitan Richardson and Yair Weiss
https://arxiv.org/abs/2007.12568

Abstract: Unsupervised image-to-image translation is an inherently ill-posed problem. Recent methods based on deep encoder-decoder architectures have shown impressive results, but we show that they only succeed due to a strong locality bias, and they fail to learn very simple nonlocal transformations (e.g. mapping upside down faces to upright faces). When the locality bias is removed, the methods are too powerful and may fail to learn simple local transformations. In this paper we introduce linear encoder-decoder architectures for unsupervised image to image translation. We show that learning is much easier and faster with these architectures and yet the results are surprisingly effective. In particular, we show a number of local problems for which the results of the linear methods are comparable to those of state-of-the-art architectures but with a fraction of the training time, and a number of nonlocal problems for which the state-of-the-art fails while linear methods succeed.

TODO:

  • Code for reproducing the linear image-to-image translation results
  • Code for applying the linear transformation as regularization for deep unsupervised image-to-image translation (based on ALAE)
  • Support for user-provided dataset (e.g. image folders)
  • Automatic detection of available GPU resources

Requirements

  • PyTorch (tested with PyTorch 1.5.0)
  • faiss (tested with faiss 1.6.3 with GPU support)
  • OpenCV (used only for generating some of the synthetic transformations)

System Requirements

Both the PCA and the nearest-neighbor search in ICP are performed on the GPU (using PyTorch and faiss). A CUDA-enabled GPU with at least 11 GB of memory is recommended. Since the entire dataset is loaded into RAM (not in mini-batches), a lot of (CPU) RAM is required as well ...
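As a rough guide to the RAM requirement, the size of the raw pixel data can be estimated from the sample count and resolution. The sketch below assumes 3-channel images stored as 32-bit floats; both are assumptions, not something this README states, and `dataset_bytes` is a hypothetical helper that is not part of the repository. The numbers match the example command further down (20000 training images at 128x128).

```python
def dataset_bytes(n_samples, resolution, channels=3, bytes_per_value=4):
    """Bytes needed to hold n_samples square images as one dense array.

    channels=3 and bytes_per_value=4 (float32) are assumptions.
    """
    return n_samples * resolution * resolution * channels * bytes_per_value


# One array per domain (A and B), 20000 images each at 128x128.
per_domain = dataset_bytes(20000, 128)
total_gib = 2 * per_domain / 2**30
print(f"{per_domain / 2**30:.2f} GiB per domain, {total_gib:.2f} GiB for both")
# prints: 3.66 GiB per domain, 7.32 GiB for both
```

This counts only the raw images; intermediate buffers created during PCA would come on top of it.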

Code structure

  • run_im2im.py: The main Python script for training and testing the linear transformation
  • pca-linear-map.py: The main algorithm. Performs PCA for the two domains, resolves the polarity ambiguity, and learns an orthogonal or unconstrained linear transformation. In the unpaired case, ICP iterations are used to find the best correspondence.
  • pca.py: Fast PCA using PyTorch and the skewness-based polarity synchronization.
  • utils.py: Misc utils
  • data.py: Loading the dataset and applying the synthetic transformations
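The pipeline described in the list above (PCA per domain, polarity fix, orthogonal map) can be sketched for the paired case as follows. This is a minimal illustration in NumPy on synthetic data, not the repository's PyTorch code. All function names are hypothetical, and the polarity rule used here (flip each component so that its coefficients have non-negative skewness) is an assumed reading of "skewness-based polarity synchronization". Domain B is simply domain A with the pixel order reversed.

```python
import numpy as np


def pca_fit(x, n_components):
    """Return (mean, components) with principal directions as rows."""
    mean = x.mean(axis=0)
    _, _, vt = np.linalg.svd(x - mean, full_matrices=False)
    return mean, vt[:n_components]


def skew_signs(coeffs):
    """+1/-1 per component so that the third central moment is >= 0."""
    centered = coeffs - coeffs.mean(axis=0)
    third = (centered ** 3).mean(axis=0)
    return np.where(third < 0, -1.0, 1.0)


def orthogonal_map(a, b):
    """Orthogonal Q minimizing ||a @ Q - b||_F (orthogonal Procrustes)."""
    u, _, vt = np.linalg.svd(a.T @ b)
    return u @ vt


# Synthetic rank-k "images" with skewed latent factors (an assumption).
rng = np.random.default_rng(0)
n, k, d = 400, 6, 64
latent = rng.exponential(size=(n, k)) * np.arange(k, 0, -1)
x_a = latent @ rng.normal(size=(k, d))
x_b = x_a[:, ::-1]  # domain B: same samples, pixel order reversed

# PCA in each domain, then project onto the principal directions.
mean_a, v_a = pca_fit(x_a, k)
mean_b, v_b = pca_fit(x_b, k)
z_a = (x_a - mean_a) @ v_a.T
z_b = (x_b - mean_b) @ v_b.T

# Resolve the sign ambiguity of each principal direction.
s_a, s_b = skew_signs(z_a), skew_signs(z_b)
v_a, z_a = v_a * s_a[:, None], z_a * s_a
v_b, z_b = v_b * s_b[:, None], z_b * s_b

# Learn the orthogonal map in PCA space and translate A -> B.
q = orthogonal_map(z_a, z_b)
x_b_hat = mean_b + (z_a @ q) @ v_b
print("max abs translation error:", np.abs(x_b_hat - x_b).max())
```

With paired data the orthogonal fit absorbs any sign flips, so the polarity step is not needed here. It is included because a consistent polarity presumably matters most in the unpaired case, where there are no pairs to fit against at the start.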

Preparing the datasets

The repository does not contain code for loading the datasets; however, the tested datasets were loaded in their standard format. Please download (or link) the datasets under datasets/CelebA, datasets/FFHQ and datasets/edges2shoes.

Learning a linear transformation

usage: run_im2im.py [--dataset {celeba,ffhq,shoes}]
                    [--resolution RESOLUTION]
                    [--a_transform {identity,rot90,vflip,edges,Canny-edges,colorize,super-res,inpaint}]
                    [--pairing {paired,matching,nonmatching,few-matches}]
                    [--matching {nn,cyc-nn}]
                    [--transform_type {orthogonal,linear}] [--n_iters N_ITERS]
                    [--n_components N_COMPONENTS] [--n_train N_TRAIN]
                    [--n_test N_TEST]

Results are saved into the results folder.
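For readers who want to drive the script from their own code, the usage synopsis above can be mirrored with `argparse`. The parser below is a hypothetical reconstruction from the synopsis only. The integer types are assumptions, and no defaults are set because this README does not state them, so the actual script may behave differently.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the usage synopsis (no defaults)."""
    p = argparse.ArgumentParser(prog="run_im2im.py")
    p.add_argument("--dataset", choices=["celeba", "ffhq", "shoes"])
    p.add_argument("--resolution", type=int)
    p.add_argument("--a_transform", choices=[
        "identity", "rot90", "vflip", "edges", "Canny-edges",
        "colorize", "super-res", "inpaint"])
    p.add_argument("--pairing", choices=[
        "paired", "matching", "nonmatching", "few-matches"])
    p.add_argument("--matching", choices=["nn", "cyc-nn"])
    p.add_argument("--transform_type", choices=["orthogonal", "linear"])
    p.add_argument("--n_iters", type=int)
    p.add_argument("--n_components", type=int)
    p.add_argument("--n_train", type=int)
    p.add_argument("--n_test", type=int)
    return p


# Parse the flags of the colorization example from this README.
args = build_parser().parse_args(
    "--dataset ffhq --resolution 128 --a_transform colorize "
    "--n_components 2000 --n_train 20000 --n_test 25".split()
)
print(args.dataset, args.resolution, args.pairing)
# prints: ffhq 128 None
```

Flags that are not passed come back as `None` in this sketch, which makes it easy to see which options a given command leaves to the script's own defaults.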

Command example for generating the colorization result in the above image (Figure 9 in the paper):

python3 run_im2im.py --dataset ffhq --resolution 128 --a_transform colorize --n_components 2000 --n_train 20000 --n_test 25
Loading matching data for ffhq - colorize ...
100%|██████████████████████████████████████████████████████████████████████████| 20000/20000 [00:04<00:00, 4549.19it/s]
100%|█████████████████████████████████████████████████████████████████████████████████| 25/25 [00:00<00:00, 299.33it/s]
Learning orthogonal transformation in 2000 PCA dimensions...
Got 20000 samples in A and 20000 in B.
PCA A...
PCA B...
Synchronizing...
Using skew-based logic for 1399/2000 dimensions.
PCA representations:  (20000, 2000) (20000, 2000) took: 68.09504985809326
Learning orthogonal transformation using matching sets:
Iter 0: 4191 B-NNs / 1210 consistent, mean NN l2 = 1308.520. took 2.88 sec.
Iter 1: 19634 B-NNs / 19634 consistent, mean NN l2 = 607.715. took 3.46 sec.
Iter 2: 19801 B-NNs / 19801 consistent, mean NN l2 = 204.487. took 3.49 sec.
Iter 3: 19801 B-NNs / 19801 consistent, mean NN l2 = 204.079. Converged - terminating ICP iterations.
Applying the learned transformation on test data...
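The ICP iterations in the log above alternate between matching samples across domains and refitting the orthogonal map. The sketch below is one plausible reading of that loop, written in NumPy with a brute-force nearest-neighbor search standing in for faiss. The mutual nearest-neighbor ("consistent") filter, the stopping rule (the assignment stops changing), and all function names are assumptions; the recorded statistics are loosely modeled on the log fields and are not guaranteed to match how the repository computes them.

```python
import numpy as np


def nearest(src, dst):
    """For every row of src, index of the closest row of dst (squared L2)."""
    d2 = ((src ** 2).sum(1)[:, None]
          - 2.0 * src @ dst.T
          + (dst ** 2).sum(1)[None, :])
    return d2.argmin(axis=1)


def orthogonal_map(a, b):
    """Orthogonal Q minimizing ||a @ Q - b||_F (orthogonal Procrustes)."""
    u, _, vt = np.linalg.svd(a.T @ b)
    return u @ vt


def icp_orthogonal(z_a, z_b, n_iters=20):
    """Alternate NN matching and Procrustes fitting, starting from identity."""
    q = np.eye(z_a.shape[1])
    history, prev = [], None
    for _ in range(n_iters):
        mapped = z_a @ q
        a2b = nearest(mapped, z_b)   # A -> B matches
        b2a = nearest(z_b, mapped)   # B -> A matches
        # Keep only mutual nearest neighbors (assumed meaning of "consistent").
        consistent = b2a[a2b] == np.arange(len(z_a))
        mean_dist = np.linalg.norm(mapped - z_b[a2b], axis=1).mean()
        history.append((len(np.unique(a2b)), int(consistent.sum()), mean_dist))
        if prev is not None and np.array_equal(a2b, prev):
            break                    # assignment unchanged: stop
        q = orthogonal_map(z_a[consistent], z_b[a2b[consistent]])
        prev = a2b
    return q, history


# Toy "matching" setup: B holds rotated copies of A in shuffled order.
rng = np.random.default_rng(0)
z_a = rng.normal(size=(300, 3)) * np.array([5.0, 2.0, 1.0])
c, s = np.cos(0.1), np.sin(0.1)
true_q = np.array([[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]])
z_b = rng.permutation(z_a @ true_q)

q, history = icp_orthogonal(z_a, z_b)
for it, (n_unique, n_consistent, dist) in enumerate(history):
    print(it, n_unique, n_consistent, round(float(dist), 3))
```

ICP is a local search, so the result depends on the starting point. That is presumably why the polarity of the PCA components is synchronized before the iterations begin.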

Limitations

As described in the paper:

  • If the true translation is very non-linear, the learned linear transformation will not model it well.
  • If the image domain has a very complex structure, a large number of PCA coefficients will be required to achieve high quality reconstruction.
  • The nonmatching case (i.e. no matching pairs exist) requires larger training sets.
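The second limitation can be made concrete by measuring how much of the data is lost when only a few PCA coefficients are kept. The sketch below uses synthetic data with a decaying spectrum as a stand-in for real images; `pca_reconstruction_error` is a hypothetical helper, written in NumPy, and is not part of the repository.

```python
import numpy as np


def pca_reconstruction_error(x, n_components):
    """Fraction of the centered energy lost with n_components PCs kept."""
    centered = x - x.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    v = vt[:n_components]
    recon = (centered @ v.T) @ v   # project onto the PCs and back
    return float(((centered - recon) ** 2).sum() / (centered ** 2).sum())


# Synthetic data: 32 dimensions with steadily shrinking variance.
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 32)) / (1.0 + np.arange(32))

errors = {k: pca_reconstruction_error(x, k) for k in (2, 8, 32)}
for k, err in errors.items():
    print(f"{k:>2} components: relative error {err:.4f}")
```

The error can only shrink as components are added, and it reaches zero once all of them are kept. A domain with a more complex structure has a flatter spectrum, so it needs more components to reach the same error.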

Additional results

Paired

In the two examples above (edge images to real images, and inpainting with a relatively large part of the image missing), the true transformation is quite nonlinear, making the learned linear transformation less suitable. Here we used the unconstrained linear transformation rather than the orthogonal one. In addition, pairing supervision was used.
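The difference between the two transformation types can be illustrated on paired coefficients. In this NumPy sketch the orthogonal map is fitted with orthogonal Procrustes and the unconstrained one with ordinary least squares. Both are standard choices assumed here for illustration and may differ from the repository's implementation; the data and function names are hypothetical.

```python
import numpy as np


def orthogonal_map(a, b):
    """Orthogonal Q minimizing ||a @ Q - b||_F (orthogonal Procrustes)."""
    u, _, vt = np.linalg.svd(a.T @ b)
    return u @ vt


def linear_map(a, b):
    """Unconstrained W minimizing ||a @ W - b||_F (least squares)."""
    w, *_ = np.linalg.lstsq(a, b, rcond=None)
    return w


# Paired toy coefficients related by a non-orthogonal scaling.
rng = np.random.default_rng(0)
z_a = rng.normal(size=(200, 4))
z_b = z_a @ np.diag([0.5, 1.0, 2.0, 4.0])

res_orth = np.linalg.norm(z_a @ orthogonal_map(z_a, z_b) - z_b)
res_lin = np.linalg.norm(z_a @ linear_map(z_a, z_b) - z_b)
print(f"orthogonal residual: {res_orth:.3f}, unconstrained residual: {res_lin:.3e}")
```

An orthogonal map cannot rescale directions, so it leaves a large residual on this data, while the unconstrained map fits it essentially exactly. The unconstrained map has more free parameters, which fits the use of pairing supervision in the examples above.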

NonFaces

Here is an example showing the linear transformation method applied to a different domain (not just aligned faces).

Owner
Eitan Richardson
PhD student and TA at the Hebrew University of Jerusalem / Research Intern at Google