Minimal PyTorch implementation of Generative Latent Optimization from the paper "Optimizing the Latent Space of Generative Networks"

Last update: Nov 27, 2022

Overview

Minimal PyTorch implementation of Generative Latent Optimization

This is a reimplementation of the paper

Piotr Bojanowski, Armand Joulin, David Lopez-Paz, Arthur Szlam:
Optimizing the Latent Space of Generative Networks

I'm not one of the authors. I just reimplemented parts of the paper in PyTorch for learning about PyTorch and generative models. Also, I liked the idea in the paper and was surprised that the approach actually works.

Implementation of the Laplacian pyramid L1 loss is inspired by https://github.com/mtyka/laploss. DCGAN network architecture follows https://github.com/pytorch/examples/tree/master/dcgan.
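For orientation, here is a minimal sketch of one GLO training step as described in the paper: every training image gets its own learnable latent vector, and the latents are optimized jointly with the generator to minimize a reconstruction loss. This is not the code in glo.py. The names `glo_step` and `project_to_unit_ball`, the toy linear generator (standing in for the DCGAN generator), the plain L1 loss (standing in for the Laplacian pyramid loss), and the choice of Adam are all assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def project_to_unit_ball(z):
    # Rescale rows whose L2 norm exceeds 1; rows already inside stay unchanged.
    return z / z.norm(dim=1, keepdim=True).clamp(min=1.0)


def glo_step(generator, latents, images, idx, optimizer):
    # One joint update of the generator weights and the latents of this batch.
    optimizer.zero_grad()
    recon = generator(latents[idx])
    loss = F.l1_loss(recon, images)  # placeholder reconstruction loss
    loss.backward()
    optimizer.step()
    with torch.no_grad():
        # Keep the updated latents inside the unit ball.
        latents[idx] = project_to_unit_ball(latents[idx])
    return loss.item()


# Toy setup (hypothetical sizes): 16 random "images" of shape 3x8x8, 8-d latents.
torch.manual_seed(0)
n, d = 16, 8
images = torch.rand(n, 3, 8, 8) * 2 - 1
generator = nn.Sequential(nn.Linear(d, 3 * 8 * 8), nn.Tanh(), nn.Unflatten(1, (3, 8, 8)))
latents = nn.Parameter(project_to_unit_ball(torch.randn(n, d)))
optimizer = torch.optim.Adam(list(generator.parameters()) + [latents], lr=0.01)

idx = torch.arange(n)
losses = [glo_step(generator, latents, images, idx, optimizer) for _ in range(50)]
```

There is no encoder and no discriminator in this setup: the only trainable objects are the generator and the table of per-image latent vectors.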

Running the code

First, install the required packages. For example, in Anaconda, you can simply run

conda install pytorch torchvision -c pytorch
conda install scikit-learn tqdm plac python-lmdb pillow

Download the LSUN dataset (only the bedroom training images are used here) into $LSUN_DIR. Then, simply run:

python glo.py $LSUN_DIR

You can learn more about the settings by running python glo.py --help.

Results

Unless mentioned otherwise, results are from a run over only a subset of the data (100000 samples, which can be set via the -n argument). Optimization was performed for only 25 epochs. The images below show reconstructions from the optimized latent space.

Results with a 100-dimensional representation space look quite good, similar to the results shown in Fig. 1 of the paper.

python glo.py $LSUN_DIR -o d100 -gpu -d 100 -n 100000

Training for more epochs and on the whole dataset makes the images even sharper. Here are results (with a 100-d latent space) from a longer run of 50 epochs on the full dataset.

python glo.py $LSUN_DIR -o d100_full -gpu -d 100 -e 50

I'm not sure how many pyramid levels the authors used for the Laplacian pyramid L1 loss (here, we use 3 levels, but more might be better ... or not). But these results seem close enough.
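To make the role of the number of levels concrete, here is a minimal sketch of a Laplacian pyramid L1 loss in PyTorch. It is not the code from glo.py or from mtyka/laploss. The function names, the 5x5 Gaussian kernel with sigma 1.0, the average-pool downsampling, and the unweighted sum of per-level mean L1 terms are all assumptions; the paper or the actual implementation may weight or build the levels differently.

```python
import torch
import torch.nn.functional as F


def gaussian_kernel(size=5, sigma=1.0, channels=3):
    # Separable 2D Gaussian, one copy per channel for a depthwise convolution.
    coords = torch.arange(size, dtype=torch.float32) - (size - 1) / 2
    g = torch.exp(-(coords ** 2) / (2 * sigma ** 2))
    g = g / g.sum()
    kernel2d = torch.outer(g, g)
    return kernel2d.expand(channels, 1, size, size).contiguous()


def laplacian_pyramid(img, kernel, levels=3):
    # Each level stores the detail removed by blurring; the last entry is the
    # remaining low-resolution image.
    pyramid = []
    current = img
    pad = kernel.shape[-1] // 2
    for _ in range(levels):
        padded = F.pad(current, (pad, pad, pad, pad), mode="reflect")
        blurred = F.conv2d(padded, kernel, groups=current.shape[1])
        pyramid.append(current - blurred)
        current = F.avg_pool2d(blurred, 2)
    pyramid.append(current)
    return pyramid


def lap1_loss(x, y, levels=3):
    # Sum of mean absolute differences over all pyramid levels.
    kernel = gaussian_kernel(channels=x.shape[1]).to(x)
    px = laplacian_pyramid(x, kernel, levels)
    py = laplacian_pyramid(y, kernel, levels)
    return sum(F.l1_loss(a, b) for a, b in zip(px, py))
```

With `levels=3`, a 64x64 image is compared at 64x64, 32x32 and 16x16 detail bands plus an 8x8 residual, so adding levels mostly adds coarser comparisons.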

Results with 512-dimensional representation space:

python glo.py $LSUN_DIR -o d512 -gpu -d 512 -n 100000

One of the main contributions of the paper is the use of the Laplacian pyramid L1 loss. Let's see how it compares to reconstructions using a simple L2 loss, again from a 100-d representation space:

python glo.py $LSUN_DIR -o d100_l2 -gpu -d 100 -n 100000 -l l2

Comparison to L2 reconstruction loss, 512-d representation space:

python glo.py $LSUN_DIR -o d512_l2 -gpu -d 512 -n 100000 -l l2

I observed that initializing the latent vectors with PCA is crucial. Below are results with randomly initialized (normally distributed) latent vectors. After 25 epochs, the loss is still 0.31 (when initializing from PCA, the loss is already 0.23 after only 1 epoch), and the reconstructions look really blurry.

python glo.py $LSUN_DIR -o d100_rand -gpu -d 100 -n 100000 -i random -e 500

It gets better after 500 epochs, but convergence is still very slow and the results are not as clear as with PCA initialization.
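A minimal sketch of what PCA initialization of the latent vectors could look like, assuming the images are available as a NumPy array. This is an illustration, not the code in glo.py: the function name `pca_init_latents` is hypothetical, the PCA is computed with a plain NumPy SVD instead of scikit-learn, and the final projection into the unit ball is an assumption based on the latent constraint described in the paper.

```python
import numpy as np


def pca_init_latents(images, dim):
    # Flatten each image to a row vector and center the data.
    X = images.reshape(len(images), -1).astype(np.float64)
    X = X - X.mean(axis=0, keepdims=True)
    # Rows of Vt are the principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    Z = X @ Vt[:dim].T
    # Assumption: rescale any latent with norm above 1 back onto the unit ball.
    norms = np.linalg.norm(Z, axis=1, keepdims=True)
    return Z / np.maximum(norms, 1.0)


# Toy usage: 20 random "images" of shape 3x8x8, projected to 5-d latents.
rng = np.random.default_rng(0)
toy_images = rng.random((20, 3, 8, 8))
Z0 = pca_init_latents(toy_images, dim=5)
```

The point of this initialization is that images which look alike start with nearby latent vectors, whereas random initialization gives the generator no such structure to exploit at the start.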


Owner

Thomas Neumann

