StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks

Related tags

Deep LearningStackGAN
Overview

StackGAN

Tensorflow implementation for reproducing main results in the paper StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks by Han Zhang, Tao Xu, Hongsheng Li, Shaoting Zhang, Xiaogang Wang, Xiaolei Huang, Dimitris Metaxas.

Dependencies

python 2.7

TensorFlow 0.12

[Optional] Torch is needed, if use the pre-trained char-CNN-RNN text encoder.

[Optional] skip-thought is needed, if use the skip-thought text encoder.

In addition, please add the project folder to PYTHONPATH and pip install the following packages:

  • prettytensor
  • progressbar
  • python-dateutil
  • easydict
  • pandas
  • torchfile

Data

  1. Download our preprocessed char-CNN-RNN text embeddings for birds and flowers and save them to Data/.
  • [Optional] Follow the instructions reedscot/icml2016 to download the pretrained char-CNN-RNN text encoders and extract text embeddings.
  1. Download the birds and flowers image data. Extract them to Data/birds/ and Data/flowers/, respectively.
  2. Preprocess images.
  • For birds: python misc/preprocess_birds.py
  • For flowers: python misc/preprocess_flowers.py

Training

  • The steps to train a StackGAN model on the CUB dataset using our preprocessed data for birds.
    • Step 1: train Stage-I GAN (e.g., for 600 epochs) python stageI/run_exp.py --cfg stageI/cfg/birds.yml --gpu 0
    • Step 2: train Stage-II GAN (e.g., for another 600 epochs) python stageII/run_exp.py --cfg stageII/cfg/birds.yml --gpu 1
  • Change birds.yml to flowers.yml to train a StackGAN model on Oxford-102 dataset using our preprocessed data for flowers.
  • *.yml files are example configuration files for training/testing our models.
  • If you want to try your own datasets, here are some good tips about how to train GAN. Also, we encourage to try different hyper-parameters and architectures, especially for more complex datasets.

Pretrained Model

  • StackGAN for birds trained from char-CNN-RNN text embeddings. Download and save it to models/.
  • StackGAN for flowers trained from char-CNN-RNN text embeddings. Download and save it to models/.
  • StackGAN for birds trained from skip-thought text embeddings. Download and save it to models/ (Just used the same setting as the char-CNN-RNN. We assume better results can be achieved by playing with the hyper-parameters).

Run Demos

  • Run sh demo/flowers_demo.sh to generate flower samples from sentences. The results will be saved to Data/flowers/example_captions/. (Need to download the char-CNN-RNN text encoder for flowers to models/text_encoder/. Note: this text encoder is provided by reedscot/icml2016).
  • Run sh demo/birds_demo.sh to generate bird samples from sentences. The results will be saved to Data/birds/example_captions/.(Need to download the char-CNN-RNN text encoder for birds to models/text_encoder/. Note: this text encoder is provided by reedscot/icml2016).
  • Run python demo/birds_skip_thought_demo.py --cfg demo/cfg/birds-skip-thought-demo.yml --gpu 2 to generate bird samples from sentences. The results will be saved to Data/birds/example_captions-skip-thought/. (Need to download vocabulary for skip-thought vectors to Data/skipthoughts/).

Examples for birds (char-CNN-RNN embeddings), more on youtube:

Examples for flowers (char-CNN-RNN embeddings), more on youtube:

Save your favorite pictures generated by our models since the randomness from noise z and conditioning augmentation makes them creative enough to generate objects with different poses and viewpoints from the same discription 😃

Citing StackGAN

If you find StackGAN useful in your research, please consider citing:

@inproceedings{han2017stackgan,
Author = {Han Zhang and Tao Xu and Hongsheng Li and Shaoting Zhang and Xiaogang Wang and Xiaolei Huang and Dimitris Metaxas},
Title = {StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks},
Year = {2017},
booktitle = {{ICCV}},
}

Our follow-up work

References

  • Generative Adversarial Text-to-Image Synthesis Paper Code
  • Learning Deep Representations of Fine-grained Visual Descriptions Paper Code
Owner
Han Zhang
Han Zhang
A 35mm camera, based on the Canonet G-III QL17 rangefinder, simulated in Python.

c is for Camera A 35mm camera, based on the Canonet G-III QL17 rangefinder, simulated in Python. The purpose of this project is to explore and underst

Daniele Procida 146 Sep 26, 2022
NeuralForecast is a Python library for time series forecasting with deep learning models

NeuralForecast is a Python library for time series forecasting with deep learning models. It includes benchmark datasets, data-loading utilities, evaluation functions, statistical tests, univariate m

Nixtla 1.1k Jan 03, 2023
Over-the-Air Ensemble Inference with Model Privacy

Over-the-Air Ensemble Inference with Model Privacy This repository contains simulations for our private ensemble inference method. Installation Instal

Selim Firat Yilmaz 1 Jun 29, 2022
ResNEsts and DenseNEsts: Block-based DNN Models with Improved Representation Guarantees

ResNEsts and DenseNEsts: Block-based DNN Models with Improved Representation Guarantees This repository is the official implementation of the empirica

Kuan-Lin (Jason) Chen 2 Oct 02, 2022
A state of the art of new lightweight YOLO model implemented by TensorFlow 2.

CSL-YOLO: A New Lightweight Object Detection System for Edge Computing This project provides a SOTA level lightweight YOLO called "Cross-Stage Lightwe

Miles Zhang 54 Dec 21, 2022
Code to reproduce results from the paper "AmbientGAN: Generative models from lossy measurements"

AmbientGAN: Generative models from lossy measurements This repository provides code to reproduce results from the paper AmbientGAN: Generative models

Ashish Bora 87 Oct 19, 2022
Layer 7 DDoS Panel with Cloudflare Bypass ( UAM, CAPTCHA, BFM, etc.. )

Blood Deluxe DDoS DDoS Attack Panel includes CloudFlare Bypass (UAM, CAPTCHA, BFM, etc..)(It works intermittently. Working on it) Don't attack any web

272 Nov 01, 2022
CenterPoint 3D Object Detection and Tracking using center points in the bird-eye view.

CenterPoint 3D Object Detection and Tracking using center points in the bird-eye view. Center-based 3D Object Detection and Tracking, Tianwei Yin, Xin

Tianwei Yin 134 Dec 23, 2022
Code for paper "Learning to Reweight Examples for Robust Deep Learning"

learning-to-reweight-examples Code for paper Learning to Reweight Examples for Robust Deep Learning. [arxiv] Environment We tested the code on tensorf

Uber Research 261 Jan 01, 2023
Includes PyTorch -> Keras model porting code for ConvNeXt family of models with fine-tuning and inference notebooks.

ConvNeXt-TF This repository provides TensorFlow / Keras implementations of different ConvNeXt [1] variants. It also provides the TensorFlow / Keras mo

Sayak Paul 87 Dec 06, 2022
Consensus Learning from Heterogeneous Objectives for One-Class Collaborative Filtering

Consensus Learning from Heterogeneous Objectives for One-Class Collaborative Filtering This repository provides the source code of "Consensus Learning

SeongKu-Kang 6 Apr 29, 2022
To prepare an image processing model to classify the type of disaster based on the image dataset

Disaster Classificiation using CNNs bunnysaini/Disaster-Classificiation Goal To prepare an image processing model to classify the type of disaster bas

Bunny Saini 1 Jan 24, 2022
The lightweight PyTorch wrapper for high-performance AI research. Scale your models, not the boilerplate.

The lightweight PyTorch wrapper for high-performance AI research. Scale your models, not the boilerplate. Website • Key Features • How To Use • Docs •

Pytorch Lightning 21.1k Jan 08, 2023
This is the source code of the solver used to compete in the International Timetabling Competition 2019.

ITC2019 Solver This is the source code of the solver used to compete in the International Timetabling Competition 2019. Building .NET Core (2.1 or hig

Edon Gashi 8 Jan 22, 2022
Official code for paper Exemplar Based 3D Portrait Stylization.

3D-Portrait-Stylization This is the official code for the paper "Exemplar Based 3D Portrait Stylization". You can check the paper on our project websi

60 Dec 07, 2022
Very deep VAEs in JAX/Flax

Very Deep VAEs in JAX/Flax Implementation of the experiments in the paper Very Deep VAEs Generalize Autoregressive Models and Can Outperform Them on I

Jamie Townsend 42 Dec 12, 2022
Rainbow DQN implementation that outperforms the paper's results on 40% of games using 20x less data 🌈

Rainbow 🌈 An implementation of Rainbow DQN which outperforms the paper's (Hessel et al. 2017) results on 40% of tested games while using 20x less dat

Dominik Schmidt 31 Dec 21, 2022
[EMNLP 2021] Distantly-Supervised Named Entity Recognition with Noise-Robust Learning and Language Model Augmented Self-Training

RoSTER The source code used for Distantly-Supervised Named Entity Recognition with Noise-Robust Learning and Language Model Augmented Self-Training, p

Yu Meng 60 Dec 30, 2022
10th place solution for Google Smartphone Decimeter Challenge at kaggle.

Under refactoring 10th place solution for Google Smartphone Decimeter Challenge at kaggle. Google Smartphone Decimeter Challenge Global Navigation Sat

12 Oct 25, 2022