PyTorch implementation of AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks

Last update: Dec 26, 2022

Overview

AttnGAN

PyTorch implementation for reproducing the AttnGAN results in the paper AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks by Tao Xu, Pengchuan Zhang, Qiuyuan Huang, Han Zhang, Zhe Gan, Xiaolei Huang, and Xiaodong He. (This work was performed when Tao was an intern with Microsoft Research.)

Dependencies

Python 2.7

PyTorch

In addition, please add the project folder to PYTHONPATH and pip install the following packages:

python-dateutil
easydict
pandas
torchfile
nltk
scikit-image
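
As a quick sanity check before training, the following minimal sketch reports which of the packages listed above cannot be found by the current interpreter. The mapping from pip names to import names (for example scikit-image to skimage, and "torch" as the pip name for PyTorch) is an assumption added here and is not part of the original instructions; the helper names are hypothetical.

```python
try:
    # Python 3: locate a module without importing it.
    from importlib.util import find_spec

    def is_available(module_name):
        return find_spec(module_name) is not None
except ImportError:
    # Python 2.7 fallback, since importlib.util does not exist there.
    import pkgutil

    def is_available(module_name):
        return pkgutil.find_loader(module_name) is not None

# (pip package name, module name used in import statements).
# The module names are assumptions, not taken from the README.
REQUIRED = [
    ("torch", "torch"),
    ("python-dateutil", "dateutil"),
    ("easydict", "easydict"),
    ("pandas", "pandas"),
    ("torchfile", "torchfile"),
    ("nltk", "nltk"),
    ("scikit-image", "skimage"),
]


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose module cannot be found."""
    return [pip_name for pip_name, module_name in required
            if not is_available(module_name)]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages: " + ", ".join(missing))
    else:
        print("All listed packages were found.")
```

The check only looks modules up and does not verify versions, so it cannot tell whether an installed PyTorch build matches what the code expects.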

Data

Download our preprocessed metadata for birds and coco and save them to data/
Download the birds image data. Extract them to data/birds/
Download the coco dataset and extract the images to data/coco/
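
Once the downloads are extracted, a small check like the one below can confirm that the directories named above exist. It is only a sketch: it assumes the paths are relative to the project folder, the function name is hypothetical, and it does not inspect the files inside the directories.

```python
import os

# Directories named in the steps above, relative to the project folder.
EXPECTED_DIRS = [
    "data",
    os.path.join("data", "birds"),
    os.path.join("data", "coco"),
]


def missing_dirs(project_root, expected=EXPECTED_DIRS):
    """Return the expected directories that do not exist under project_root."""
    return [d for d in expected
            if not os.path.isdir(os.path.join(project_root, d))]


if __name__ == "__main__":
    for d in missing_dirs("."):
        print("Missing directory: " + d)
```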

Training

Pre-train DAMSM models:
- For bird dataset: python pretrain_DAMSM.py --cfg cfg/DAMSM/bird.yml --gpu 0
- For coco dataset: python pretrain_DAMSM.py --cfg cfg/DAMSM/coco.yml --gpu 1
Train AttnGAN models:
- For bird dataset: python main.py --cfg cfg/bird_attn2.yml --gpu 2
- For coco dataset: python main.py --cfg cfg/coco_attn2.yml --gpu 3
The *.yml files are example configuration files for training and evaluating our models.
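
The two training stages above can also be driven from a short script. The sketch below builds the same command lines for a chosen dataset and GPU id; the function names are hypothetical, and it assumes the script is started from the directory that contains pretrain_DAMSM.py and main.py. Whether the second stage picks up the encoders produced by the first depends on the paths set in the .yml files, which this sketch does not modify.

```python
import subprocess

# Config paths copied from the commands listed above.
DAMSM_CFG = {"bird": "cfg/DAMSM/bird.yml", "coco": "cfg/DAMSM/coco.yml"}
ATTNGAN_CFG = {"bird": "cfg/bird_attn2.yml", "coco": "cfg/coco_attn2.yml"}


def training_commands(dataset, gpu_id, python="python"):
    """Return the DAMSM and AttnGAN training commands as argument lists."""
    if dataset not in DAMSM_CFG:
        raise ValueError("dataset must be 'bird' or 'coco'")
    gpu = str(gpu_id)
    return [
        [python, "pretrain_DAMSM.py", "--cfg", DAMSM_CFG[dataset], "--gpu", gpu],
        [python, "main.py", "--cfg", ATTNGAN_CFG[dataset], "--gpu", gpu],
    ]


def run_training(dataset, gpu_id):
    """Run both stages in order; stops with CalledProcessError on failure."""
    for cmd in training_commands(dataset, gpu_id):
        subprocess.check_call(cmd)
```

Calling run_training("bird", 0) would launch the DAMSM pre-training followed by the AttnGAN training on GPU 0; training_commands alone can be used to inspect the commands without running them.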

Pretrained Model

DAMSM for bird. Download and save it to DAMSMencoders/
DAMSM for coco. Download and save it to DAMSMencoders/
AttnGAN for bird. Download and save it to models/
AttnGAN for coco. Download and save it to models/
AttnDCGAN for bird. Download and save it to models/
- This is a variant of AttnGAN which applies the proposed attention mechanisms to the DCGAN framework.

Sampling

Run python main.py --cfg cfg/eval_bird.yml --gpu 1 to generate examples from captions in files listed in "./data/birds/example_filenames.txt". Results are saved to DAMSMencoders/.
Change the eval_*.yml files to generate images from other pre-trained models.
Input your own sentences in "./data/birds/example_captions.txt" if you want to generate images from customized sentences.
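
Custom sentences can be written to the captions file by hand or with a few lines of Python. The sketch below assumes the file holds one caption per line, which is not stated above and should be checked against the file shipped with the preprocessed metadata; the function name is hypothetical.

```python
import io
import os


def write_captions(captions, path="./data/birds/example_captions.txt"):
    """Write one caption per line (assumed format) and return the line count."""
    # Drop empty entries and surrounding whitespace.
    cleaned = [c.strip() for c in captions if c.strip()]
    parent = os.path.dirname(path)
    if parent and not os.path.isdir(parent):
        os.makedirs(parent)
    with io.open(path, "w", encoding="utf-8") as f:
        for caption in cleaned:
            f.write(caption + u"\n")
    return len(cleaned)
```

Note that this overwrites any existing file at the given path, so keeping a copy of the original example captions is advisable.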

Validation

To generate images for all captions in the validation dataset, change B_VALIDATION to True in the eval_*.yml file, and then run python main.py --cfg cfg/eval_bird.yml --gpu 1
We compute inception score for models trained on birds using StackGAN-inception-model.
We compute inception score for models trained on coco using improved-gan/inception_score.
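
For readers unfamiliar with the metric, the inception score is commonly defined as exp(E_x[KL(p(y|x) || p(y))]), where p(y|x) is a classifier's class distribution for a generated image and p(y) is the average of those distributions. The sketch below computes that quantity in plain Python from precomputed class probabilities. It is not the code of the tools named above, which may differ in details such as the classifier used or how samples are batched and split; the function name is hypothetical.

```python
import math


def inception_score(probabilities):
    """Compute exp(mean KL(p(y|x) || p(y))) from per-image class distributions.

    probabilities: non-empty list of rows, each row a list of class
    probabilities for one generated image.
    """
    n = len(probabilities)
    num_classes = len(probabilities[0])
    # Marginal class distribution p(y), averaged over all images.
    marginal = [sum(row[k] for row in probabilities) / float(n)
                for k in range(num_classes)]
    total_kl = 0.0
    for row in probabilities:
        for p, q in zip(row, marginal):
            if p > 0.0:  # terms with p == 0 contribute nothing to the KL sum
                total_kl += p * math.log(p / q)
    return math.exp(total_kl / n)
```

Under this definition, identical predictions for every image give the minimum score of 1, and confident predictions spread evenly over all classes give a score equal to the number of classes.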

Examples generated by AttnGAN [Blog]

bird example	coco example

Creating an API

Evaluation code embedded into a callable containerized API is included in the eval/ folder.

Citing AttnGAN

If you find AttnGAN useful in your research, please consider citing:

@inproceedings{Tao18attngan,
  author    = {Tao Xu and Pengchuan Zhang and Qiuyuan Huang and Han Zhang and Zhe Gan and Xiaolei Huang and Xiaodong He},
  title     = {AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks},
  year      = {2018},
  booktitle = {{CVPR}}
}

Owner

Tao Xu
