FuseDream: Training-Free Text-to-Image Generation with Improved CLIP+GAN Space Optimization

Last update: Dec 31, 2022

FuseDream

This repo contains the code for our paper (arXiv:2112.01573):

FuseDream: Training-Free Text-to-Image Generation with Improved CLIP+GAN Space Optimization

by Xingchao Liu, Chengyue Gong, Lemeng Wu, Shujian Zhang, Hao Su and Qiang Liu from UCSD and UT Austin.

Introduction

FuseDream uses pre-trained GANs (we support BigGAN-256 and BigGAN-512 for now) and CLIP to achieve high-fidelity text-to-image generation.
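To make the general CLIP+GAN idea concrete, here is a deliberately tiny sketch: a latent code is adjusted by gradient ascent so that the embedding of the "generated" output aligns with a text embedding. Everything in it is an assumption for illustration. The matrix A stands in for the generator followed by the image encoder, TEXT_EMB stands in for a CLIP text embedding, and plain gradient ascent replaces the improved optimization described in the paper. It is not the code in this repo.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matvec(matrix, vec):
    return [dot(row, vec) for row in matrix]


def cosine(u, t):
    """Cosine similarity, used here as a stand-in for a CLIP score."""
    return dot(u, t) / (math.sqrt(dot(u, u)) * math.sqrt(dot(t, t)))


def cosine_grad(u, t):
    """Gradient of cosine(u, t) with respect to u."""
    norm_u, norm_t = math.sqrt(dot(u, u)), math.sqrt(dot(t, t))
    ut = dot(u, t)
    return [ti / (norm_u * norm_t) - ut * ui / (norm_u ** 3 * norm_t)
            for ui, ti in zip(u, t)]


def optimize_latent(A, text_emb, z, steps=200, lr=0.1):
    """Gradient ascent on z to increase cosine(A z, text_emb)."""
    for _ in range(steps):
        g_u = cosine_grad(matvec(A, z), text_emb)
        # Chain rule through the linear map: grad_z = A^T grad_u.
        g_z = [sum(A[i][j] * g_u[i] for i in range(len(A)))
               for j in range(len(z))]
        z = [zj + lr * gj for zj, gj in zip(z, g_z)]
    return z


# Hypothetical toy values: a 2-D latent mapped to a 3-D embedding.
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
TEXT_EMB = [1.0, 0.0, 1.0]
Z0 = [0.2, 1.0]

if __name__ == "__main__":
    z = optimize_latent(A, TEXT_EMB, Z0)
    print("score before:", cosine(matvec(A, Z0), TEXT_EMB))
    print("score after: ", cosine(matvec(A, z), TEXT_EMB))
```

In the real setting the linear map would be replaced by BigGAN and the CLIP image encoder, and the gradient would come from automatic differentiation rather than a hand-written formula.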

Requirements

Use pip or conda to install the following packages: PyTorch==1.7.1, torchvision==0.8.2 and lpips==0.1.4, plus the requirements of BigGAN.
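A quick way to confirm that the environment matches these pins is sketched below. The pins are copied from the list above; the distribution name "torch" for PyTorch, the helper names, and the choice to ignore local build suffixes such as "+cu110" are assumptions of this sketch. It needs Python 3.8 or newer for importlib.metadata and does not cover the BigGAN requirements.

```python
from importlib import metadata

# Pins from the requirements above. "torch" is assumed to be the
# distribution name under which PyTorch was installed.
PINNED = {"torch": "1.7.1", "torchvision": "0.8.2", "lpips": "0.1.4"}


def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pinned, lookup=installed_version):
    """Map each package that does not match its pin to (wanted, found)."""
    mismatches = {}
    for name, wanted in pinned.items():
        found = lookup(name)
        # Compare without a local build suffix such as "+cu110".
        base = found.split("+")[0] if found else None
        if base != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    problems = find_mismatches(PINNED)
    for name, (wanted, found) in problems.items():
        print(f"{name}: expected {wanted}, found {found or 'not installed'}")
    if not problems:
        print("All pinned packages match.")
```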

Getting Started

We converted the pre-trained BigGAN weights from TFHub to PyTorch. To save time, you can download the converted BigGAN checkpoints from:

https://drive.google.com/drive/folders/1nJ3HmgYgeA9NZr-oU-enqbYeO7zBaANs?usp=sharing

Put the checkpoints into ./BigGAN_utils/weights/
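Before running the generator, it can help to confirm that the checkpoints actually landed in that directory. The sketch below only lists whatever files are present; the helper name is hypothetical, and no checkpoint file names are assumed because none are given here.

```python
from pathlib import Path

# Directory named in the instructions above.
WEIGHTS_DIR = Path("./BigGAN_utils/weights")


def list_checkpoints(weights_dir=WEIGHTS_DIR):
    """Return sorted names of regular files in weights_dir.

    Returns an empty list if the directory does not exist.
    """
    weights_dir = Path(weights_dir)
    if not weights_dir.is_dir():
        return []
    return sorted(p.name for p in weights_dir.iterdir() if p.is_file())


if __name__ == "__main__":
    found = list_checkpoints()
    if found:
        print("Found checkpoint files:", ", ".join(found))
    else:
        print(f"No files found in {WEIGHTS_DIR}; download the checkpoints first.")
```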

Run the following command to generate an image from a text query:

python fusedream_generator.py --text 'YOUR TEXT' --seed YOUR_SEED

For example, to get an image of a blue dog:

python fusedream_generator.py --text 'A photo of a blue dog.' --seed 1234

The generated image is saved to ./samples.
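To try several prompts or seeds in one go, the command above can be wrapped in a small driver script. This is a minimal sketch that uses only the --text and --seed flags shown above; the function names and the dry_run option are hypothetical. How output files in ./samples are named is not stated here, so check that later runs do not overwrite earlier results.

```python
import subprocess
import sys


def build_command(text, seed, script="fusedream_generator.py"):
    """Build the argument list for one run of the generator script."""
    return [sys.executable, script, "--text", text, "--seed", str(seed)]


def run_batch(prompts, seeds, dry_run=False):
    """Run the generator once per (prompt, seed) pair.

    With dry_run=True the commands are printed instead of executed.
    Returns the list of commands in execution order.
    """
    commands = [build_command(t, s) for t in prompts for s in seeds]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops the batch at the first failing run.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    run_batch(["A photo of a blue dog."], seeds=[1234, 1235], dry_run=True)
```

Passing the arguments as a list avoids shell quoting problems with prompts that contain spaces or apostrophes.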

Colab Notebook

For a quick test of FuseDream, we provide Colab notebooks for FuseDream (Single Image) and FuseDream-Composition (TODO). Have fun!

Citations

If you use the code, please cite:

@inproceedings{
brock2018large,
title={Large Scale {GAN} Training for High Fidelity Natural Image Synthesis},
author={Andrew Brock and Jeff Donahue and Karen Simonyan},
booktitle={International Conference on Learning Representations},
year={2019},
url={https://openreview.net/forum?id=B1xsqj09Fm},
}

and

@misc{
liu2021fusedream,
title={FuseDream: Training-Free Text-to-Image Generation with Improved CLIP+GAN Space Optimization}, 
author={Xingchao Liu and Chengyue Gong and Lemeng Wu and Shujian Zhang and Hao Su and Qiang Liu},
year={2021},
eprint={2112.01573},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
