Generating images from caption and vice versa via CLIP-Guided Generative Latent Space Search

Overview

CLIP-GLaSS

Repository for the paper Generating images from caption and vice versa via CLIP-Guided Generative Latent Space Search

An in-browser demo is available here

Installation

Clone this repository

git clone https://github.com/galatolofederico/clip-glass && cd clip-glass

Create a virtual environment and install the requirements

virtualenv --python=python3.6 env && . ./env/bin/activate
pip install -r requirements.txt

Run CLIP-GLaSS

You can run CLIP-GLaSS with:

python run.py --config  --target 

Specifying and according to the following table:

Config Meaning Target Type
GPT2 Use GPT2 to solve the Image-to-Text task Image
DeepMindBigGAN512 Use DeepMind's BigGAN 512x512 to solve the Text-to-Image task Text
DeepMindBigGAN256 Use DeepMind's BigGAN 256x256 to solve the Text-to-Image task Text
StyleGAN2_ffhq_d Use StyleGAN2-ffhq to solve the Text-to-Image task Text
StyleGAN2_ffhq_nod Use StyleGAN2-ffhq without Discriminator to solve the Text-to-Image task Text
StyleGAN2_church_d Use StyleGAN2-church to solve the Text-to-Image task Text
StyleGAN2_church_nod Use StyleGAN2-church without Discriminator to solve the Text-to-Image task Text
StyleGAN2_car_d Use StyleGAN2-car to solve the Text-to-Image task Text
StyleGAN2_car_nod Use StyleGAN2-car without Discriminator to solve the Text-to-Image task Text

If you do not have downloaded the models weights you will be prompted to run ./download-weights.sh You will find the results in the folder ./tmp, a different output folder can be specified with --tmp-folder

Examples

python run.py --config StyleGAN2_ffhq_d --target "the face of a man with brown eyes and stubble beard"
python run.py --config GPT2 --target gpt2_images/dog.jpeg

Acknowledgments and licensing

This work heavily relies on the following amazing repositories and would have not been possible without them:

All their work can be shared under the terms of the respective original licenses.

All my original work (everything except the content of the folders clip, stylegan2 and gpt2) is released under the terms of the GNU/GPLv3 license. Coping, adapting e republishing it is not only consent but also encouraged.

Citing

If you want to cite use you can use this BibTeX

@article{galatolo_glass
,	author	= {Galatolo, Federico A and Cimino, Mario GCA and Vaglini, Gigliola}
,	title	= {Generating images from caption and vice versa via CLIP-Guided Generative Latent Space Search}
,	year	= {2021}
}

Contacts

For any further question feel free to reach me at [email protected] or on Telegram @galatolo

Owner
Federico Galatolo
PhD Student @ University of Pisa
Federico Galatolo
More Photos are All You Need: Semi-Supervised Learning for Fine-Grained Sketch Based Image Retrieval

More Photos are All You Need: Semi-Supervised Learning for Fine-Grained Sketch Based Image Retrieval, CVPR 2021. Ayan Kumar Bhunia, Pinaki nath Chowdh

Ayan Kumar Bhunia 22 Aug 27, 2022
Unsupervised Feature Ranking via Attribute Networks.

FRANe Unsupervised Feature Ranking via Attribute Networks (FRANe) converts a dataset into a network (graph) with nodes that correspond to the features

7 Sep 29, 2022
Meta Self-learning for Multi-Source Domain Adaptation: A Benchmark

Meta Self-Learning for Multi-Source Domain Adaptation: A Benchmark Project | Arxiv | YouTube | | Abstract In recent years, deep learning-based methods

CVSM Group - email: <a href=[email protected]"> 188 Dec 12, 2022
Source code for ZePHyR: Zero-shot Pose Hypothesis Rating @ ICRA 2021

ZePHyR: Zero-shot Pose Hypothesis Rating ZePHyR is a zero-shot 6D object pose estimation pipeline. The core is a learned scoring function that compare

R-Pad - Robots Perceiving and Doing 18 Aug 22, 2022
Generative Modelling of BRDF Textures from Flash Images [SIGGRAPH Asia, 2021]

Neural Material Official code repository for the paper: Generative Modelling of BRDF Textures from Flash Images [SIGGRAPH Asia, 2021] Henzler, Deschai

Philipp Henzler 80 Dec 20, 2022
Docker containers of baseline agents for the Crafter environment

Crafter Baselines This repository contains Docker containers for running various baselines on the Crafter environment. Reward Agents DreamerV2 based o

Danijar Hafner 17 Sep 25, 2022
Plug-n-Play Reinforcement Learning in Python with OpenAI Gym and JAX

coax is built on top of JAX, but it doesn't have an explicit dependence on the jax python package. The reason is that your version of jaxlib will depend on your CUDA version.

128 Dec 27, 2022
Official codebase for ICLR oral paper Unsupervised Vision-Language Grammar Induction with Shared Structure Modeling

CLIORA This is the official codebase for ICLR oral paper: Unsupervised Vision-Language Grammar Induction with Shared Structure Modeling. We introduce

Bo Wan 32 Dec 23, 2022
discovering subdomains, hidden paths, extracting unique links

python-website-crawler discovering subdomains, hidden paths, extracting unique links pip install -r requirements.txt discover subdomain: You can give

merve 4 Sep 05, 2022
*ObjDetApp* deploys a pytorch model for object detection

*ObjDetApp* deploys a pytorch model for object detection

Will Chao 1 Dec 26, 2021
This repository contains the implementation of the following paper: Cross-Descriptor Visual Localization and Mapping

Cross-Descriptor Visual Localization and Mapping This repository contains the implementation of the following paper: "Cross-Descriptor Visual Localiza

Mihai Dusmanu 81 Oct 06, 2022
Code for the CVPR2022 paper "Frequency-driven Imperceptible Adversarial Attack on Semantic Similarity"

Introduction This is an official release of the paper "Frequency-driven Imperceptible Adversarial Attack on Semantic Similarity" (arxiv link). Abstrac

Leo 21 Nov 23, 2022
R-package accompanying the paper "Dynamic Factor Model for Functional Time Series: Identification, Estimation, and Prediction"

dffm The goal of dffm is to provide functionality to apply the methods developed in the paper “Dynamic Factor Model for Functional Time Series: Identi

Sven Otto 3 Dec 09, 2022
A curated list of awesome papers for Semantic Retrieval (TOIS Accepted: Semantic Models for the First-stage Retrieval: A Comprehensive Review).

A curated list of awesome papers for Semantic Retrieval (TOIS Accepted: Semantic Models for the First-stage Retrieval: A Comprehensive Review).

Yinqiong Cai 189 Dec 28, 2022
Fog Simulation on Real LiDAR Point Clouds for 3D Object Detection in Adverse Weather

LiDAR fog simulation Created by Martin Hahner at the Computer Vision Lab of ETH Zurich. This is the official code release of the paper Fog Simulation

Martin Hahner 110 Dec 30, 2022
OpenLT: An open-source project for long-tail classification

OpenLT: An open-source project for long-tail classification Supported Methods for Long-tailed Recognition: Cross-Entropy Loss Focal Loss (ICCV'17) Cla

Ming Li 37 Sep 15, 2022
Pansharpening by convolutional neural networks in the full resolution framework

Z-PNN: Zoom Pansharpening Neural Network Pansharpening by convolutional neural networks in the full resolution framework is a deep learning method for

20 Nov 24, 2022
Tilted Empirical Risk Minimization (ICLR '21)

Tilted Empirical Risk Minimization This repository contains the implementation for the paper Tilted Empirical Risk Minimization ICLR 2021 Empirical ri

Tian Li 40 Nov 28, 2022
Multi-agent reinforcement learning algorithm and environment

Multi-agent reinforcement learning algorithm and environment [en/cn] Pytorch implements multi-agent reinforcement learning algorithms including IQL, Q

万鲲鹏 7 Sep 20, 2022
Gradient representations in ReLU networks as similarity functions

Gradient representations in ReLU networks as similarity functions by Dániel Rácz and Bálint Daróczy. This repo contains the python code related to our

1 Oct 08, 2021