Code for CVPR'2022 paper ✨ "Predict, Prevent, and Evaluate: Disentangled Text-Driven Image Manipulation Empowered by Pre-Trained Vision-Language Model"

Last update: Nov 28, 2022

PPE ✨

Repository for our CVPR'2022 paper:

Predict, Prevent, and Evaluate: Disentangled Text-Driven Image Manipulation Empowered by Pre-Trained Vision-Language Model. Zipeng Xu, Tianwei Lin, Hao Tang, Fu Li, Dongliang He, Nicu Sebe, Radu Timofte, Luc Van Gool, Errui Ding. To appear in CVPR 2022.

The PyTorch implementation is available at zipengxuc/PPE-Pytorch.

Updates

24 Mar 2022: We updated the arXiv version of our paper.

30 Mar 2022: We changed how the code is released. The PyTorch implementation is now available at zipengxuc/PPE-Pytorch.

14 Apr 2022: We added our PaddlePaddle inference code to this repository.

To reproduce our results:

Setup:

Install CLIP:

conda install --yes -c pytorch pytorch=1.7.1 torchvision cudatoolkit=<CUDA_VERSION>
pip install ftfy regex tqdm gdown
pip install git+https://github.com/openai/CLIP.git
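
After running the commands above, a quick check that the packages can be found may save a failed run later. The following is a minimal sketch, not part of this repository: it only tests whether each module is discoverable in the current environment, and the helper name missing_modules is our own.

```python
from importlib.util import find_spec

# Import names of the packages installed above; "clip" is the module
# installed from the openai/CLIP repository.
REQUIRED_MODULES = ["torch", "torchvision", "ftfy", "regex", "tqdm", "gdown", "clip"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

The check does not import the packages, so it runs quickly and does not verify that CUDA is usable.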

Download pre-trained models:

The code relies on PaddleGAN, which provides a PaddlePaddle implementation of StyleGAN2. Download the pre-trained StyleGAN2 generator from here.

We provide several pre-trained PPE models here.

Invert real images:

The mapper is trained on latent vectors, so images must first be inverted into the latent space. For editing human faces, StyleCLIP provides CelebA-HQ images already inverted by e4e: test set.
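
To illustrate why the mapper needs latent vectors rather than images, here is a toy sketch of a latent-space edit. It rests on assumptions that this README does not confirm: W+ latents of shape (N, 18, 512), as used by a 1024x1024 StyleGAN2, and a residual update of the form w + step * mapper(w), as in StyleCLIP's mapper. The functions apply_residual_edit and zero_mapper are hypothetical stand-ins, not code from this repository.

```python
import numpy as np

# Assumed W+ layout for a 1024x1024 StyleGAN2: 18 layers of 512-d codes.
NUM_LAYERS, LATENT_DIM = 18, 512


def apply_residual_edit(latents, mapper, step=0.1):
    """Return latents + step * mapper(latents) after checking the W+ shape."""
    latents = np.asarray(latents, dtype=np.float32)
    if latents.ndim != 3 or latents.shape[1:] != (NUM_LAYERS, LATENT_DIM):
        raise ValueError(
            f"expected shape (N, {NUM_LAYERS}, {LATENT_DIM}), got {latents.shape}"
        )
    return latents + step * mapper(latents)


def zero_mapper(latents):
    """Hypothetical stand-in for a trained mapper: predicts no change."""
    return np.zeros_like(latents)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.standard_normal((4, NUM_LAYERS, LATENT_DIM)).astype(np.float32)
    edited = apply_residual_edit(w, zero_mapper)
    print(edited.shape)
```

The edited latents would then be passed to the StyleGAN2 generator to produce the manipulated images. That step is omitted here.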

Usage:

First, place the downloaded pre-trained models and data in the ckpt folder.
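
Before running inference, it can help to confirm that the downloads are where the code expects them. The sketch below is an illustration only. The file names in EXPECTED_FILES are hypothetical placeholders, since this README does not list the actual names; replace them with the names of the files you downloaded.

```python
from pathlib import Path

# Hypothetical file names: replace with the names of your downloaded files.
EXPECTED_FILES = [
    "stylegan2_generator.pdparams",
    "ppe_mapper.pdparams",
    "test_latents.pt",
]


def missing_checkpoints(ckpt_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are not present in ckpt_dir."""
    root = Path(ckpt_dir)
    return [name for name in expected if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_checkpoints("ckpt")
    if missing:
        print("Missing from ckpt/:", ", ".join(missing))
    else:
        print("All expected files were found in ckpt/.")
```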

Inference

The PaddlePaddle version provides only inference code, which generates the editing results:

python mapper/evaluate.py

Reference

@article{xu2022ppe,
author = {Zipeng Xu and Tianwei Lin and Hao Tang and Fu Li and Dongliang He and Nicu Sebe and Radu Timofte and Luc Van Gool and Errui Ding},
title = {Predict, Prevent, and Evaluate: Disentangled Text-Driven Image Manipulation Empowered by Pre-Trained Vision-Language Model},
journal = {arXiv preprint arXiv:2111.13333},
year = {2021}
}

If you have any questions, please contact [email protected]. :)

Owner

Zipeng Xu