Implements the training, testing and editing tools for "Pluralistic Image Completion"

Overview

Pluralistic Image Completion

ArXiv | Project Page | Online Demo | Video(demo)

This repository implements the training, testing and editing tools for "Pluralistic Image Completion" by Chuanxia Zheng, Tat-Jen Cham and Jianfei Cai at NTU. Given one masked image, the proposed Pluralistic model is able to generate multiple and diverse plausible results with various structure, color and texture.

Editing example

Example results

Example completion results of our method on images of face (CelebA), building (Paris), and natural scenes (Places2) with center masks (masks shown in gray). For each group, the masked input image is shown left, followed by sampled results from our model without any post-processing. The results are diverse and plusible.

More results on project page

Getting started

Installation

This code was tested with Pytoch 0.4.0, CUDA 9.0, Python 3.6 and Ubuntu 16.04

pip install visdom dominate
  • Clone this repo:
git clone https://github.com/lyndonzheng/Pluralistic
cd Pluralistic

Datasets

  • face dataset: 24183 training images and 2824 test images from CelebA and use the algorithm of Growing GANs to get the high-resolution CelebA-HQ dataset
  • building dataset: 14900 training images and 100 test images from Paris
  • natural scenery: original training and val images from Places2
  • object original training images from ImageNet.

Training

  • Train a model (default: random irregular and irregular holes):
python train.py --name celeba_random --img_file your_image_path
  • Set --mask_type in options/base_options.py for different training masks. --mask_file path is needed for external irregular mask, such as the irregular mask dataset provided by Liu et al. and Karim lskakov .
  • To view training results and loss plots, run python -m visdom.server and copy the URL http://localhost:8097.
  • Training models will be saved under the checkpoints folder.
  • The more training options can be found in options folder.

Testing

  • Test the model
python test.py  --name celeba_random --img_file your_image_path
  • Set --mask_type in options/base_options.py to test various masks. --mask_file path is needed for 3. external irregular mask,
  • The default results will be saved under the results folder. Set --results_dir for a new path to save the result.

Pretrained Models

Download the pre-trained models using the following links and put them undercheckpoints/ directory.

Our main novelty of this project is the multiple and diverse plausible results for one given masked image. The center_mask models are trained with images of resolution 256*256 with center holes 128x128, which have large diversity for the large missing information. The random_mask models are trained with random regular and irregular holes, which have different diversity for different mask sizes and image backgrounds.

GUI

Download the pre-trained models from Google drive and put them undercheckpoints/ directory.

  • Install the PyQt5 for GUI operation
pip install PyQt5

Basic usage is:

python -m visdom.server
python ui_main.py

The buttons in GUI:

  • Options: Select the model and corresponding dataset for editing.
  • Bush Width: Modify the width of bush for free_form mask.
  • draw/clear: Draw a free_form or rectangle mask for random_model. Clear all mask region for a new input.
  • load: Choose the image from the directory.
  • random: Random load the editing image from the datasets.
  • fill: Fill the holes ranges and show it on the right.
  • save: Save the inputs and outputs to the directory.
  • Original/Output: Switch to show the original or output image.

The steps are as follows:

1. Select a model from 'options'
2. Click the 'random' or 'load' button to get an input image.
3. If you choose a random model, click the 'draw/clear' button to input free_form mask.
4. If you choose a center model, the center mask has been given.
5. click 'fill' button to get multiple results.
6. click 'save' button to save the results.

Editing Example Results

  • Results (original, input, output) for object removing
  • Results (input, output) for face playing. When mask half or right face, the diversity will be small for the short+long term attention layer will copy information from other side. When mask top or down face, the diversity will be large.

Next

  • Free form mask for various Datasets
  • Higher resolution image completion

License


This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

This software is for educational and academic research purpose only. If you wish to obtain a commercial royalty bearing license to this software, please contact us at [email protected].

Citation

If you use this code for your research, please cite our paper.

@inproceedings{zheng2019pluralistic,
  title={Pluralistic Image Completion},
  author={Zheng, Chuanxia and Cham, Tat-Jen and Cai, Jianfei},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  pages={1438--1447},
  year={2019}
}

@article{zheng2021pluralistic,
  title={Pluralistic Free-From Image Completion},
  author={Zheng, Chuanxia and Cham, Tat-Jen and Cai, Jianfei},
  journal={International Journal of Computer Vision},
  pages={1--20},
  year={2021},
  publisher={Springer}
}
Owner
Chuanxia Zheng
Chuanxia Zheng
SAT Project - The first project I had done at General Assembly, performed EDA, data cleaning and created data visualizations

Project 1: Standardized Test Analysis by Adam Klesc Overview This project covers: Basic statistics and probability Many Python programming concepts Pr

Adam Muhammad Klesc 1 Jan 03, 2022
Python wrapper class for OpenVINO Model Server. User can submit inference request to OVMS with just a few lines of code

Python wrapper class for OpenVINO Model Server. User can submit inference request to OVMS with just a few lines of code.

Yasunori Shimura 7 Jul 27, 2022
Rainbow DQN implementation that outperforms the paper's results on 40% of games using 20x less data 🌈

Rainbow 🌈 An implementation of Rainbow DQN which outperforms the paper's (Hessel et al. 2017) results on 40% of tested games while using 20x less dat

Dominik Schmidt 31 Dec 21, 2022
Implementation of Enformer, Deepmind's attention network for predicting gene expression, in Pytorch

Enformer - Pytorch (wip) Implementation of Enformer, Deepmind's attention network for predicting gene expression, in Pytorch. The original tensorflow

Phil Wang 235 Dec 27, 2022
Code release for "Transferable Semantic Augmentation for Domain Adaptation" (CVPR 2021)

Transferable Semantic Augmentation for Domain Adaptation Code release for "Transferable Semantic Augmentation for Domain Adaptation" (CVPR 2021) Paper

66 Dec 16, 2022
This is code to fit per-pixel environment map with spherical Gaussian lobes, using LBFGS optimization

Spherical Gaussian Optimization This is code to fit per-pixel environment map with spherical Gaussian lobes, using LBFGS optimization. This code has b

41 Dec 14, 2022
Simple, efficient and flexible vision toolbox for mxnet framework.

MXbox: Simple, efficient and flexible vision toolbox for mxnet framework. MXbox is a toolbox aiming to provide a general and simple interface for visi

Ligeng Zhu 31 Oct 19, 2019
This YoloV5 based model is fit to detect people and different types of land vehicles, and displaying their density on a fitted map, according to their coordinates and detected labels.

This YoloV5 based model is fit to detect people and different types of land vehicles, and displaying their density on a fitted map, according to their

Liron Bdolah 8 May 22, 2022
A Repository of Community-Driven Natural Instructions

A Repository of Community-Driven Natural Instructions TLDR; this repository maintains a community effort to create a large collection of tasks and the

AI2 244 Jan 04, 2023
Pytorch implementation of FlowNet by Dosovitskiy et al.

FlowNetPytorch Pytorch implementation of FlowNet by Dosovitskiy et al. This repository is a torch implementation of FlowNet, by Alexey Dosovitskiy et

Clément Pinard 762 Jan 02, 2023
Learning infinite-resolution image processing with GAN and RL from unpaired image datasets, using a differentiable photo editing model.

Exposure: A White-Box Photo Post-Processing Framework ACM Transactions on Graphics (presented at SIGGRAPH 2018) Yuanming Hu1,2, Hao He1,2, Chenxi Xu1,

Yuanming Hu 719 Dec 29, 2022
Perturb-and-max-product: Sampling and learning in discrete energy-based models

Perturb-and-max-product: Sampling and learning in discrete energy-based models This repo contains code for reproducing the results in the paper Pertur

Vicarious 2 Mar 14, 2022
TensorFlow Implementation of "Show, Attend and Tell"

Show, Attend and Tell Update (December 2, 2016) TensorFlow implementation of Show, Attend and Tell: Neural Image Caption Generation with Visual Attent

Yunjey Choi 902 Nov 29, 2022
Does MAML Only Work via Feature Re-use? A Data Set Centric Perspective

Does-MAML-Only-Work-via-Feature-Re-use-A-Data-Set-Centric-Perspective Does MAML Only Work via Feature Re-use? A Data Set Centric Perspective Installin

2 Nov 07, 2022
Bi-level feature alignment for versatile image translation and manipulation (Under submission of TPAMI)

Bi-level feature alignment for versatile image translation and manipulation (Under submission of TPAMI) Preparation Clone the Synchronized-BatchNorm-P

Fangneng Zhan 12 Aug 10, 2022
Artificial intelligence technology inferring issues and logically supporting facts from raw text

개요 비정형 텍스트를 학습하여 쟁점별 사실과 논리적 근거 추론이 가능한 인공지능 원천기술 Artificial intelligence techno

6 Dec 29, 2021
Learning to trade under the reinforcement learning framework

Trading Using Q-Learning In this project, I will present an adaptive learning model to trade a single stock under the reinforcement learning framework

Uirá Caiado 470 Nov 28, 2022
Codebase for "ProtoAttend: Attention-Based Prototypical Learning."

Codebase for "ProtoAttend: Attention-Based Prototypical Learning." Authors: Sercan O. Arik and Tomas Pfister Paper: Sercan O. Arik and Tomas Pfister,

47 2 May 17, 2022
Transformers provides thousands of pretrained models to perform tasks on different modalities such as text, vision, and audio.

English | 简体中文 | 繁體中文 | 한국어 State-of-the-art Machine Learning for JAX, PyTorch and TensorFlow 🤗 Transformers provides thousands of pretrained models

Clara Meister 50 Nov 12, 2022
Real-ESRGAN aims at developing Practical Algorithms for General Image Restoration.

Real-ESRGAN Colab Demo for Real-ESRGAN . Portable Windows executable file. You can find more information here. Real-ESRGAN aims at developing Practica

Xintao 17.2k Jan 02, 2023