Implements the training, testing and editing tools for "Pluralistic Image Completion"

Overview

Pluralistic Image Completion

ArXiv | Project Page | Online Demo | Video(demo)

This repository implements the training, testing and editing tools for "Pluralistic Image Completion" by Chuanxia Zheng, Tat-Jen Cham and Jianfei Cai at NTU. Given one masked image, the proposed Pluralistic model is able to generate multiple and diverse plausible results with various structure, color and texture.

Editing example

Example results

Example completion results of our method on images of face (CelebA), building (Paris), and natural scenes (Places2) with center masks (masks shown in gray). For each group, the masked input image is shown left, followed by sampled results from our model without any post-processing. The results are diverse and plusible.

More results on project page

Getting started

Installation

This code was tested with Pytoch 0.4.0, CUDA 9.0, Python 3.6 and Ubuntu 16.04

pip install visdom dominate
  • Clone this repo:
git clone https://github.com/lyndonzheng/Pluralistic
cd Pluralistic

Datasets

  • face dataset: 24183 training images and 2824 test images from CelebA and use the algorithm of Growing GANs to get the high-resolution CelebA-HQ dataset
  • building dataset: 14900 training images and 100 test images from Paris
  • natural scenery: original training and val images from Places2
  • object original training images from ImageNet.

Training

  • Train a model (default: random irregular and irregular holes):
python train.py --name celeba_random --img_file your_image_path
  • Set --mask_type in options/base_options.py for different training masks. --mask_file path is needed for external irregular mask, such as the irregular mask dataset provided by Liu et al. and Karim lskakov .
  • To view training results and loss plots, run python -m visdom.server and copy the URL http://localhost:8097.
  • Training models will be saved under the checkpoints folder.
  • The more training options can be found in options folder.

Testing

  • Test the model
python test.py  --name celeba_random --img_file your_image_path
  • Set --mask_type in options/base_options.py to test various masks. --mask_file path is needed for 3. external irregular mask,
  • The default results will be saved under the results folder. Set --results_dir for a new path to save the result.

Pretrained Models

Download the pre-trained models using the following links and put them undercheckpoints/ directory.

Our main novelty of this project is the multiple and diverse plausible results for one given masked image. The center_mask models are trained with images of resolution 256*256 with center holes 128x128, which have large diversity for the large missing information. The random_mask models are trained with random regular and irregular holes, which have different diversity for different mask sizes and image backgrounds.

GUI

Download the pre-trained models from Google drive and put them undercheckpoints/ directory.

  • Install the PyQt5 for GUI operation
pip install PyQt5

Basic usage is:

python -m visdom.server
python ui_main.py

The buttons in GUI:

  • Options: Select the model and corresponding dataset for editing.
  • Bush Width: Modify the width of bush for free_form mask.
  • draw/clear: Draw a free_form or rectangle mask for random_model. Clear all mask region for a new input.
  • load: Choose the image from the directory.
  • random: Random load the editing image from the datasets.
  • fill: Fill the holes ranges and show it on the right.
  • save: Save the inputs and outputs to the directory.
  • Original/Output: Switch to show the original or output image.

The steps are as follows:

1. Select a model from 'options'
2. Click the 'random' or 'load' button to get an input image.
3. If you choose a random model, click the 'draw/clear' button to input free_form mask.
4. If you choose a center model, the center mask has been given.
5. click 'fill' button to get multiple results.
6. click 'save' button to save the results.

Editing Example Results

  • Results (original, input, output) for object removing
  • Results (input, output) for face playing. When mask half or right face, the diversity will be small for the short+long term attention layer will copy information from other side. When mask top or down face, the diversity will be large.

Next

  • Free form mask for various Datasets
  • Higher resolution image completion

License


This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

This software is for educational and academic research purpose only. If you wish to obtain a commercial royalty bearing license to this software, please contact us at [email protected].

Citation

If you use this code for your research, please cite our paper.

@inproceedings{zheng2019pluralistic,
  title={Pluralistic Image Completion},
  author={Zheng, Chuanxia and Cham, Tat-Jen and Cai, Jianfei},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  pages={1438--1447},
  year={2019}
}

@article{zheng2021pluralistic,
  title={Pluralistic Free-From Image Completion},
  author={Zheng, Chuanxia and Cham, Tat-Jen and Cai, Jianfei},
  journal={International Journal of Computer Vision},
  pages={1--20},
  year={2021},
  publisher={Springer}
}
Owner
Chuanxia Zheng
Chuanxia Zheng
PyTorch implementation of "A Simple Baseline for Low-Budget Active Learning".

A Simple Baseline for Low-Budget Active Learning This repository is the implementation of A Simple Baseline for Low-Budget Active Learning. In this pa

10 Nov 14, 2022
Model search is a framework that implements AutoML algorithms for model architecture search at scale

Model search (MS) is a framework that implements AutoML algorithms for model architecture search at scale. It aims to help researchers speed up their exploration process for finding the right model a

Google 3.2k Dec 31, 2022
Code release of paper "Deep Multi-View Stereo gone wild"

Deep MVS gone wild Pytorch implementation of "Deep MVS gone wild" (Paper | website) This repository provides the code to reproduce the experiments of

François Darmon 53 Dec 24, 2022
CLIP: Connecting Text and Image (Learning Transferable Visual Models From Natural Language Supervision)

CLIP (Contrastive Language–Image Pre-training) Experiments (Evaluation) Model Dataset Acc (%) ViT-B/32 (Paper) CIFAR100 65.1 ViT-B/32 (Our) CIFAR100 6

Myeongjun Kim 52 Jan 07, 2023
GeDML is an easy-to-use generalized deep metric learning library

GeDML is an easy-to-use generalized deep metric learning library

Borui Zhang 32 Dec 05, 2022
Sub-tomogram-Detection - Deep learning based model for Cyro ET Sub-tomogram-Detection

Deep learning based model for Cyro ET Sub-tomogram-Detection High degree of stru

Siddhant Kumar 2 Feb 04, 2022
paper list in the area of reinforcenment learning for recommendation systems

paper list in the area of reinforcenment learning for recommendation systems

HenryZhao 23 Jun 09, 2022
OCR Post Correction for Endangered Language Texts

📌 Coming soon: an update to the software including features from our paper on semi-supervised OCR post-correction, to be published in the Transaction

Shruti Rijhwani 96 Dec 31, 2022
Boostcamp CV Serving For Python

Boostcamp-CV-Serving Prerequisites MySQL GCP Cloud Storage GCP key file Sentry Streamlit Cloud Secrets: .streamlit/secrets.toml #DO NOT SHARE THIS I

Jungwon Seo 19 Feb 22, 2022
DeepLab2: A TensorFlow Library for Deep Labeling

DeepLab2 is a TensorFlow library for deep labeling, aiming to provide a unified and state-of-the-art TensorFlow codebase for dense pixel labeling tasks.

Google Research 845 Jan 04, 2023
This is a simple framework to make object detection dataset very quickly

FastAnnotation Table of contents General info Requirements Setup General info This is a simple framework to make object detection dataset very quickly

Serena Tetart 1 Jan 24, 2022
Catbird is an open source paraphrase generation toolkit based on PyTorch.

Catbird is an open source paraphrase generation toolkit based on PyTorch. Quick Start Requirements and Installation The project is based on PyTorch 1.

Afonso Salgado de Sousa 5 Dec 15, 2022
Image processing in Python

scikit-image: Image processing in Python Website (including documentation): https://scikit-image.org/ Mailing list: https://mail.python.org/mailman3/l

Image Processing Toolbox for SciPy 5.2k Dec 31, 2022
Video-Captioning - A machine Learning project to generate captions for video frames indicating the relationship between the objects in the video

Video-Captioning - A machine Learning project to generate captions for video frames indicating the relationship between the objects in the video

1 Jan 23, 2022
Official code of the paper "ReDet: A Rotation-equivariant Detector for Aerial Object Detection" (CVPR 2021)

ReDet: A Rotation-equivariant Detector for Aerial Object Detection ReDet: A Rotation-equivariant Detector for Aerial Object Detection (CVPR2021), Jiam

csuhan 334 Dec 23, 2022
RaftMLP: How Much Can Be Done Without Attention and with Less Spatial Locality?

RaftMLP RaftMLP: How Much Can Be Done Without Attention and with Less Spatial Locality? By Yuki Tatsunami and Masato Taki (Rikkyo University) [arxiv]

Okojo 20 Aug 31, 2022
A toy compiler that can convert Python scripts to pickle bytecode 🥒

Pickora 🐰 A small compiler that can convert Python scripts to pickle bytecode. Requirements Python 3.8+ No third-party modules are required. Usage us

ꌗᖘ꒒ꀤ꓄꒒ꀤꈤꍟ 68 Jan 04, 2023
Simple ONNX operation generator. Simple Operation Generator for ONNX.

sog4onnx Simple ONNX operation generator. Simple Operation Generator for ONNX. https://github.com/PINTO0309/simple-onnx-processing-tools Key concept V

Katsuya Hyodo 6 May 15, 2022
Predicting the duration of arrival delays for commercial flights.

Flight Delay Prediction Our objective is to predict arrival delays of commercial flights. According to the US Department of Transportation, about 21%

Jordan Silke 1 Jan 11, 2022
Code for the ICCV 2021 Workshop paper: A Unified Efficient Pyramid Transformer for Semantic Segmentation.

Unified-EPT Code for the ICCV 2021 Workshop paper: A Unified Efficient Pyramid Transformer for Semantic Segmentation. Installation Linux, CUDA=10.0,

29 Aug 23, 2022