My implementation of Image Inpainting - A deep learning Inpainting model

Overview

Image Inpainting

What is Image Inpainting

Image inpainting is a restorative process that allows for the fixing or removal of unwanted parts within images. Typically, this process is done by professionals who use software to change the image to remove the imperfection painstakingly. A deep learning approach bypasses manual labor typically used in this process and applies a neural network to determine the proper fill for the parts of the image.

Examples

To see a higher quality version, click on the images

From left to right: original, interpolated, predicted

alt text alt text

Reasearch and Development

The model architecture is created using a fully convolutional deep residual network. I had pretty good intuition that this type of model would work, as it had on my previous projects for image restoration. I looked into other architectures such as UNET for inpainting but ran into troubles while implementing them.

First, UNET requires you to splice images during inference, meaning that the image splice had to be larger than the white space that the user is trying to inpaint. For example, if the splices you set up for inference were set up to take 64x64 chunks of the image and you managed to get whitespace that fully engulfed this splice, feeding this into the model would result in improper pixels due to the model not having any reference. This would require a different architecture that would detect the size of the white space for images so that you could adequately select the image splice size.

The following architecture I looked into and tried implementing was a GAN (Generative Adversarial Network) based model. I've experimented with GANs and implemented a model that could generate faces using images from the CelebA dataset; however, using GANs for Inpainting proved a much more complex problem. There are issues that I faced with proper ratios of the loss functions being L1 loss and the adversarial loss of the discriminator. Although a GAN-based model would likely drastically improve the output during inference, I could not tune the hyper-parameters enough to balance both the loss functions and the training of the generator and discriminator.

I resolved to use the current architecture described due to its simplicity and relatively adequate results.

Model Architecture

Methods Depth Filters Parameters Training Time
Inpaint Model 50 (49 layers) 192-3 15,945k ~30hrs

Network Architecture:

How do you use this model?

Due to the sheer size of this model, I can't fully upload it onto GitHub. Instead, I have opted to upload it via Google Drive, where you should be able to download it. Place this download '.h5' file and place it inside the 'weights/' directory.

How can you train your own model?

The model is instantiated within network.py. You can play around with hyper-parameters there. First, to train the model, delete the images currently within data/ put your training image data within that file - any large dataset such as ImageNet or an equivalent should work. Finally, mess with hyper-parameters in train.py and run train.py. If you’re training on weaker hardware, I’d recommend lowering the batch_size below the currently set 4 images.

Qualitative Examples (click on the images for higher quality):

Set 5 Evaluation Set:

Images Left to Right: Original, Interpolated, Predicted alt text alt text alt text alt text

Hardware - Training Statistics

Trained on 3070 ti
Batch Size: 4
Training Image Size: 96x96

Author

Joshua Evans - github/JoshVEvans
Owner
Joshua V Evans
Computer Systems Engineering | Arizona State University '25 | Interested in creating intelligent machines
Joshua V Evans
The official implementation for ACL 2021 "Challenges in Information Seeking QA: Unanswerable Questions and Paragraph Retrieval".

Code for "Challenges in Information Seeking QA: Unanswerable Questions and Paragraph Retrieval" (ACL 2021, Long) This is the repository for baseline m

Akari Asai 25 Oct 30, 2022
Few-shot Relation Extraction via Bayesian Meta-learning on Relation Graphs

Few-shot Relation Extraction via Bayesian Meta-learning on Relation Graphs This is an implemetation of the paper Few-shot Relation Extraction via Baye

MilaGraph 36 Nov 22, 2022
TLDR: Twin Learning for Dimensionality Reduction

TLDR (Twin Learning for Dimensionality Reduction) is an unsupervised dimensionality reduction method that combines neighborhood embedding learning with the simplicity and effectiveness of recent self

NAVER 105 Dec 28, 2022
Code and data for ImageCoDe, a contextual vison-and-language benchmark

ImageCoDe This repository contains code and data for ImageCoDe: Image Retrieval from Contextual Descriptions. Data All collected descriptions for the

McGill NLP 27 Dec 02, 2022
Official PyTorch implementation of the paper "TEMOS: Generating diverse human motions from textual descriptions"

TEMOS: TExt to MOtionS Generating diverse human motions from textual descriptions Description Official PyTorch implementation of the paper "TEMOS: Gen

Mathis Petrovich 187 Dec 27, 2022
Code for "Intra-hour Photovoltaic Generation Forecasting based on Multi-source Data and Deep Learning Methods."

pv_predict_unet-lstm Code for "Intra-hour Photovoltaic Generation Forecasting based on Multi-source Data and Deep Learning Methods." IEEE Transactions

FolkScientistInDL 8 Oct 08, 2022
[ICCV'21] NEAT: Neural Attention Fields for End-to-End Autonomous Driving

NEAT: Neural Attention Fields for End-to-End Autonomous Driving Paper | Supplementary | Video | Poster | Blog This repository is for the ICCV 2021 pap

254 Jan 02, 2023
BitPack is a practical tool to efficiently save ultra-low precision/mixed-precision quantized models.

BitPack is a practical tool that can efficiently save quantized neural network models with mixed bitwidth.

Zhen Dong 36 Dec 02, 2022
Face Depixelizer based on "PULSE: Self-Supervised Photo Upsampling via Latent Space Exploration of Generative Models" repository.

NOTE We have noticed a lot of concern that PULSE will be used to identify individuals whose faces have been blurred out. We want to emphasize that thi

Denis Malimonov 2k Dec 29, 2022
Near-Duplicate Video Retrieval with Deep Metric Learning

Near-Duplicate Video Retrieval with Deep Metric Learning This repository contains the Tensorflow implementation of the paper Near-Duplicate Video Retr

2 Jan 24, 2022
LogAvgExp - Pytorch Implementation of LogAvgExp

LogAvgExp - Pytorch Implementation of LogAvgExp for Pytorch Install $ pip instal

Phil Wang 31 Oct 14, 2022
The Easy-to-use Dialogue Response Selection Toolkit for Researchers

Easy-to-use toolkit for retrieval-based Chatbot Recent Activity Our released RRS corpus can be found here. Our released BERT-FP post-training checkpoi

GMFTBY 32 Nov 13, 2022
Conjugated Discrete Distributions for Distributional Reinforcement Learning (C2D)

Conjugated Discrete Distributions for Distributional Reinforcement Learning (C2D) Code & Data Appendix for Conjugated Discrete Distributions for Distr

1 Jan 11, 2022
The official implementation of Theme Transformer

Theme Transformer This is the official implementation of Theme Transformer. Checkout our demo and paper : Demo | arXiv Environment: using python versi

Ian Shih 85 Dec 08, 2022
The Unreasonable Effectiveness of Random Pruning: Return of the Most Naive Baseline for Sparse Training

[ICLR 2022] The Unreasonable Effectiveness of Random Pruning: Return of the Most Naive Baseline for Sparse Training The Unreasonable Effectiveness of

VITA 44 Dec 23, 2022
Skyformer: Remodel Self-Attention with Gaussian Kernel and Nystr\"om Method (NeurIPS 2021)

Skyformer This repository is the official implementation of Skyformer: Remodel Self-Attention with Gaussian Kernel and Nystr"om Method (NeurIPS 2021).

Qi Zeng 46 Sep 20, 2022
CS50x-AI - Artificial Intelligence with Python from Harvard University

CS50x-AI Artificial Intelligence with Python from Harvard University 📖 Table of

Hosein Damavandi 6 Aug 22, 2022
Devkit for 3D -- Some utils for 3D object detection based on Numpy and Pytorch

D3D Devkit for 3D: Some utils for 3D object detection and tracking based on Numpy and Pytorch Please consider siting my work if you find this library

Jacob Zhong 27 Jul 07, 2022
Qimera: Data-free Quantization with Synthetic Boundary Supporting Samples

Qimera: Data-free Quantization with Synthetic Boundary Supporting Samples This repository is the official implementation of paper [Qimera: Data-free Q

Kanghyun Choi 21 Nov 03, 2022
A pytorch reproduction of { Co-occurrence Feature Learning from Skeleton Data for Action Recognition and Detection with Hierarchical Aggregation }.

A PyTorch Reproduction of HCN Co-occurrence Feature Learning from Skeleton Data for Action Recognition and Detection with Hierarchical Aggregation. Ch

Guyue Hu 210 Dec 31, 2022