A simple, clean TensorFlow implementation of Generative Adversarial Networks with a focus on modeling illustrations.

Overview

IllustrationGAN

Generated Images

These images were generated by the model after being trained on a custom dataset of about 20,000 anime faces that were automatically cropped from illustrations using a face detector.

Checking for Overfitting

It is theoretically possible for the generator network to memorize training set images rather than actually generalizing and learning to produce novel images of its own. To check for this, I randomly generate images and display the "closest" images in the training set according to mean squared error. The top row shows randomly generated images, and each column below it shows the 5 closest images in the training set.

Overfitting Check

It is clear that the generator does not merely learn to copy training set images, but rather generalizes and is able to produce its own unique images.
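
The nearest-neighbor check described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the repository's code: the function name `closest_training_images`, the array shapes, and the toy data are all assumptions made for the example.

```python
import numpy as np

def closest_training_images(generated, training, k=5):
    """Return indices of the k training images closest to each generated image.

    Hypothetical helper: distance is per-pixel mean squared error.
    generated: array of shape (G, H, W, C)
    training:  array of shape (N, H, W, C)
    Returns an int array of shape (G, k), nearest first.
    """
    g = generated.reshape(len(generated), -1).astype(np.float64)
    t = training.reshape(len(training), -1).astype(np.float64)
    # Pairwise MSE via broadcasting: (G, 1, D) - (1, N, D) -> (G, N)
    mse = ((g[:, None, :] - t[None, :, :]) ** 2).mean(axis=2)
    return np.argsort(mse, axis=1)[:, :k]

# Toy stand-in data: 50 random "training images", plus two "generated"
# images that are slightly noisy copies of training images 3 and 17.
rng = np.random.default_rng(0)
training = rng.random((50, 8, 8, 3))
generated = training[[3, 17]] + 0.01 * rng.standard_normal((2, 8, 8, 3))

neighbors = closest_training_images(generated, training, k=5)
print(neighbors[:, 0])  # nearest training index for each generated image
```

For a dataset of about 20,000 images the broadcasted difference would be large, so the distances would probably need to be computed in chunks over the training set.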

How it Works

Generative Adversarial Networks consist of two neural networks: a discriminator and a generator. The discriminator receives both real images from the training set and generated images produced by the generator. The discriminator outputs the probability that an image is real, so it is trained to output high values for the real images and low values for the generated ones. The generator is trained to produce images that the discriminator thinks are real. The discriminator and generator are trained simultaneously so that they compete against each other. As a result, the generator learns to produce more and more realistic images as it trains.
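
The two competing objectives can be written down as follows. This is a sketch under assumptions: it uses the standard cross-entropy GAN losses with the common "non-saturating" generator loss, which may differ from the exact losses used in this repository, and the function names are made up for the example.

```python
import numpy as np

def discriminator_loss(d_real, d_fake, eps=1e-8):
    """Cross-entropy loss for the discriminator.

    d_real: discriminator outputs (probabilities) on real images
    d_fake: discriminator outputs (probabilities) on generated images
    The loss is low when d_real is near 1 and d_fake is near 0.
    """
    return -np.mean(np.log(d_real + eps)) - np.mean(np.log(1.0 - d_fake + eps))

def generator_loss(d_fake, eps=1e-8):
    """Non-saturating generator loss (an assumption, see lead-in).

    The loss is low when the discriminator assigns high probability
    of being real to generated images.
    """
    return -np.mean(np.log(d_fake + eps))

# A discriminator that separates real from generated images well ...
confident = discriminator_loss(np.array([0.9, 0.8]), np.array([0.1, 0.2]))
# ... versus one that cannot tell them apart at all.
confused = discriminator_loss(np.array([0.5, 0.5]), np.array([0.5, 0.5]))
print(confident < confused)
```

In training, each step would lower `discriminator_loss` with respect to the discriminator's weights and `generator_loss` with respect to the generator's weights, which is what makes the two networks compete.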

Model Architecture

The model is based on DCGANs, but with a few important differences:

  1. No strided convolutions. The generator uses bilinear upsampling to upscale a feature blob by a factor of 2, followed by a stride-1 convolution layer. The discriminator uses a stride-1 convolution followed by 2x2 max pooling.

  2. Minibatch discrimination. See Improved Techniques for Training GANs for more details.

  3. More fully connected layers in both the generator and discriminator. In DCGANs, both networks have only one fully connected layer.

  4. A novel regularization term applied to the generator network. Normally, increasing the number of fully connected layers in the generator beyond one triggers one of the most common failure modes when training GANs: the generator "collapses" the z-space and produces only a very small number of unique examples. In other words, very different z vectors produce nearly the same generated image. To fix this, I add a small auxiliary z-predictor network that takes the output of the last fully connected layer in the generator as input and predicts the value of z. That is, it attempts to learn the inverse of whatever function the generator's fully connected layers learn. The z-predictor network and the generator are trained together to predict the value of z. This forces the generator's fully connected layers to learn only those transformations that preserve information about z. As a result, the aforementioned collapse no longer occurs, and the generator is able to leverage the power of the additional fully connected layers.
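
To make the z-predictor idea in item 4 concrete, here is a small NumPy sketch. Everything in it is an assumption made for illustration: a single ReLU fully connected layer stands in for the generator's fully connected stack, the z-predictor is a single linear layer, z is drawn from a standard normal distribution, and the loss is mean squared error. The repository's actual layer sizes, prior, and loss are not specified in this README.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def z_prediction_loss(z, fc_w, fc_b, pred_w, pred_b):
    """MSE between z and the z-predictor's reconstruction of it.

    fc_w, fc_b:     stand-in for the generator's fully connected layers
    pred_w, pred_b: stand-in for the auxiliary z-predictor network
    """
    h = relu(z @ fc_w + fc_b)      # output of the generator's FC stack
    z_hat = h @ pred_w + pred_b    # z-predictor's estimate of z
    return np.mean((z_hat - z) ** 2)

rng = np.random.default_rng(0)
z_dim = 4
z = rng.standard_normal((1000, z_dim))  # assumed prior over z

eye = np.eye(z_dim)
zeros_h = np.zeros(2 * z_dim)
zeros_z = np.zeros(z_dim)

# Information-preserving FC layer: h = [relu(z), relu(-z)],
# so z can be recovered exactly as relu(z) - relu(-z).
fc_w_preserving = np.hstack([eye, -eye])   # shape (4, 8)
pred_w = np.vstack([eye, -eye])            # shape (8, 4)
loss_preserving = z_prediction_loss(z, fc_w_preserving, zeros_h, pred_w, zeros_z)

# Collapsed FC layer: every z maps to the same h, so z cannot be recovered.
fc_w_collapsed = np.zeros((z_dim, 2 * z_dim))
loss_collapsed = z_prediction_loss(z, fc_w_collapsed, zeros_h, pred_w, zeros_z)

print(loss_preserving, loss_collapsed)
```

When the fully connected layer maps every z to the same activation, no predictor can do better than guessing a constant, so the loss stays near the variance of z. Adding this loss to the generator's objective therefore penalizes exactly the collapse described above.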

Training the Model

Dependencies: TensorFlow, PrettyTensor, numpy, matplotlib

The custom dataset I used is too large to add to a GitHub repository; I am currently looking for a suitable way to distribute it. Instructions for training the model will be added to this readme after I make the dataset available.
