Variational autoencoder for anime face reconstruction

Overview

VAE animeface

Variational autoencoder for anime face reconstruction

Introduction

This repository is an exploratory example to train a variational autoencoder to extract meaningful feature representations of anime girl face images.

The code architecture is mostly borrowed and modified from Yann Dubois's disentangling-vae repository. It has nice summarization and comparison of the different VAE model proposed recently.

Dataset

Anime Face Dataset contains 63,632 anime faces. (all rescaled to 64x64 in training)

https://raw.githubusercontent.com/Mckinsey666/Anime-Face-Dataset/master/test.jpg

Model

The model used is the one proposed in the paper Understanding disentangling in β-VAE, which is summarized below:

https://github.com/YannDubs/disentangling-vae/raw/master/doc/imgs/architecture.png

I used laplace as the target distribution to calculate the reconstruction loss. From Yann's code, it suggests that bernoulli would generally a better choice, but it looks it converge slowly in my case. (I didn't do a fair comparison to be conclusive)

Loss function used is β-VAEH from β-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework.

Result

Latent feature number is set to 20 (10 gaussian mean, 10 log gaussian variance). VAE model is trained for 100 epochs. All data is used for training, no validation and testing applied.

Face reconstruction

results/laplace_betaH_loss/test1_recons.png

results/laplace_betaH_loss/test2_recons.png

results/laplace_betaH_loss/test3_recons.png

Prior space traversal

Based on the face reconstruction result while traversing across the latent space, we may speculate the generative property of each latent as following:

  1. Hair shade
  2. Hair length
  3. Face orientation
  4. Hair color
  5. Face rotation
  6. Bangs, face color
  7. Hair glossiness
  8. Unclear
  9. Eye size & color
  10. Bangs

results/laplace_betaH_loss/test_prior_traversals.png

Original faces clustering

Original anime faces are clustered based on latent features (selected feature is either below 1% (left 5) or above 99% (right 5) among all data points, while the rest latent features are closeto each other). Visulization of the original images mostly confirms the speculation above.

results/laplace_betaH_loss/test_original_traversals.png

Latent feature diagnosis

Learned latent features are all close to standard normal distribution, and show minimum correlation.

results/laplace_betaH_loss/latent_diagnosis.png

Owner
Minzhe Zhang
Graduate student in UT Southwestern Medical Center. Bioinformatician. Computational biologist.
Minzhe Zhang
Dados coletados e programas desenvolvidos no processo de iniciação científica

Iniciacao_cientifica_FAPESP_2020-14845-6 Dados coletados e programas desenvolvidos no processo de iniciação científica Os arquivos .py são os programa

1 Jan 10, 2022
Code for "FPS-Net: A convolutional fusion network for large-scale LiDAR point cloud segmentation".

FPS-Net Code for "FPS-Net: A convolutional fusion network for large-scale LiDAR point cloud segmentation", accepted by ISPRS journal of Photogrammetry

15 Nov 30, 2022
Expressive Body Capture: 3D Hands, Face, and Body from a Single Image

Expressive Body Capture: 3D Hands, Face, and Body from a Single Image [Project Page] [Paper] [Supp. Mat.] Table of Contents License Description Fittin

Vassilis Choutas 1.3k Jan 07, 2023
Implements an infinite sum of poisson-weighted convolutions

An infinite sum of Poisson-weighted convolutions Kyle Cranmer, Aug 2018 If viewing on GitHub, this looks better with nbviewer: click here Consider a v

Kyle Cranmer 26 Dec 07, 2022
JudeasRx - graphical app for doing personalized causal medicine using the methods invented by Judea Pearl et al.

JudeasRX Instructions Read the references given in the Theory and Notation section below Fire up the Jupyter Notebook judeas-rx.ipynb The notebook dra

Robert R. Tucci 19 Nov 07, 2022
Deep Learning Emotion decoding using EEG data from Autism individuals

Deep Learning Emotion decoding using EEG data from Autism individuals This repository includes the python and matlab codes using for processing EEG 2D

Juan Manuel Mayor Torres 12 Dec 08, 2022
Implementation of ConvMixer for "Patches Are All You Need? 🤷"

Patches Are All You Need? 🤷 This repository contains an implementation of ConvMixer for the ICLR 2022 submission "Patches Are All You Need?" by Asher

CMU Locus Lab 934 Jan 08, 2023
Implementation of Fast Transformer in Pytorch

Fast Transformer - Pytorch Implementation of Fast Transformer in Pytorch. This only work as an encoder. Yannic video AI Epiphany Install $ pip install

Phil Wang 167 Dec 27, 2022
This repository is dedicated to developing and maintaining code for experiments with wide neural networks.

Wide-Networks This repository contains the code of various experiments on wide neural networks. In particular, we implement classes for abc-parameteri

Karl Hajjar 0 Nov 02, 2021
[UNMAINTAINED] Automated machine learning for analytics & production

auto_ml Automated machine learning for production and analytics Installation pip install auto_ml Getting started from auto_ml import Predictor from au

Preston Parry 1.6k Jan 02, 2023
(CVPR2021) ClassSR: A General Framework to Accelerate Super-Resolution Networks by Data Characteristic

ClassSR (CVPR2021) ClassSR: A General Framework to Accelerate Super-Resolution Networks by Data Characteristic Paper Authors: Xiangtao Kong, Hengyuan

Xiangtao Kong 308 Jan 05, 2023
Swapping face using Face Mesh with TensorFlow Lite

Swapping face using Face Mesh with TensorFlow Lite

iwatake 17 Apr 26, 2022
(CVPR2021) DANNet: A One-Stage Domain Adaptation Network for Unsupervised Nighttime Semantic Segmentation

DANNet: A One-Stage Domain Adaptation Network for Unsupervised Nighttime Semantic Segmentation CVPR2021(oral) [arxiv] Requirements python3.7 pytorch==

W-zx-Y 85 Dec 07, 2022
🔅 Shapash makes Machine Learning models transparent and understandable by everyone

🎉 What's new ? Version New Feature Description Tutorial 1.6.x Explainability Quality Metrics To help increase confidence in explainability methods, y

MAIF 2.1k Dec 27, 2022
Repo for our ICML21 paper Unsupervised Learning of Visual 3D Keypoints for Control

Unsupervised Learning of Visual 3D Keypoints for Control [Project Website] [Paper] Boyuan Chen1, Pieter Abbeel1, Deepak Pathak2 1UC Berkeley 2Carnegie

Boyuan Chen 34 Jul 22, 2022
RodoSol-ALPR Dataset

RodoSol-ALPR Dataset This dataset, called RodoSol-ALPR dataset, contains 20,000 images captured by static cameras located at pay tolls owned by the Ro

Rayson Laroca 45 Dec 15, 2022
eXPeditious Data Transfer

xpdt: eXPeditious Data Transfer About xpdt is (yet another) language for defining data-types and generating code for serializing and deserializing the

Gianni Tedesco 3 Jan 06, 2022
Supervised domain-agnostic prediction framework for probabilistic modelling

A supervised domain-agnostic framework that allows for probabilistic modelling, namely the prediction of probability distributions for individual data

The Alan Turing Institute 112 Oct 23, 2022
PyTorch implementation of Neural Dual Contouring.

NDC PyTorch implementation of Neural Dual Contouring. Citation We are still writing the paper while adding more improvements and applications. If you

Zhiqin Chen 140 Dec 26, 2022
Adversarial Texture Optimization from RGB-D Scans (CVPR 2020).

AdversarialTexture Adversarial Texture Optimization from RGB-D Scans (CVPR 2020). Scanning Data Download Please refer to data directory for details. B

Jingwei Huang 153 Nov 28, 2022