Face2webtoon - Despite its importance, there are few previous works applying I2I translation to webtoon.

Overview

Face2webtoon

merge_from_ofoct (2)

merge_from_ofoct (1)

Introduction

Despite its importance, there are few previous works applying I2I translation to webtoon. I collected dataset from naver webtoon 연애혁명 and tried to transfer human faces to webtoon domain.

Webtoon Dataset

data

I used anime face detector. Since face detector is not that good at detecting the faces from webtoon, I could gather only 1400 webtoon face images.

Baseline 0(U-GAT-IT)

I used U-GAT-IT official pytorch implementation. U-GAT-IT is GAN for unpaired image to image translation. By using CAM attention module and adaptive layer instance normalization, it performed well on image translation where considerable shape deformation is required, on various hyperparameter settings. Since shape is very different between two domain, I used this model.

For face data, i used AFAD-Lite dataset from https://github.com/afad-dataset/tarball-lite.

good

gif1

Some results look pretty nice, but many result have lost attributes while transfering.

Missing of Attributes

Gender

gender

Gender information was lost.

Glasses

glasses

A model failed to generate glasses in the webtoon faces.

Result Analysis

To analysis the result, I seperated webtoon dataset to 5 different groups.

group number group name number of data
0 woman_no_glasses 1050
1 man_no_glasses 249
2 man_glasses 17->49
3 woman_glasses 15->38

Even after I collected more data for group 2 and 3, there are severe imbalances between groups. As a result, model failed to translate to few shot groups, for example, group 2 and 3.

U-GAT-IT + Few Shot Transfer

Few shot transfer : https://arxiv.org/abs/2007.13332

Paper review : https://yun905.tistory.com/48

In this paper, authors successfully transfered the knowledge from group with enough data to few shot groups which have only 10~15 data. First, they trained basic model, and made branches for few shot groups.

Basic model

For basic model, I trained U-GAT-IT between only group 0.

basic_model1 basic_model2

Baseline 1 (simple fine-tuning)

For baseline 1, I freeze the bottleneck layers of generator and tried to fine-tune the basic model. I used 38 images(both real/fake) of group 1,2,3, and added 8 images of group 0 to prevent forgetting. I trained for 200k iterations.

1

Model randomly mapped between groups.

Baseline 2 (group classification loss + selective backprop)

0

I attached additional group classifier to discriminator and added group classification loss according to original paper. Images of group 0,1,2,3 were feeded sequentially, and bottleneck layers of generator were updated for group 0 only.

With limited data, bias of FID score is too big. Instead, I used KID

KID*1000
25.95

U-GAT-IT + group classification loss + adaptive discriminator augmentation

ADA is very useful data augmentation method for training GAN with limited data. Although original paper only handles unconditional GANs, I applied ADA to U-GAT-IT which is conditional GAN. Augmentation was applied to both discriminators, because it is expected that preventing the discriminator of the face domain from overfitting would improve the performance of the face generator and therefore the cycle consistency loss would be more meaningful. Only pixel blitting and geometric transformation have been implemented, as the effects of other augmentation methods are minimal according to paper. The rest will be implemented later.

To achieve better result, I changed face dataset to more diverse one(CelebA).

merge_from_ofoct (2)

merge_from_ofoct (1)

image

ADA makes training longer. It took 8 days with single 2070 SUPER, but did not converged completely.

KID*1000
12.14

Start training

python main.py --dataset dataset_name --useADA True --group 0,1,2,3 --use_grouploss True --neptune False

If --neptune is True, the experiment is transmitted to neptune ai, which is experiment management tool. You must set your API token. --group 0,1,3 make group 2 out of training.

Owner
이상윤
이상윤
This is a GUI interface which can process forest fire detection, smoke detection and fire segmentation

This is a GUI interface which can process forest fire detection, smoke detection and fire segmentation. Yolov5 is used to detect fire and smoke and unet is used to segment fire.

7 Jan 08, 2023
RATCHET is a Medical Transformer for Chest X-ray Diagnosis and Reporting

RATCHET: RAdiological Text Captioning for Human Examined Thoraxes RATCHET is a Medical Transformer for Chest X-ray Diagnosis and Reporting. Based on t

26 Nov 14, 2022
Unofficial Implementation of MLP-Mixer in TensorFlow

mlp-mixer-tf Unofficial Implementation of MLP-Mixer [abs, pdf] in TensorFlow. Note: This project may have some bugs in it. I'm still learning how to i

Rishabh Anand 24 Mar 23, 2022
A PyTorch-centric hybrid classical-quantum machine learning framework

torchquantum A PyTorch-centric hybrid classical-quantum dynamic neural networks framework. News Add a simple example script using quantum gates to do

MIT HAN Lab 400 Jan 02, 2023
Text and code for the forthcoming second edition of Think Bayes, by Allen Downey.

Think Bayes 2 by Allen B. Downey The HTML version of this book is here. Think Bayes is an introduction to Bayesian statistics using computational meth

Allen Downey 1.5k Jan 08, 2023
pixelNeRF: Neural Radiance Fields from One or Few Images

pixelNeRF: Neural Radiance Fields from One or Few Images Alex Yu, Vickie Ye, Matthew Tancik, Angjoo Kanazawa UC Berkeley arXiv: http://arxiv.org/abs/2

Alex Yu 1k Jan 04, 2023
Structured Data Gradient Pruning (SDGP)

Structured Data Gradient Pruning (SDGP) Weight pruning is a technique to make Deep Neural Network (DNN) inference more computationally efficient by re

Bradley McDanel 10 Nov 11, 2022
Facilitating Database Tuning with Hyper-ParameterOptimization: A Comprehensive Experimental Evaluation

A Comprehensive Experimental Evaluation for Database Configuration Tuning This is the source code to the paper "Facilitating Database Tuning with Hype

DAIR Lab 9 Oct 29, 2022
PyTorch implementation of MICCAI 2018 paper "Liver Lesion Detection from Weakly-labeled Multi-phase CT Volumes with a Grouped Single Shot MultiBox Detector"

Grouped SSD (GSSD) for liver lesion detection from multi-phase CT Note: the MICCAI 2018 paper only covers the multi-phase lesion detection part of thi

Sang-gil Lee 36 Oct 12, 2022
Code release for "Making a Bird AI Expert Work for You and Me".

Making-a-Bird-AI-Expert-Work-for-You-and-Me Code release for "Making a Bird AI Expert Work for You and Me". arxiv (Coming soon...) Changelog 2021/12/6

PRIS-CV: Computer Vision Group 11 Dec 11, 2022
A high-performance distributed deep learning system targeting large-scale and automated distributed training.

HETU Documentation | Examples Hetu is a high-performance distributed deep learning system targeting trillions of parameters DL model training, develop

DAIR Lab 150 Dec 21, 2022
Api's bulid in Flask perfom to manage Todo Task.

Citymall-task Api's bulid in Flask perfom to manage Todo Task. Installation Requrements : Python: 3.10.0 MongoDB create .env file with variables DB_UR

Aisha Tayyaba 1 Dec 17, 2021
The repository for our EMNLP 2021 paper "Finnish Dialect Identification: The Effect of Audio and Text"

Finnish Dialect Identification The repository for our EMNLP 2021 paper "Finnish Dialect Identification: The Effect of Audio and Text". We present a te

Rootroo Ltd 2 Dec 25, 2021
Pytorch code for "State-only Imitation with Transition Dynamics Mismatch" (ICLR 2020)

This repo contains code for our paper State-only Imitation with Transition Dynamics Mismatch published at ICLR 2020. The code heavily uses the RL mach

20 Sep 08, 2022
A PyTorch implementation of "Graph Classification Using Structural Attention" (KDD 2018).

GAM ⠀⠀ A PyTorch implementation of Graph Classification Using Structural Attention (KDD 2018). Abstract Graph classification is a problem with practic

Benedek Rozemberczki 259 Dec 05, 2022
Data loaders and abstractions for text and NLP

torchtext This repository consists of: torchtext.datasets: The raw text iterators for common NLP datasets torchtext.data: Some basic NLP building bloc

3.2k Jan 08, 2023
BasicVSR: The Search for Essential Components in Video Super-Resolution and Beyond

BasicVSR BasicVSR: The Search for Essential Components in Video Super-Resolution and Beyond Ported from https://github.com/xinntao/BasicSR Dependencie

Holy Wu 8 Jun 07, 2022
⚾🤖⚾ Automatic baseball pitching overlay in realtime

⚾ Automatically overlaying pitch motion and trajectory with machine learning! This project takes your baseball pitching clips and automatically genera

Tony Chou 240 Dec 05, 2022
AIR^2 for Interaction Prediction

This is the repository for AIR^2 for Interaction Prediction. Explanation of the solution: Video: link License AIR is released under the Apache 2.0 lic

21 Sep 27, 2022
Experiments with Fourier layers on simulation data.

Factorized Fourier Neural Operators This repository contains the code to reproduce the results in our NeurIPS 2021 ML4PS workshop paper, Factorized Fo

Alasdair Tran 57 Dec 25, 2022