T2F: text to face generation using Deep Learning

Overview

[NEW]

T2F - 2.0 Teaser (coming soon ...)

2.0 Teaser

Please note that all the faces in the samples above are generated. T2F 2.0 will use MSG-GAN instead of ProGAN for the image generation module. Please refer to the MSG-GAN project for more information. This update to the repository is coming soon 👍.

T2F

Text-to-Face generation using Deep Learning. This project combines two recent architectures, StackGAN and ProGAN, to synthesize faces from textual descriptions.
The project uses the Face2Text dataset, which contains 400 facial images with textual captions for each of them. The data can be obtained by contacting either the RIVAL group or the authors of the Face2Text paper.

Some Examples:

Examples

Architecture:

Architecture Diagram

An LSTM network encodes the textual description into a summary vector. This summary vector, shown as the Embedding (psy_t) in the diagram, passes through the Conditioning Augmentation block (a single linear layer) to produce the textual part of the GAN's input latent vector, using a VAE-like reparameterization technique. The second part of the latent vector is random Gaussian noise. The resulting latent vector is fed to the generator, while the embedding is fed to the final layer of the discriminator for conditional distribution matching. GAN training proceeds exactly as described in the ProGAN paper, i.e. layer by layer at increasing spatial resolutions. Each new layer is introduced with the fade-in technique to avoid destroying what the earlier layers have learned.
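
The latent-vector construction described above can be sketched without any framework. This is a minimal illustration, not the repository's code: the function names (`linear`, `conditioning_augmentation`, `build_latent`), the split of the linear layer's output into a mean half and a log-standard-deviation half, and the random weights are all assumptions made for the example. The sizes are taken from the sample configuration in this README (hidden_size 256, ca_out_size 178, latent_size 256).

```python
import math
import random


def linear(x, weight, bias):
    """Compute y = W x + b for one input vector (plain lists of floats)."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weight, bias)]


def conditioning_augmentation(embedding, weight, bias, rng):
    """Sample c = mu + sigma * eps, with eps ~ N(0, 1) (reparameterization).

    Assumption: the single linear layer emits 2 * ca_out_size values,
    the first half read as mu and the second half as log(sigma).
    """
    out = linear(embedding, weight, bias)
    half = len(out) // 2
    mu, log_sigma = out[:half], out[half:]
    return [m + math.exp(s) * rng.gauss(0.0, 1.0)
            for m, s in zip(mu, log_sigma)]


def build_latent(embedding, weight, bias, latent_size, rng):
    """Concatenate the textual part with Gaussian noise up to latent_size."""
    c = conditioning_augmentation(embedding, weight, bias, rng)
    if len(c) > latent_size:
        raise ValueError("textual part is larger than the latent vector")
    noise = [rng.gauss(0.0, 1.0) for _ in range(latent_size - len(c))]
    return c + noise


# Illustrative usage with sizes from the sample configuration.
rng = random.Random(0)
hidden_size, ca_out_size, latent_size = 256, 178, 256
# Random stand-ins for trained parameters (hypothetical values).
weight = [[rng.uniform(-0.01, 0.01) for _ in range(hidden_size)]
          for _ in range(2 * ca_out_size)]
bias = [0.0] * (2 * ca_out_size)
embedding = [rng.gauss(0.0, 1.0) for _ in range(hidden_size)]  # stand-in for the LSTM summary

latent = build_latent(embedding, weight, bias, latent_size, rng)
print(len(latent))  # 178 textual values + 78 noise values = 256
```

With these sizes, 178 of the 256 latent dimensions carry the text condition and the remaining 78 are noise. In a PyTorch implementation the same steps would operate on batched tensors rather than lists.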

Running the code:

The code lives in the implementation/ subdirectory and is written with the PyTorch framework. Please install PyTorch version 0.4.0 before running it.

Code organization:
configs: contains the configuration files for training the network. (You can use any one of them, or create your own.)
data_processing: package containing the data processing and loading modules
networks: package containing the network implementation
processed_annotations: directory storing the output of the process_text_annotations.py script
process_text_annotations.py: processes the captions and stores the output in the processed_annotations/ directory. (There is no need to run this script; the pickle file is included in the repo.)
train_network.py: script for training the network

Sample configuration:

# All paths to different required data objects
images_dir: "../data/LFW/lfw"
processed_text_file: "processed_annotations/processed_text.pkl"
log_dir: "training_runs/11/losses/"
sample_dir: "training_runs/11/generated_samples/"
save_dir: "training_runs/11/saved_models/"

# Hyperparameters for the Model
captions_length: 100
img_dims:
  - 64
  - 64

# LSTM hyperparameters
embedding_size: 128
hidden_size: 256
num_layers: 3  # number of LSTM cells in the encoder network

# Conditioning Augmentation hyperparameters
ca_out_size: 178

# Pro GAN hyperparameters
depth: 5
latent_size: 256
learning_rate: 0.001
beta_1: 0
beta_2: 0
eps: 0.00000001
drift: 0.001
n_critic: 1

# Training hyperparameters:
epochs:
  - 160
  - 80
  - 40
  - 20
  - 10

# % of epochs for fading in the new layer
fade_in_percentage:
  - 85
  - 85
  - 85
  - 85
  - 85

batch_sizes:
  - 16
  - 16
  - 16
  - 16
  - 16

num_workers: 3
feedback_factor: 7  # number of logs generated per epoch
checkpoint_factor: 2  # save the models after these many epochs
use_matching_aware_discriminator: True  # use the matching aware discriminator
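
To make the per-depth lists in the configuration concrete, the following sketch derives a fade-in schedule from `epochs`, `fade_in_percentage`, and `batch_sizes`. It is an illustration under assumptions rather than the training script's logic: the helper names are hypothetical, the 4x4 starting resolution and linear alpha ramp follow the usual ProGAN convention, and the count of 400 training images is borrowed from the Face2Text description above.

```python
import math


def resolution_at(depth_index, base=4):
    """Spatial resolution at a depth, assuming it doubles from a 4x4 base."""
    return base * 2 ** depth_index


def fade_in_steps(n_epochs, batches_per_epoch, percentage):
    """Number of training steps over which the new layer is faded in."""
    return percentage * n_epochs * batches_per_epoch // 100


def fade_in_alpha(step, total_fade_steps):
    """Blend weight for the new layer: ramps linearly from 0 to 1, then stays at 1."""
    if total_fade_steps <= 0:
        return 1.0
    return min(1.0, step / total_fade_steps)


def blend(old_value, new_value, alpha):
    """Fade-in mix of the upsampled old output and the new layer's output."""
    return alpha * new_value + (1.0 - alpha) * old_value


# Values copied from the sample configuration.
epochs = [160, 80, 40, 20, 10]
fade_in_percentage = [85, 85, 85, 85, 85]
batch_sizes = [16, 16, 16, 16, 16]
num_images = 400  # assumption: dataset size from the Face2Text description

for depth_index, (n_epochs, pct, batch_size) in enumerate(
        zip(epochs, fade_in_percentage, batch_sizes)):
    batches_per_epoch = math.ceil(num_images / batch_size)
    steps = fade_in_steps(n_epochs, batches_per_epoch, pct)
    res = resolution_at(depth_index)
    print(f"depth {depth_index}: {res}x{res}, fade-in over {steps} steps")
```

Under these assumptions, a depth of 5 ends at 64x64, which matches `img_dims` in the configuration, and each list needs one entry per depth.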

Use the requirements.txt to install all the dependencies for the project.

$ workon [your virtual environment]
$ pip install -r requirements.txt

Sample run:

$ mkdir training_runs
$ mkdir training_runs/generated_samples training_runs/losses training_runs/saved_models
$ python train_network.py --config=configs/11.conf
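
The sample configuration writes to training_runs/11/losses/, training_runs/11/generated_samples/, and training_runs/11/saved_models/, so those directories may need to exist before training starts (whether train_network.py creates them itself is not stated here). A small sketch for creating them from Python follows; the helper name `make_run_dirs` is hypothetical and not part of the repository.

```python
from pathlib import Path


def make_run_dirs(root, run_name):
    """Create the log, sample, and checkpoint directories for one training run.

    Returns the list of created (or already existing) paths.
    """
    base = Path(root) / run_name
    created = []
    for sub in ("losses", "generated_samples", "saved_models"):
        path = base / sub
        # parents=True builds intermediate directories; exist_ok avoids errors on reruns.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

For the sample configuration, calling `make_run_dirs("training_runs", "11")` from the implementation/ directory would produce the three paths listed in `log_dir`, `sample_dir`, and `save_dir`.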

Other links:

blog: https://medium.com/@animeshsk3/t2f-text-to-face-generation-using-deep-learning-b3b6ba5a5a93
training_time_lapse video: https://www.youtube.com/watch?v=NO_l87rPDb8
ProGAN package (Separate library): https://github.com/akanimax/pro_gan_pytorch

#TODO:

1.) Create a simple demo.py for running inference on the trained models

Owner
Animesh Karnewar
PhD @smartgeometry-ucl | Marie Curie Fellow for PRIME-ITN | Interested in: 3D deep learning, generative modelling, computer graphics, geometric deep learning