AsymmetricGAN - Dual Generator Generative Adversarial Networks for Multi-Domain Image-to-Image Translation

Overview


AsymmetricGAN for Image-to-Image Translation

AsymmetricGAN Framework for Multi-Domain Image-to-Image Translation

[Figure: UN_Framework]

AsymmetricGAN Framework for Hand Gesture-to-Gesture Translation

[Figure: SU_Framework]

Conference paper | Extended paper | Project page | Slides | Poster

Dual Generator Generative Adversarial Networks for Multi-Domain Image-to-Image Translation.
Hao Tang1, Dan Xu2, Wei Wang3, Yan Yan4 and Nicu Sebe1.
1University of Trento, Italy, 2University of Oxford, UK, 3EPFL, Switzerland, 4Texas State University, USA.
In ACCV 2018 (Oral).
The repository offers the official implementation of our paper in PyTorch.

License

Copyright (C) 2019 University of Trento, Italy.

All rights reserved. Licensed under CC BY-NC-SA 4.0 (Attribution-NonCommercial-ShareAlike 4.0 International).

The code is released for academic research use only. For commercial use, please contact [email protected].

Installation

Clone this repo.

git clone https://github.com/Ha0Tang/AsymmetricGAN
cd AsymmetricGAN/

This code requires PyTorch 0.4.1 and Python 3.6+. Install the dependencies with

pip install -r requirements.txt (for pip users)

or

./scripts/conda_deps.sh (for Conda users)

To reproduce the results reported in the paper, you need two NVIDIA GeForce GTX 1080 Ti GPUs or two NVIDIA TITAN Xp GPUs.

Dataset Preparation

For the hand gesture-to-gesture translation task, we use the NTU Hand Digit and Creative Senz3D datasets. Both datasets must be downloaded beforehand from their respective webpages. In addition, follow GestureGAN to prepare both datasets. Please cite their papers if you use the data.

Preparing NTU Hand Digit Dataset. The dataset can be downloaded from this paper. After downloading it, we adopt OpenPose to generate hand skeletons and use them as training and testing data in our experiments. Note that we filter out failure cases in hand gesture estimation for training and testing. Please cite their papers if you use this dataset. Train/Test splits for NTU Hand Digit dataset can be downloaded from here.

Preparing Creative Senz3D Dataset. The dataset can be downloaded here. After downloading it, we adopt OpenPose to generate hand skeletons and use them as training and testing data in our experiments. Note that we filter out failure cases in hand gesture estimation for training and testing. Please cite their papers if you use this dataset. Train/Test splits for Creative Senz3D dataset can be downloaded from here.

Preparing Your Own Datasets. Each training sample in the dataset will contain {Ix,Iy,Cx,Cy}, where Ix=image x, Iy=image y, Cx=Controllable structure of image x, and Cy=Controllable structure of image y. Of course, you can use AsymmetricGAN for your own datasets and tasks.
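As a rough illustration of the sample layout {Ix,Iy,Cx,Cy}, the sketch below groups the four items into one record. The GestureSample class, its field names, and the file paths are hypothetical and are not part of this repository; the actual loader in data/ may organize files differently.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class GestureSample:
    """One training sample {Ix, Iy, Cx, Cy} (hypothetical container)."""
    image_x: Path       # Ix: image x
    image_y: Path       # Iy: image y
    structure_x: Path   # Cx: controllable structure of image x
    structure_y: Path   # Cy: controllable structure of image y

    def missing_files(self):
        """Return the paths of this sample that do not exist on disk."""
        paths = (self.image_x, self.image_y, self.structure_x, self.structure_y)
        return [p for p in paths if not p.exists()]


# Hypothetical file names, used only to show how a sample is assembled.
sample = GestureSample(
    image_x=Path("datasets/my_data/train/x_0001.png"),
    image_y=Path("datasets/my_data/train/y_0001.png"),
    structure_x=Path("datasets/my_data/train/x_0001_skeleton.png"),
    structure_y=Path("datasets/my_data/train/y_0001_skeleton.png"),
)
print(sample.missing_files())
```

Checking for missing files before training is one simple way to catch samples whose structure maps were filtered out.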

Generating Images Using Pretrained Model

Once the dataset is ready, the result images can be generated using pretrained models.

  1. You can download a pretrained model (e.g. ntu_asymmetricgan) with the following script:
bash ./scripts/download_asymmetricgan_model.sh ntu_asymmetricgan

The pretrained model is saved at ./checkpoints/[type]_pretrained. Check here for all the available AsymmetricGAN models.

  2. Generate images using the pretrained model.

For NTU Dataset:

python test.py --dataroot [path_to_NTU_dataset] \
	--name ntu_asymmetricgan_pretrained \
	--model asymmetricgan \
	--which_model_netG resnet_9blocks \
	--which_direction AtoB \
	--dataset_mode aligned \
	--norm instance \
	--gpu_ids 0 \
	--ngf_t 64 \
	--ngf_r 4 \
	--batchSize 4 \
	--loadSize 286 \
	--fineSize 256 \
	--no_flip

For Senz3D Dataset:

python test.py --dataroot [path_to_Senz3D_dataset] \
	--name senz3d_asymmetricgan_pretrained \
	--model asymmetricgan \
	--which_model_netG resnet_9blocks \
	--which_direction AtoB \
	--dataset_mode aligned \
	--norm instance \
	--gpu_ids 0 \
	--ngf_t 64 \
	--ngf_r 4 \
	--batchSize 4 \
	--loadSize 286 \
	--fineSize 256 \
	--no_flip

If you are running in CPU mode, change --gpu_ids 0 to --gpu_ids -1. Note that testing takes a long time and requires a large amount of disk space. If you don't have enough space, append --saveDisk to the command line.
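The sketch below shows one way a comma-separated --gpu_ids value can be turned into a device list, with -1 yielding an empty list that stands for CPU mode. It follows a convention common in pix2pix-style codebases; that this repository parses the flag the same way is an assumption, and parse_gpu_ids is a hypothetical helper name.

```python
def parse_gpu_ids(gpu_ids: str):
    """Turn a string such as "0,1" or "-1" into a list of non-negative GPU ids.

    An empty result is interpreted here as "run on CPU" (assumed convention).
    """
    ids = []
    for token in gpu_ids.split(","):
        token = token.strip()
        if not token:
            continue
        value = int(token)
        if value >= 0:  # negative ids are dropped, so "-1" selects no GPU
            ids.append(value)
    return ids


print(parse_gpu_ids("0"))
print(parse_gpu_ids("0,1"))
print(parse_gpu_ids("-1"))
```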

  3. The output images are stored in ./results/[type]_pretrained/ by default. You can view them using the autogenerated HTML file in the directory.

Training New Models

New models can be trained with the following commands.

  1. Prepare dataset.

  2. Train.

For NTU dataset:

export CUDA_VISIBLE_DEVICES=3,4;
python train.py --dataroot ./datasets/ntu \
	--name ntu_asymmetricgan \
	--model asymmetricgan \
	--which_model_netG resnet_9blocks \
	--which_direction AtoB \
	--dataset_mode aligned \
	--norm instance \
	--gpu_ids 0,1 \
	--ngf_t 64 \
	--ngf_r 4 \
	--batchSize 4 \
	--loadSize 286 \
	--fineSize 256 \
	--no_flip \
	--lambda_L1 800 \
	--cyc_L1 0.1 \
	--lambda_identity 0.01 \
	--lambda_feat 1000 \
	--display_id 0 \
	--niter 10 \
	--niter_decay 10

For Senz3D dataset:

export CUDA_VISIBLE_DEVICES=5,7;
python train.py --dataroot ./datasets/senz3d \
	--name senz3d_asymmetricgan \
	--model asymmetricgan \
	--which_model_netG resnet_9blocks \
	--which_direction AtoB \
	--dataset_mode aligned \
	--norm instance \
	--gpu_ids 0,1 \
	--ngf_t 64 \
	--ngf_r 4 \
	--batchSize 4 \
	--loadSize 286 \
	--fineSize 256 \
	--no_flip \
	--lambda_L1 800 \
	--cyc_L1 0.1 \
	--lambda_identity 0.01 \
	--lambda_feat 1000 \
	--display_id 0 \
	--niter 10 \
	--niter_decay 10

There are many options you can specify. Run python train.py --help to list them. The specified options are printed to the console. To select which GPUs to use, run export CUDA_VISIBLE_DEVICES=[GPU_ID].
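The training commands pass --niter 10 and --niter_decay 10. The sketch below assumes these follow the pix2pix convention: the learning rate stays at its initial value for roughly --niter epochs and then decays linearly to zero over --niter_decay epochs. This README does not confirm that behavior, and both the lr_multiplier helper and the initial rate of 0.0002 are assumptions made for illustration.

```python
def lr_multiplier(epoch, niter=10, niter_decay=10, epoch_count=1):
    """Multiplier applied to the initial learning rate.

    `epoch` is a 0-indexed scheduler step. The rule below is the linear-decay
    formula used in pix2pix-style codebases (assumed, not verified here).
    """
    decay_steps = max(0, epoch + 1 + epoch_count - niter)
    return 1.0 - decay_steps / float(niter_decay + 1)


initial_lr = 0.0002  # assumed value, not taken from this README
for epoch in range(20):
    # Print the 1-indexed epoch next to the learning rate it would use.
    print(epoch + 1, round(initial_lr * lr_multiplier(epoch), 7))
```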

To view training results and loss plots on a local computer, set --display_id to a non-zero value, run python -m visdom.server in a new terminal, and open the URL http://localhost:8097. On a remote server, replace localhost with your server's name, such as http://server.trento.cs.edu:8097.

Can I continue/resume my training?

To fine-tune a pre-trained model or resume a previous training run, use the --continue_train, --which_epoch and --epoch_count flags. The program then loads the model from the epoch you set in --which_epoch. Set --epoch_count to specify a different starting epoch count.
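As a small sketch of how the three flags combine, the helper below assembles a resume command as a string. The build_resume_command function and the example values (resuming from epoch 10 and continuing the count at 11) are hypothetical; in practice the remaining options from the original training command would need to be appended as well.

```python
import shlex


def build_resume_command(name, dataroot, which_epoch, epoch_count):
    """Assemble a train.py command that resumes from a saved epoch."""
    args = [
        "python", "train.py",
        "--dataroot", dataroot,
        "--name", name,
        "--model", "asymmetricgan",
        "--continue_train",                 # load existing weights
        "--which_epoch", str(which_epoch),  # epoch of the checkpoint to load
        "--epoch_count", str(epoch_count),  # epoch number to start counting from
    ]
    # Quote each argument so the string is safe to paste into a shell.
    return " ".join(shlex.quote(a) for a in args)


command = build_resume_command("ntu_asymmetricgan", "./datasets/ntu", 10, 11)
print(command)
```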

Testing

Testing is similar to testing pretrained models.

For NTU dataset:

python test.py --dataroot [path_to_NTU_dataset] \
	--name ntu_asymmetricgan \
	--model asymmetricgan \
	--which_model_netG resnet_9blocks \
	--which_direction AtoB \
	--dataset_mode aligned \
	--norm instance \
	--gpu_ids 0 \
	--ngf_t 64 \
	--ngf_r 4 \
	--batchSize 4 \
	--loadSize 286 \
	--fineSize 256 \
	--no_flip

For Senz3D dataset:

python test.py --dataroot [path_to_Senz3D_dataset] \
	--name senz3d_asymmetricgan \
	--model asymmetricgan \
	--which_model_netG resnet_9blocks \
	--which_direction AtoB \
	--dataset_mode aligned \
	--norm instance \
	--gpu_ids 0 \
	--ngf_t 64 \
	--ngf_r 4 \
	--batchSize 4 \
	--loadSize 286 \
	--fineSize 256 \
	--no_flip

Use --how_many to specify the maximum number of images to generate. By default, the latest checkpoint is loaded; use --which_epoch to load a different one.

Code Structure

  • train.py, test.py: the entry points for training and testing.
  • models/asymmetricgan_model.py: creates the networks and computes the losses.
  • models/networks/: defines the architecture of all models for AsymmetricGAN.
  • options/: creates option lists using the argparse package.
  • data/: defines the class for loading images and controllable structures.

Evaluation Code

We use several metrics to evaluate the quality of the generated images:

To Do List

  • Upload supervised AsymmetricGAN code for hand gesture-to-gesture translation
  • Upload unsupervised AsymmetricGAN code for multi-domain image-to-image translation: code

Citation

If you use this code for your research, please cite our papers.

@article{tang2019asymmetric,
  title={Asymmetric Generative Adversarial Networks for Image-to-Image Translation},
  author={Hao Tang and Dan Xu and Hong Liu and Nicu Sebe},
  journal={arXiv preprint arXiv:1912.06931},
  year={2019}
}

@inproceedings{tang2018dual,
  title={Dual Generator Generative Adversarial Networks for Multi-Domain Image-to-Image Translation},
  author={Tang, Hao and Xu, Dan and Wang, Wei and Yan, Yan and Sebe, Nicu},
  booktitle={ACCV},
  year={2018}
}

Acknowledgments

This source code is inspired by Pix2pix and GestureGAN.

Related Projects

Contributions

If you have any questions, comments or bug reports, feel free to open a GitHub issue, submit a pull request, or e-mail the author Hao Tang ([email protected]).
