AsymmetricGAN - Dual Generator Generative Adversarial Networks for Multi-Domain Image-to-Image Translation

Overview


AsymmetricGAN for Image-to-Image Translation

AsymmetricGAN Framework for Multi-Domain Image-to-Image Translation


AsymmetricGAN Framework for Hand Gesture-to-Gesture Translation


Conference paper | Extended paper | Project page | Slides | Poster

Dual Generator Generative Adversarial Networks for Multi-Domain Image-to-Image Translation.
Hao Tang1, Dan Xu2, Wei Wang3, Yan Yan4 and Nicu Sebe1.
1University of Trento, Italy, 2University of Oxford, UK, 3EPFL, Switzerland, 4Texas State University, USA.
In ACCV 2018 (Oral).
This repository offers the official PyTorch implementation of our paper.

License

Copyright (C) 2019 University of Trento, Italy.

All rights reserved. Licensed under the CC BY-NC-SA 4.0 (Attribution-NonCommercial-ShareAlike 4.0 International)

The code is released for academic research use only. For commercial use, please contact [email protected].

Installation

Clone this repo.

git clone https://github.com/Ha0Tang/AsymmetricGAN
cd AsymmetricGAN/

This code requires PyTorch 0.4.1 and Python 3.6+. Please install the dependencies with

pip install -r requirements.txt (for pip users)

or

./scripts/conda_deps.sh (for Conda users)

To reproduce the results reported in the paper, you would need two NVIDIA GeForce GTX 1080 Ti GPUs or two NVIDIA TITAN Xp GPUs.

Dataset Preparation

For the hand gesture-to-gesture translation task, we use the NTU Hand Digit and Creative Senz3D datasets. Both datasets must be downloaded beforehand from their respective webpages. In addition, follow GestureGAN to prepare both datasets. Please cite the corresponding papers if you use the data.

Preparing NTU Hand Digit Dataset. The dataset can be downloaded from this paper. After downloading it, we adopt OpenPose to generate hand skeletons and use them as training and testing data in our experiments. Note that we filter out failure cases of hand gesture estimation for both training and testing. Please cite their papers if you use this dataset. Train/test splits for the NTU Hand Digit dataset can be downloaded from here.

Preparing Creative Senz3D Dataset. The dataset can be downloaded here. After downloading it, we adopt OpenPose to generate hand skeletons and use them as training and testing data in our experiments. Note that we filter out failure cases of hand gesture estimation for both training and testing. Please cite their papers if you use this dataset. Train/test splits for the Creative Senz3D dataset can be downloaded from here.

Preparing Your Own Datasets. Each training sample consists of {Ix, Iy, Cx, Cy}, where Ix is image x, Iy is image y, Cx is the controllable structure (e.g., hand skeleton) of image x, and Cy is the controllable structure of image y. You can also use AsymmetricGAN for your own datasets and tasks; see the loading sketch below.
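For reference, here is a minimal sketch of what such a paired dataset could look like as a PyTorch Dataset. The Ix/, Iy/, Cx/, Cy/ folder layout and the PairedGestureDataset name are hypothetical, for illustration only; the 286/256 resize-then-crop mirrors the --loadSize 286 / --fineSize 256 options used in the commands below, but this is not the loader this repository enforces.

import os
import random

from PIL import Image
import torchvision.transforms as T
from torch.utils.data import Dataset


class PairedGestureDataset(Dataset):
    """Yields {Ix, Iy, Cx, Cy}: two images plus their controllable structures."""

    def __init__(self, root, load_size=286, fine_size=256):
        # Hypothetical layout: root/Ix, root/Iy, root/Cx, root/Cy hold files
        # with matching names; adapt this to however your data is stored.
        self.keys = ("Ix", "Iy", "Cx", "Cy")
        self.dirs = [os.path.join(root, k) for k in self.keys]
        self.names = sorted(os.listdir(self.dirs[0]))
        self.load_size, self.fine_size = load_size, fine_size
        self.to_tensor = T.Compose([
            T.ToTensor(),
            T.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
        ])

    def __len__(self):
        return len(self.names)

    def __getitem__(self, idx):
        name = self.names[idx]
        imgs = [Image.open(os.path.join(d, name)).convert("RGB")
                for d in self.dirs]
        imgs = [im.resize((self.load_size, self.load_size)) for im in imgs]
        # One shared crop offset keeps all four images spatially aligned.
        x = random.randint(0, self.load_size - self.fine_size)
        y = random.randint(0, self.load_size - self.fine_size)
        imgs = [im.crop((x, y, x + self.fine_size, y + self.fine_size))
                for im in imgs]
        return {k: self.to_tensor(im) for k, im in zip(self.keys, imgs)}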

Generating Images Using Pretrained Model

Once the dataset is ready, result images can be generated using the pretrained models.

  1. You can download a pretrained model (e.g. ntu_asymmetricgan) with the following script:
bash ./scripts/download_asymmetricgan_model.sh ntu_asymmetricgan

The pretrained model is saved at ./checkpoints/[type]_pretrained. Check here for all the available AsymmetricGAN models.

  2. Generate images using the pretrained model.

For NTU Dataset:

python test.py --dataroot [path_to_NTU_dataset] \
	--name ntu_asymmetricgan_pretrained \
	--model asymmetricgan \
	--which_model_netG resnet_9blocks \
	--which_direction AtoB \
	--dataset_mode aligned \
	--norm instance \
	--gpu_ids 0 \
	--ngf_t 64 \
	--ngf_r 4 \
	--batchSize 4 \
	--loadSize 286 \
	--fineSize 256 \
	--no_flip

For Senz3D Dataset:

python test.py --dataroot [path_to_Senz3D_dataset] \
	--name senz3d_asymmetricgan_pretrained \
	--model asymmetricgan \
	--which_model_netG resnet_9blocks \
	--which_direction AtoB \
	--dataset_mode aligned \
	--norm instance \
	--gpu_ids 0 \
	--ngf_t 64 \
	--ngf_r 4 \
	--batchSize 4 \
	--loadSize 286 \
	--fineSize 256 \
	--no_flip

If you are running in CPU mode, change --gpu_ids 0 to --gpu_ids -1. Note that testing takes considerable time and a large amount of disk space. If you don't have enough space, append --saveDisk to the command line.

  3. The output images are stored at ./results/[type]_pretrained/ by default. You can view them via the auto-generated HTML file in that directory.

Training New Models

New models can be trained with the following commands.

  1. Prepare the dataset.

  2. Train.

For NTU dataset:

export CUDA_VISIBLE_DEVICES=3,4;
python train.py --dataroot ./datasets/ntu \
	--name ntu_asymmetricgan \
	--model asymmetricgan \
	--which_model_netG resnet_9blocks \
	--which_direction AtoB \
	--dataset_mode aligned \
	--norm instance \
	--gpu_ids 0,1 \
	--ngf_t 64 \
	--ngf_r 4 \
	--batchSize 4 \
	--loadSize 286 \
	--fineSize 256 \
	--no_flip \
	--lambda_L1 800 \
	--cyc_L1 0.1 \
	--lambda_identity 0.01 \
	--lambda_feat 1000 \
	--display_id 0 \
	--niter 10 \
	--niter_decay 10

For Senz3D dataset:

export CUDA_VISIBLE_DEVICES=5,7;
python train.py --dataroot ./datasets/senz3d \
	--name senz3d_asymmetricgan \
	--model asymmetricgan \
	--which_model_netG resnet_9blocks \
	--which_direction AtoB \
	--dataset_mode aligned \
	--norm instance \
	--gpu_ids 0,1 \
	--ngf_t 64 \
	--ngf_r 4 \
	--batchSize 4 \
	--loadSize 286 \
	--fineSize 256 \
	--no_flip \
	--lambda_L1 800 \
	--cyc_L1 0.1 \
	--lambda_identity 0.01 \
	--lambda_feat 1000 \
	--display_id 0 \
	--niter 10 \
	--niter_decay 10

There are many options you can specify; run python train.py --help to see them all. The options used are printed to the console. To control which GPUs are used, set export CUDA_VISIBLE_DEVICES=[GPU_ID].

To view training results and loss plots on a local machine, set --display_id to a non-zero value, run python -m visdom.server in a separate terminal, and open the URL http://localhost:8097. On a remote server, replace localhost with the server's name, e.g. http://server.trento.cs.edu:8097.

Can I continue/resume my training?

To fine-tune a pretrained model or resume a previous training run, use the --continue_train, --which_epoch, and --epoch_count flags. The program loads the model saved at the epoch given by --which_epoch; set --epoch_count to start counting from a different epoch, as in the example below.
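For example, to resume the NTU run above from a hypothetical epoch 10 (the epoch numbers here are placeholders):

python train.py --dataroot ./datasets/ntu \
	--name ntu_asymmetricgan \
	--continue_train \
	--which_epoch 10 \
	--epoch_count 11 \
	[remaining options from the original training command]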

Testing

Testing is similar to testing pretrained models.

For NTU dataset:

python test.py --dataroot [path_to_NTU_dataset] \
	--name ntu_asymmetricgan \
	--model asymmetricgan \
	--which_model_netG resnet_9blocks \
	--which_direction AtoB \
	--dataset_mode aligned \
	--norm instance \
	--gpu_ids 0 \
	--ngf_t 64 \
	--ngf_r 4 \
	--batchSize 4 \
	--loadSize 286 \
	--fineSize 256 \
	--no_flip

For Senz3D dataset:

python test.py --dataroot [path_to_Senz3D_dataset] \
	--name senz3d_asymmetricgan \
	--model asymmetricgan \
	--which_model_netG resnet_9blocks \
	--which_direction AtoB \
	--dataset_mode aligned \
	--norm instance \
	--gpu_ids 0 \
	--ngf_t 64 \
	--ngf_r 4 \
	--batchSize 4 \
	--loadSize 286 \
	--fineSize 256 \
	--no_flip

Use --how_many to specify the maximum number of test images to generate. By default, the latest checkpoint is loaded; this can be changed with --which_epoch, as in the example below.
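For example, to generate at most 200 images from the checkpoint of epoch 20 (both values are placeholders):

python test.py --dataroot [path_to_NTU_dataset] \
	--name ntu_asymmetricgan \
	--which_epoch 20 \
	--how_many 200 \
	[remaining options from the testing command above]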

Code Structure

  • train.py, test.py: entry points for training and testing.
  • models/asymmetricgan_model.py: creates the networks and computes the losses.
  • models/networks/: defines the architectures of all AsymmetricGAN models.
  • options/: creates option lists using the argparse package.
  • data/: defines the classes for loading images and controllable structures.

Evaluation Code

We use several metrics to evaluate the quality of the generated images.
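As one illustration of such a full-reference metric, PSNR between a generated image and its ground truth can be computed as sketched below. This small NumPy helper is illustrative only and is not the repository's evaluation code.

import numpy as np

def psnr(img_a, img_b, max_val=255.0):
    """Peak signal-to-noise ratio between two same-shape images, in dB."""
    a = np.asarray(img_a, dtype=np.float64)
    b = np.asarray(img_b, dtype=np.float64)
    mse = np.mean((a - b) ** 2)
    if mse == 0:
        return float("inf")  # identical images give infinite PSNR
    return 10.0 * np.log10(max_val ** 2 / mse)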

To Do List

  • Upload supervised AsymmetricGAN code for hand gesture-to-gesture translation
  • Upload unsupervised AsymmetricGAN code for multi-domain image-to-image translation: code

Citation

If you use this code for your research, please cite our papers.

@article{tang2019asymmetric,
  title={Asymmetric Generative Adversarial Networks for Image-to-Image Translation},
  author={Hao Tang and Dan Xu and Hong Liu and Nicu Sebe},
  journal={arXiv preprint arXiv:1912.06931},
  year={2019}
}

@inproceedings{tang2018dual,
  title={Dual Generator Generative Adversarial Networks for Multi-Domain Image-to-Image Translation},
  author={Tang, Hao and Xu, Dan and Wang, Wei and Yan, Yan and Sebe, Nicu},
  booktitle={ACCV},
  year={2018}
}

Acknowledgments

This source code is inspired by Pix2pix and GestureGAN.

Related Projects

Contributions

If you have any questions, comments, or bug reports, feel free to open a GitHub issue, submit a pull request, or e-mail the author Hao Tang ([email protected]).
