TSIT: A Simple and Versatile Framework for Image-to-Image Translation

Overview

TSIT: A Simple and Versatile Framework for Image-to-Image Translation

teaser

This repository provides the official PyTorch implementation for the following paper:

TSIT: A Simple and Versatile Framework for Image-to-Image Translation
Liming Jiang, Changxu Zhang, Mingyang Huang, Chunxiao Liu, Jianping Shi and Chen Change Loy
In ECCV 2020 (Spotlight).
Paper

Abstract: We introduce a simple and versatile framework for image-to-image translation. We unearth the importance of normalization layers, and provide a carefully designed two-stream generative model with newly proposed feature transformations in a coarse-to-fine fashion. This allows multi-scale semantic structure information and style representation to be effectively captured and fused by the network, permitting our method to scale to various tasks in both unsupervised and supervised settings. No additional constraints (e.g., cycle consistency) are needed, contributing to a very clean and simple method. Multi-modal image synthesis with arbitrary style control is made possible. A systematic study compares the proposed method with several state-of-the-art task-specific baselines, verifying its effectiveness in both perceptual quality and quantitative evaluations.

Updates

  • [01/2021] The code of TSIT is released.

  • [07/2020] The paper of TSIT is accepted by ECCV 2020 (Spotlight).

Installation

After installing Anaconda, we recommend you to create a new conda environment with python 3.7.6:

conda create -n tsit python=3.7.6 -y
conda activate tsit

Clone this repo, install PyTorch 1.1.0 (newer versions may also work) and other dependencies:

git clone https://github.com/EndlessSora/TSIT.git
cd TSIT
pip install -r requirements.txt

This code also requires the Synchronized-BatchNorm-PyTorch:

cd models/networks/
git clone https://github.com/vacancy/Synchronized-BatchNorm-PyTorch
cp -rf Synchronized-BatchNorm-PyTorch/sync_batchnorm .
rm -rf Synchronized-BatchNorm-PyTorch
cd ../../

Tasks and Datasets

The code covers 3 image-to-image translation tasks on 5 datasets. For more details, please refer to our paper.

Task Abbreviations

  • Arbitrary Style Transfer (AST) on Yosemite summer → winter, BDD100K day → night, and Photo → art datasets.
  • Semantic Image Synthesis (SIS) on Cityscapes and ADE20K datasets.
  • Multi-Modal Image Synthesis (MMIS) on BDD100K sunny → different time/weather conditions dataset.

The abbreviations are used to specify the --task argument when training and testing.

Dataset Preparation

We provide one-click scripts to prepare datasets. The details are provided below.

  • Yosemite summer → winter and Photo → art. The provided scripts will make all things ready (including the download). For example, simply run:
bash datasets/prepare_summer2winteryosemite.sh
  • BDD100K. Please first download BDD100K Images on their official website. We have provided the classified lists of different weathers and times. After downloading, you only need to run:
bash datasets/prepare_bdd100k.sh [data_root]

The [data_root] should be specified, which is the path to the BDD100K root folder that contains images folder. The script will put the list to the suitable place and symlink the root folder to ./datasets.

  • Cityscapes. Please follow the standard download and preparation guidelines on the official website. We recommend to symlink its root folder [data_root] to ./datasets by:
bash datasets/prepare_cityscapes.sh [data_root]
  • ADE20K. The dataset can be downloaded here, which is from MIT Scene Parsing BenchMark. After unzipping the dataset, put the jpg image files ADEChallengeData2016/images/ and png label files ADEChallengeData2016/annotatoins/ in the same directory. We also recommend to symlink its root folder [data_root] to ./datasets by:
bash datasets/prepare_ade20k.sh [data_root]

Testing Pretrained Models

  1. Download the pretrained models and unzip them to ./checkpoints.

  2. For a quick start, we have provided all the example test scripts. After preparing the corresponding datasets, you can directly use the test scripts. For example:

bash test_scripts/ast_summer2winteryosemite.sh
  1. The generated images will be saved at ./results/[experiment_name] by default.

  2. You can use --results_dir to specify the output directory. --how_many will specify the maximum number of images to generate. By default, the code loads the latest checkpoint, which can be changed using --which_epoch. You can also discard --show_input to show the generated images only without the input references.

  3. For MMIS sunny → different time/weather conditions, the --test_mode can be specified (optional): night | cloudy | rainy | snowy | all (default).

Training

For a quick start, we have provided all the example training scripts. After preparing the corresponding datasets, you can directly use the training scripts. For example:

bash train_scripts/ast_summer2winteryosemite.sh

Please note that you may want to change the experiment name --name or the checkpoint saving root --checkpoints_dir to prevent your newly trained models overwriting the pretrained ones (if used).

--task is given using the abbreviations. --dataset_mode specifies the dataset type. --croot and --sroot specify the content and style data root, respectively. The results may be better reproduced on NVIDIA Tesla V100 GPUs.

After training, testing the newly trained models is similar to testing pretrained models.

Citation

If you find this work useful for your research, please cite our paper:

@inproceedings{jiang2020tsit,
  title={{TSIT}: A Simple and Versatile Framework for Image-to-Image Translation},
  author={Jiang, Liming and Zhang, Changxu and Huang, Mingyang and Liu, Chunxiao and Shi, Jianping and Loy, Chen Change},
  booktitle={ECCV},
  year={2020}
}

Acknowledgments

The code is greatly inspired by SPADE, pytorch-AdaIN, and Synchronized-BatchNorm-PyTorch.

License

Copyright (c) 2020. All rights reserved.

The code is licensed under the CC BY-NC-SA 4.0 (Attribution-NonCommercial-ShareAlike 4.0 International).

Owner
Liming Jiang
Ph.D. student, [email protected]
Liming Jiang
Joint parameterization and fitting of stroke clusters

StrokeStrip: Joint Parameterization and Fitting of Stroke Clusters Dave Pagurek van Mossel1, Chenxi Liu1, Nicholas Vining1,2, Mikhail Bessmeltsev3, Al

Dave Pagurek 44 Dec 01, 2022
Template repository for managing machine learning research projects built with PyTorch-Lightning

Tutorial Repository with a minimal example for showing how to deploy training across various compute infrastructure.

Sidd Karamcheti 3 Feb 11, 2022
a basic code repository for basic task in CV(classification,detection,segmentation)

basic_cv a basic code repository for basic task in CV(classification,detection,segmentation,tracking) classification generate dataset train predict de

1 Oct 15, 2021
TransferNet: Learning Transferrable Knowledge for Semantic Segmentation with Deep Convolutional Neural Network

TransferNet: Learning Transferrable Knowledge for Semantic Segmentation with Deep Convolutional Neural Network Created by Seunghoon Hong, Junhyuk Oh,

42 Jun 29, 2022
PyTorch implementation of SIFT descriptor

This is an differentiable pytorch implementation of SIFT patch descriptor. It is very slow for describing one patch, but quite fast for batch. It can

Dmytro Mishkin 150 Dec 24, 2022
Control-Robot-Arm-using-PS4-Controller - A Robotic Arm based on Raspberry Pi and Arduino that controlled by PS4 Controller

Control-Robot-Arm-using-PS4-Controller You can see all details about this Robot

MohammadReza Sharifi 5 Jan 01, 2022
Attention-based CNN-LSTM and XGBoost hybrid model for stock prediction

Attention-based CNN-LSTM and XGBoost hybrid model for stock prediction Requirements The code has been tested running under Python 3.7.4, with the foll

zshicode 84 Jan 01, 2023
Covid19-Forecasting - An interactive website that tracks, models and predicts COVID-19 Cases

Covid-Tracker This is an interactive website that tracks, models and predicts CO

Adam Lahmadi 1 Feb 01, 2022
Trading Gym is an open source project for the development of reinforcement learning algorithms in the context of trading.

Trading Gym Trading Gym is an open-source project for the development of reinforcement learning algorithms in the context of trading. It is currently

Dimitry Foures 535 Nov 15, 2022
AttGAN: Facial Attribute Editing by Only Changing What You Want (IEEE TIP 2019)

News 11 Jan 2020: We clean up the code to make it more readable! The old version is here: v1. AttGAN TIP Nov. 2019, arXiv Nov. 2017 TensorFlow impleme

Zhenliang He 568 Dec 14, 2022
Official code for the publication "HyFactor: Hydrogen-count labelled graph-based defactorization Autoencoder".

HyFactor Graph-based architectures are becoming increasingly popular as a tool for structure generation. Here, we introduce a novel open-source archit

Laboratoire-de-Chemoinformatique 11 Oct 10, 2022
CROSS-LINGUAL ABILITY OF MULTILINGUAL BERT: AN EMPIRICAL STUDY

M-BERT-Study CROSS-LINGUAL ABILITY OF MULTILINGUAL BERT: AN EMPIRICAL STUDY Motivation Multilingual BERT (M-BERT) has shown surprising cross lingual a

CogComp 1 Feb 28, 2022
Think Big, Teach Small: Do Language Models Distil Occam’s Razor?

Think Big, Teach Small: Do Language Models Distil Occam’s Razor? Software related to the paper "Think Big, Teach Small: Do Language Models Distil Occa

0 Dec 07, 2021
Multi-view 3D reconstruction using neural rendering. Unofficial implementation of UNISURF, VolSDF, NeuS and more.

Volume rendering + 3D implicit surface Showcase What? previous: surface rendering; now: volume rendering previous: NeRF's volume density; now: implici

Jianfei Guo 682 Jan 04, 2023
Developed an optimized algorithm which finds the most optimal path between 2 points in a 3D Maze using various AI search techniques like BFS, DFS, UCS, Greedy BFS and A*

Developed an optimized algorithm which finds the most optimal path between 2 points in a 3D Maze using various AI search techniques like BFS, DFS, UCS, Greedy BFS and A*. The algorithm was extremely

1 Mar 28, 2022
Wikidated : An Evolving Knowledge Graph Dataset of Wikidata’s Revision History

Wikidated Wikidated 1.0 is a dataset of Wikidata’s full revision history, which encodes changes between Wikidata revisions as sets of deletions and ad

Lukas Schmelzeisen 11 Aug 16, 2022
Linear Variational State Space Filters

Linear Variational State Space Filters To set up the environment, use the provided scripts in the docker/ folder to build and run the codebase inside

0 Dec 13, 2021
torchsummaryDynamic: support real FLOPs calculation of dynamic network or user-custom PyTorch ops

torchsummaryDynamic Improved tool of torchsummaryX. torchsummaryDynamic support real FLOPs calculation of dynamic network or user-custom PyTorch ops.

Bohong Chen 1 Jan 07, 2022
LegoDNN: a block-grained scaling tool for mobile vision systems

Table of contents 1 Introduction 1.1 Major features 1.2 Architecture 2 Code and Installation 2.1 Code 2.2 Installation 3 Repository of DNNs in vision

41 Dec 24, 2022
"SOLQ: Segmenting Objects by Learning Queries", SOLQ is an end-to-end instance segmentation framework with Transformer.

SOLQ: Segmenting Objects by Learning Queries This repository is an official implementation of the paper SOLQ: Segmenting Objects by Learning Queries.

MEGVII Research 179 Jan 02, 2023