Towards Ultra-Resolution Neural Style Transfer via Thumbnail Instance Normalization

Related tags

Deep LearningURST
Overview

Towards Ultra-Resolution Neural Style Transfer via Thumbnail Instance Normalization

Official PyTorch implementation for our URST (Ultra-Resolution Style Transfer) framework.

URST is a versatile framework for ultra-high resolution style transfer under limited memory resources, which can be easily plugged in most existing neural style transfer methods.

With the growth of the input resolution, the memory cost of our URST hardly increases. Theoretically, it supports style transfer of arbitrary high-resolution images.

One ultra-high resolution stylized result of 12000 x 8000 pixels (i.e., 96 megapixels).

This repository is developed based on six representative style transfer methods, which are Johnson et al., MSG-Net, AdaIN, WCT, LinearWCT, and Wang et al. (Collaborative Distillation).

For details see Towards Ultra-Resolution Neural Style Transfer via Thumbnail Instance Normalization.

If you use this code for a paper please cite:

@misc{chen2021towards,
      title={Towards Ultra-Resolution Neural Style Transfer via Thumbnail Instance Normalization}, 
      author={Zhe Chen and Wenhai Wang and Enze Xie and Tong Lu and Ping Luo},
      year={2021},
      eprint={2103.11784},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Environment

  • python3.6, pillow, tqdm, torchfile, pytorch1.1+ (for inference)

    pip install pillow
    pip install tqdm
    pip install torchfile
    conda install pytorch==1.1.0 torchvision==0.3.0 -c pytorch
  • tensorboardX (for training)

    pip install tensorboardX

Then, clone the repository locally:

git clone https://github.com/czczup/URST.git

Test (Ultra-high Resolution Style Transfer)

Step 1: Prepare images

  • Content images and style images are placed in examples/.
  • Since the ultra-high resolution images are quite large, we not place them in this repository. Please download them from this google drive.
  • All content images used in this repository are collected from pexels.com.

Step 2: Prepare models

  • Download models from this google drive. Unzip and merge them into this repository.

Step 3: Stylization

First, choose a specific style transfer method and enter the directory.

Then, please run the corresponding script. The stylized results will be saved in output/.

  • For Johnson et al., we use the PyTorch implementation Fast-Neural-Style-Transfer.

    cd Johnson2016Perceptual/
    CUDA_VISIBLE_DEVICES=<gpu_id> python test.py --content <content_path> --model <model_path> --URST
  • For MSG-Net, we use the official PyTorch implementation PyTorch-Multi-Style-Transfer.

    cd Zhang2017MultiStyle/
    CUDA_VISIBLE_DEVICES=<gpu_id> python test.py --content <content_path> --style <style_path> --URST
  • For AdaIN, we use the PyTorch implementation pytorch-AdaIN.

    cd Huang2017AdaIN/
    CUDA_VISIBLE_DEVICES=<gpu_id> python test.py --content <content_path> --style <style_path> --URST
  • For WCT, we use the PyTorch implementation PytorchWCT.

    cd Li2017Universal/
    CUDA_VISIBLE_DEVICES=<gpu_id> python test.py --content <content_path> --style <style_path> --URST
  • For LinearWCT, we use the official PyTorch implementation LinearStyleTransfer.

    cd Li2018Learning/
    CUDA_VISIBLE_DEVICES=<gpu_id> python test.py --content <content_path> --style <style_path> --URST
  • For Wang et al. (Collaborative Distillation), we use the official PyTorch implementation Collaborative-Distillation.

    cd Wang2020Collaborative/PytorchWCT/
    CUDA_VISIBLE_DEVICES=<gpu_id> python test.py --content <content_path> --style <style_path> --URST

Optional options:

  • --patch_size: The maximum size of each patch. The default setting is 1000.
  • --style_size: The size of the style image. The default setting is 1024.
  • --thumb_size: The size of the thumbnail image. The default setting is 1024.
  • --URST: Use our URST framework to process ultra-high resolution images.

Train (Enlarge the Stroke Size)

Step 1: Prepare datasets

Download the MS-COCO 2014 dataset and WikiArt dataset.

  • MS-COCO

    wget http://msvocds.blob.core.windows.net/coco2014/train2014.zip
  • WikiArt

    • Either manually download from kaggle.
    • Or install kaggle-cli and download by running:
    kg download -u <username> -p <password> -c painter-by-numbers -f train.zip

Step 2: Prepare models

As same as the Step 2 in the test phase.

Step 3: Train the decoder with our stroke perceptual loss

  • For AdaIN:

    cd Huang2017AdaIN/
    CUDA_VISIBLE_DEVICES=<gpu_id> python trainv2.py --content_dir <coco_path> --style_dir <wikiart_path>
  • For LinearWCT:

    cd Li2018Learning/
    CUDA_VISIBLE_DEVICES=<gpu_id> python trainv2.py --contentPath <coco_path> --stylePath <wikiart_path>

License

This repository is released under the Apache 2.0 license as found in the LICENSE file.

Owner
czczup
Knowledge is infinite.
czczup
[CVPR 2021 Oral] Variational Relational Point Completion Network

VRCNet: Variational Relational Point Completion Network This repository contains the PyTorch implementation of the paper: Variational Relational Point

PL 121 Dec 12, 2022
[TPAMI 2021] iOD: Incremental Object Detection via Meta-Learning

Incremental Object Detection via Meta-Learning To appear in an upcoming issue of the IEEE Transactions on Pattern Analysis and Machine Intelligence (T

Joseph K J 66 Jan 04, 2023
Python script for performing depth completion from sparse depth and rgb images using the msg_chn_wacv20. model in ONNX

ONNX msg_chn_wacv20 depth completion Python script for performing depth completion from sparse depth and rgb images using the msg_chn_wacv20 model in

Ibai Gorordo 19 Oct 22, 2022
As-ViT: Auto-scaling Vision Transformers without Training

As-ViT: Auto-scaling Vision Transformers without Training [PDF] Wuyang Chen, Wei Huang, Xianzhi Du, Xiaodan Song, Zhangyang Wang, Denny Zhou In ICLR 2

VITA 68 Sep 05, 2022
Benchmark VAE - Library for Variational Autoencoder benchmarking

Documentation pythae This library implements some of the most common (Variational) Autoencoder models. In particular it provides the possibility to pe

1.1k Jan 02, 2023
Distributed Asynchronous Hyperparameter Optimization in Python

Hyperopt: Distributed Hyperparameter Optimization Hyperopt is a Python library for serial and parallel optimization over awkward search spaces, which

6.5k Jan 01, 2023
Fast Learning of MNL Model From General Partial Rankings with Application to Network Formation Modeling

Fast-Partial-Ranking-MNL This repo provides a PyTorch implementation for the CopulaGNN models as described in the following paper: Fast Learning of MN

Xingjian Zhang 3 Aug 19, 2022
Prompts - Read a textfile of prompts and import into anki via ankiconnect

prompts read a textfile of prompts and import into anki via ankiconnect Usage In

Alexander Cobleigh 2 Jul 28, 2022
Callable PyTrees and filtered JIT/grad transformations => neural networks in JAX.

Equinox Callable PyTrees and filtered JIT/grad transformations = neural networks in JAX Equinox brings more power to your model building in JAX. Repr

Patrick Kidger 909 Dec 30, 2022
Scalable Optical Flow-based Image Montaging and Alignment

SOFIMA SOFIMA (Scalable Optical Flow-based Image Montaging and Alignment) is a tool for stitching, aligning and warping large 2d, 3d and 4d microscopy

Google Research 16 Dec 21, 2022
Code for Transformers Solve Limited Receptive Field for Monocular Depth Prediction

Official PyTorch code for Transformers Solve Limited Receptive Field for Monocular Depth Prediction. Guanglei Yang, Hao Tang, Mingli Ding, Nicu Sebe,

stanley 152 Dec 16, 2022
Deep learning library featuring a higher-level API for TensorFlow.

TFLearn: Deep learning library featuring a higher-level API for TensorFlow. TFlearn is a modular and transparent deep learning library built on top of

TFLearn 9.6k Jan 02, 2023
Multi-Template Mouse Brain MRI Atlas (MBMA): both in-vivo and ex-vivo

Multi-template MRI mouse brain atlas (both in vivo and ex vivo) Mouse Brain MRI atlas (both in-vivo and ex-vivo) (repository relocated from the origin

8 Nov 18, 2022
BABEL: Bodies, Action and Behavior with English Labels [CVPR 2021]

BABEL is a large dataset with language labels describing the actions being performed in mocap sequences. BABEL labels about 43 hours of mocap sequences from AMASS [1] with action labels.

113 Dec 28, 2022
JAX + dataclasses

jax_dataclasses jax_dataclasses provides a wrapper around dataclasses.dataclass for use in JAX, which enables automatic support for: Pytree registrati

Brent Yi 35 Dec 21, 2022
FastFCN: Rethinking Dilated Convolution in the Backbone for Semantic Segmentation.

FastFCN: Rethinking Dilated Convolution in the Backbone for Semantic Segmentation [Project] [Paper] [arXiv] [Home] Official implementation of FastFCN:

Wu Huikai 815 Dec 29, 2022
OpenGAN: Open-Set Recognition via Open Data Generation

OpenGAN: Open-Set Recognition via Open Data Generation ICCV 2021 (oral) Real-world machine learning systems need to analyze novel testing data that di

Shu Kong 90 Jan 06, 2023
Pytorch code for paper "Image Compressed Sensing Using Non-local Neural Network" TMM 2021.

NL-CSNet-Pytorch Pytorch code for paper "Image Compressed Sensing Using Non-local Neural Network" TMM 2021. Note: this repo only shows the strategy of

WenxueCui 7 Nov 07, 2022
(ICONIP 2020) MobileHand: Real-time 3D Hand Shape and Pose Estimation from Color Image

MobileHand: Real-time 3D Hand Shape and Pose Estimation from Color Image This repo contains the source code for MobileHand, real-time estimation of 3D

90 Dec 12, 2022
Code repository for the paper Computer Vision User Entity Behavior Analytics

Computer Vision User Entity Behavior Analytics Code repository for "Computer Vision User Entity Behavior Analytics" Code Description dataset.csv As di

Sameer Khanna 2 Aug 20, 2022