Pre-Trained Image Processing Transformer (IPT)

Overview

Pre-Trained Image Processing Transformer (IPT)

By Hanting Chen, Yunhe Wang, Tianyu Guo, Chang Xu, Yiping Deng, Zhenhua Liu, Siwei Ma, Chunjing Xu, Chao Xu, Wen Gao. [arXiv]

We study the low-level computer vision task (such as denoising, super-resolution and deraining) and develop a new pre-trained model, namely, image processing transformer (IPT). We present to utilize the well-known ImageNet benchmark for generating a large amount of corrupted image pairs. The IPT model is trained on these images with multi-heads and multi-tails. The pre-trained model can therefore efficiently employed on desired task after fine-tuning. With only one pre-trained model, IPT outperforms the current state-of-the-art methods on various low-level benchmarks.

MindSpore Code

Requirements

  • python 3
  • pytorch == 1.4.0
  • torchvision

Dataset

The benchmark datasets can be downloaded as follows:

For super-resolution:

Set5, Set14, B100, Urban100.

For denoising:

CBSD68, Urban100.

For deraining:

Rain100L.

The result images are converted into YCbCr color space. The PSNR is evaluated on the Y channel only.

Script Description

This is the inference script of IPT, you can following steps to finish the test of image processing tasks, like SR, denoise and derain, via the corresponding pretrained models.

Script Parameter

For details about hyperparameters, see option.py.

Evaluation

Pretrained models

The pretrained models are available in google drive

Evaluation Process

Inference example: For SR x2,x3,x4:

python main.py --dir_data $DATA_PATH --pretrain $MODEL_PATH --data_test Set5+Set14+B100+Urban100 --scale $SCALE

For Denoise 30,50:

python main.py --dir_data $DATA_PATH --pretrain $MODEL_PATH --data_test CBSD68+Urban100 --scale 1 --denoise --sigma $NOISY_LEVEL

For derain:

python main.py --dir_data $DATA_PATH --pretrain $MODEL_PATH --scale 1 --derain

Results

  • Detailed results on image super-resolution task.
Method Scale Set5 Set14 B100 Urban100
VDSR X2 37.53 33.05 31.90 30.77
EDSR X2 38.11 33.92 32.32 32.93
RCAN X2 38.27 34.12 32.41 33.34
RDN X2 38.24 34.01 32.34 32.89
OISR-RK3 X2 38.21 33.94 32.36 33.03
RNAN X2 38.17 33.87 32.32 32.73
SAN X2 38.31 34.07 32.42 33.1
HAN X2 38.27 34.16 32.41 33.35
IGNN X2 38.24 34.07 32.41 33.23
IPT (ours) X2 38.37 34.43 32.48 33.76
Method Scale Set5 Set14 B100 Urban100
VDSR X3 33.67 29.78 28.83 27.14
EDSR X3 34.65 30.52 29.25 28.80
RCAN X3 34.74 30.65 29.32 29.09
RDN X3 34.71 30.57 29.26 28.80
OISR-RK3 X3 34.72 30.57 29.29 28.95
RNAN X3 34.66 30.52 29.26 28.75
SAN X3 34.75 30.59 29.33 28.93
HAN X3 34.75 30.67 29.32 29.10
IGNN X3 34.72 30.66 29.31 29.03
IPT (ours) X3 34.81 30.85 29.38 29.49
Method Scale Set5 Set14 B100 Urban100
VDSR X4 31.35 28.02 27.29 25.18
EDSR X4 32.46 28.80 27.71 26.64
RCAN X4 32.63 28.87 27.77 26.82
SAN X4 32.64 28.92 27.78 26.79
RDN X4 32.47 28.81 27.72 26.61
OISR-RK3 X4 32.53 28.86 27.75 26.79
RNAN X4 32.49 28.83 27.72 26.61
HAN X4 32.64 28.90 27.80 26.85
IGNN X4 32.57 28.85 27.77 26.84
IPT (ours) X4 32.64 29.01 27.82 27.26
  • Super-resolution result

  • Denoising result

  • Derain result

Citation

@misc{chen2020pre,
      title={Pre-Trained Image Processing Transformer}, 
      author={Chen, Hanting and Wang, Yunhe and Guo, Tianyu and Xu, Chang and Deng, Yiping and Liu, Zhenhua and Ma, Siwei and Xu, Chunjing and Xu, Chao and Gao, Wen},
      year={2021},
      eprint={2012.00364},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Acknowledgement

Owner
HUAWEI Noah's Ark Lab
Working with and contributing to the open source community in data mining, artificial intelligence, and related fields.
HUAWEI Noah's Ark Lab
Using LSTM to detect spoofing attacks in an Air-Ground network

Using LSTM to detect spoofing attacks in an Air-Ground network Specifications IDE: Spider Packages: Tensorflow 2.1.0 Keras NumPy Scikit-learn Matplotl

Tiep M. H. 1 Nov 20, 2021
This is the winning solution of the Endocv-2021 grand challange.

Endocv2021-winner [Paper] This is the winning solution of the Endocv-2021 grand challange. Dependencies pytorch # tested with 1.7 and 1.8 torchvision

Vajira Thambawita 14 Dec 03, 2022
Deconfounding Temporal Autoencoder: Estimating Treatment Effects over Time Using Noisy Proxies

Deconfounding Temporal Autoencoder (DTA) This is a repository for the paper "Deconfounding Temporal Autoencoder: Estimating Treatment Effects over Tim

Milan Kuzmanovic 3 Feb 04, 2022
Open source Python module for computer vision

About PCV PCV is a pure Python library for computer vision based on the book "Programming Computer Vision with Python" by Jan Erik Solem. More details

Jan Erik Solem 1.9k Jan 06, 2023
Official Pytorch implementation for video neural representation (NeRV)

NeRV: Neural Representations for Videos (NeurIPS 2021) Project Page | Paper | UVG Data Hao Chen, Bo He, Hanyu Wang, Yixuan Ren, Ser-Nam Lim, Abhinav S

hao 214 Dec 28, 2022
Neural models of common sense. 🤖

Unicorn on Rainbow Neural models of common sense. This repository is for the paper: Unicorn on Rainbow: A Universal Commonsense Reasoning Model on a N

AI2 60 Jan 05, 2023
Machine Learning with JAX Tutorials

The purpose of this repo is to make it easy to get started with JAX. It contains my "Machine Learning with JAX" series of tutorials (YouTube videos and Jupyter Notebooks) as well as the content I fou

Aleksa Gordić 372 Dec 28, 2022
Poisson Surface Reconstruction for LiDAR Odometry and Mapping

Poisson Surface Reconstruction for LiDAR Odometry and Mapping Surfels TSDF Our Approach Table: Qualitative comparison between the different mapping te

Photogrammetry & Robotics Bonn 305 Dec 21, 2022
[NeurIPS 2021 Spotlight] Code for Learning to Compose Visual Relations

Learning to Compose Visual Relations This is the pytorch codebase for the NeurIPS 2021 Spotlight paper Learning to Compose Visual Relations. Demo Imag

Nan Liu 88 Jan 04, 2023
The official PyTorch code for 'DER: Dynamically Expandable Representation for Class Incremental Learning' accepted by CVPR2021

DER.ClassIL.Pytorch This repo is the official implementation of DER: Dynamically Expandable Representation for Class Incremental Learning (CVPR 2021)

rhyssiyan 108 Jan 01, 2023
Tooling for GANs in TensorFlow

TensorFlow-GAN (TF-GAN) TF-GAN is a lightweight library for training and evaluating Generative Adversarial Networks (GANs). Can be installed with pip

803 Dec 24, 2022
A robust pointcloud registration pipeline based on correlation.

PHASER: A Robust and Correspondence-Free Global Pointcloud Registration Ubuntu 18.04+ROS Melodic: Overview Pointcloud registration using correspondenc

ETHZ ASL 101 Dec 01, 2022
[CVPR 2021] A Peek Into the Reasoning of Neural Networks: Interpreting with Structural Visual Concepts

Visual-Reasoning-eXplanation [CVPR 2021 A Peek Into the Reasoning of Neural Networks: Interpreting with Structural Visual Concepts] Project Page | Vid

Andy_Ge 54 Dec 21, 2022
Unified MultiWOZ evaluation scripts for the context-to-response task.

MultiWOZ Context-to-Response Evaluation Standardized and easy to use Inform, Success, BLEU ~ See the paper ~ Easy-to-use scripts for standardized eval

Tomáš Nekvinda 38 Dec 13, 2022
Opinionated code formatter, just like Python's black code formatter but for Beancount

beancount-black Opinionated code formatter, just like Python's black code formatter but for Beancount Try it out online here Features MIT licensed - b

Launch Platform 16 Oct 11, 2022
Reproduction of Vision Transformer in Tensorflow2. Train from scratch and Finetune.

Vision Transformer(ViT) in Tensorflow2 Tensorflow2 implementation of the Vision Transformer(ViT). This repository is for An image is worth 16x16 words

sungjun lee 42 Dec 27, 2022
Hierarchical Time Series Forecasting with a familiar API

scikit-hts Hierarchical Time Series with a familiar API. This is the result from not having found any good implementations of HTS on-line, and my work

Carlo Mazzaferro 204 Dec 17, 2022
SAMO: Streaming Architecture Mapping Optimisation

SAMO: Streaming Architecture Mapping Optimiser The SAMO framework provides a method of optimising the mapping of a Convolutional Neural Network model

Alexander Montgomerie-Corcoran 20 Dec 10, 2022
ZeroVL - The official implementation of ZeroVL

This repository contains source code necessary to reproduce the results presente

31 Nov 04, 2022
Backend code to use MCPI's python API to make infinite worlds with custom generation

inf-mcpi Backend code to use MCPI's python API to make infinite worlds with custom generation Does not save player-placed blocks! Generation is still

5 Oct 04, 2022