An official repository for Paper "Uformer: A General U-Shaped Transformer for Image Restoration".

Related tags

Deep LearningUformer
Overview

Uformer: A General U-Shaped Transformer for Image Restoration

Zhendong Wang, Xiaodong Cun, Jianmin Bao and Jianzhuang Liu

PWC PWC

Paper: https://arxiv.org/abs/2106.03106

Update:

  • 2021.08.19 Release a pre-trained model(Uformer32)! Add a script for FLOP/GMAC calculation.
  • 2021.07.29 Add a script for testing the pre-trained model on the arbitrary-resolution images.

In this paper, we present Uformer, an effective and efficient Transformer-based architecture, in which we build a hierarchical encoder-decoder network using the Transformer block for image restoration. Uformer has two core designs to make it suitable for this task. The first key element is a local-enhanced window Transformer block, where we use non-overlapping window-based self-attention to reduce the computational requirement and employ the depth-wise convolution in the feed-forward network to further improve its potential for capturing local context. The second key element is that we explore three skip-connection schemes to effectively deliver information from the encoder to the decoder. Powered by these two designs, Uformer enjoys a high capability for capturing useful dependencies for image restoration. Extensive experiments on several image restoration tasks demonstrate the superiority of Uformer, including image denoising, deraining, deblurring and demoireing. We expect that our work will encourage further research to explore Transformer-based architectures for low-level vision tasks.

Uformer

Details

Package dependencies

The project is built with PyTorch 1.7.1, Python3.7, CUDA10.1. For package dependencies, you can install them by:

pip3 install -r requirements.txt

Pretrained model

Data preparation

Denoising

For training data of SIDD, you can download the SIDD-Medium dataset from the official url. Then generate training patches for training by:

python3 generate_patches_SIDD.py --src_dir ../SIDD_Medium_Srgb/Data --tar_dir ../datasets/denoising/sidd/train

For evaluation, we use the same evaluation data as here, and put it into the dir ../datasets/denoising/sidd/val.

Training

Denoising

To train Uformer32(embed_dim=32) on SIDD, we use 2 V100 GPUs and run for 250 epochs:

python3 ./train.py --arch Uformer --batch_size 32 --gpu '0,1' \
    --train_ps 128 --train_dir ../datasets/denoising/sidd/train --env 32_0705_1 \
    --val_dir ../datasets/denoising/sidd/val --embed_dim 32 --warmup

More configuration can be founded in train.sh.

Evaluation

Denoising

To evaluate Uformer32 on SIDD, you can run:

python3 ./test.py --arch Uformer --batch_size 1 --gpu '0' \
    --input_dir ../datasets/denoising/sidd/val --result_dir YOUR_RESULT_DIR \
    --weights YOUR_PRETRAINED_MODEL_PATH --embed_dim 32 

Computational Cost

We provide a simple script to calculate the flops by ourselves, a simple script has been added in model.py. You can change the configuration and run it via:

python3 model.py

The manual calculation of GMacs in this repo differs slightly from the main paper, but they do not influence the conclusion. We will correct the paper later.

Citation

If you find this project useful in your research, please consider citing:

@article{wang2021uformer,
	title={Uformer: A General U-Shaped Transformer for Image Restoration},
	author={Wang, Zhendong and Cun, Xiaodong and Bao, Jianmin and Liu, Jianzhuang},
	journal={arXiv preprint 2106.03106},
	year={2021}
}

Acknowledgement

This code borrows heavily from MIRNet and SwinTransformer.

Contact

Please contact us if there is any question or suggestion(Zhendong Wang [email protected], Xiaodong Cun [email protected]).

Owner
Zhendong Wang
Deep learning, Computer Vision, Low-level Vision, Image Generation.
Zhendong Wang
Explaining in Style: Training a GAN to explain a classifier in StyleSpace

Explaining in Style: Official TensorFlow Colab Explaining in Style: Training a GAN to explain a classifier in StyleSpace Oran Lang, Yossi Gandelsman,

Google 197 Nov 08, 2022
Codebase for Diffusion Models Beat GANS on Image Synthesis.

Codebase for Diffusion Models Beat GANS on Image Synthesis.

Katherine Crowson 128 Dec 02, 2022
CFC-Net: A Critical Feature Capturing Network for Arbitrary-Oriented Object Detection in Remote Sensing Images

CFC-Net This project hosts the official implementation for the paper: CFC-Net: A Critical Feature Capturing Network for Arbitrary-Oriented Object Dete

ming71 55 Dec 12, 2022
PyTorch implementation of Convolutional Neural Fabrics http://arxiv.org/abs/1606.02492

PyTorch implementation of Convolutional Neural Fabrics arxiv:1606.02492 There are some minor differences: The raw image is first convolved, to obtain

Anuvabh Dutt 25 Dec 22, 2021
U-Net for GBM

My Final Year Project(FYP) In National University of Singapore(NUS) You need Pytorch(stable 1.9.1) Both cuda version and cpu version are OK File Str

PinkR1ver 1 Oct 27, 2021
Implicit Deep Adaptive Design (iDAD)

Implicit Deep Adaptive Design (iDAD) This code supports the NeurIPS paper 'Implicit Deep Adaptive Design: Policy-Based Experimental Design without Lik

Desi 12 Aug 14, 2022
QuadTree Attention for Vision Transformers (ICLR2022)

This repository contains codes for quadtree attention. This repo contains codes for feature matching, image classficiation, object detection and seman

tangshitao 222 Dec 28, 2022
ICON: Implicit Clothed humans Obtained from Normals (CVPR 2022)

ICON: Implicit Clothed humans Obtained from Normals Yuliang Xiu · Jinlong Yang · Dimitrios Tzionas · Michael J. Black CVPR 2022 News 🚩 [2022/04/26] H

Yuliang Xiu 1.1k Jan 04, 2023
Harmonic Memory Networks for Graph Completion

HMemNetworks Code and documentation for Harmonic Memory Networks, a series of models for compositionally assembling representations of graph elements

mlalisse 0 Oct 27, 2021
Pose Transformers: Human Motion Prediction with Non-Autoregressive Transformers

Pose Transformers: Human Motion Prediction with Non-Autoregressive Transformers This is the repo used for human motion prediction with non-autoregress

Idiap Research Institute 26 Dec 14, 2022
The "breathing k-means" algorithm with datasets and example notebooks

The Breathing K-Means Algorithm (with examples) The Breathing K-Means is an approximation algorithm for the k-means problem that (on average) is bette

Bernd Fritzke 75 Nov 17, 2022
Iris prediction model is used to classify iris species created julia's DecisionTree, DataFrames, JLD2, PlotlyJS and Statistics packages.

Iris Species Predictor Iris prediction is used to classify iris species using their sepal length, sepal width, petal length and petal width created us

Siva Prakash 2 Jan 06, 2022
NudeNet: Neural Nets for Nudity Classification, Detection and selective censoring

NudeNet: Neural Nets for Nudity Classification, Detection and selective censoring Uncensored version of the following image can be found at https://i.

notAI.tech 1.1k Dec 29, 2022
A self-supervised learning framework for audio-visual speech

AV-HuBERT (Audio-Visual Hidden Unit BERT) Learning Audio-Visual Speech Representation by Masked Multimodal Cluster Prediction Robust Self-Supervised A

Meta Research 431 Jan 07, 2023
Pytorch implementation of the AAAI 2022 paper "Cross-Domain Empirical Risk Minimization for Unbiased Long-tailed Classification"

[AAAI22] Cross-Domain Empirical Risk Minimization for Unbiased Long-tailed Classification We point out the overlooked unbiasedness in long-tailed clas

PatatiPatata 28 Oct 18, 2022
Code image classification of MNIST dataset using different architectures: simple linear NN, autoencoder, and highway network

Deep Learning for image classification pip install -r http://webia.lip6.fr/~baskiotisn/requirements-amal.txt Train an autoencoder python3 train_auto

Hector Kohler 0 Mar 30, 2022
PyTorch implementation of Pay Attention to MLPs

gMLP PyTorch implementation of Pay Attention to MLPs. Quickstart Clone this repository. git clone https://github.com/jaketae/g-mlp.git Navigate to th

Jake Tae 34 Dec 13, 2022
Code for the paper Relation Prediction as an Auxiliary Training Objective for Improving Multi-Relational Graph Representations (AKBC 2021).

Relation Prediction as an Auxiliary Training Objective for Knowledge Base Completion This repo provides the code for the paper Relation Prediction as

Facebook Research 85 Jan 02, 2023
This repository contains the code for the CVPR 2021 paper "GIRAFFE: Representing Scenes as Compositional Generative Neural Feature Fields"

GIRAFFE: Representing Scenes as Compositional Generative Neural Feature Fields Project Page | Paper | Supplementary | Video | Slides | Blog | Talk If

1.1k Dec 30, 2022
Learning to Prompt for Vision-Language Models.

CoOp Paper: Learning to Prompt for Vision-Language Models Authors: Kaiyang Zhou, Jingkang Yang, Chen Change Loy, Ziwei Liu CoOp (Context Optimization)

Kaiyang 679 Jan 04, 2023