[CVPR 2022] Official PyTorch Implementation for "Reference-based Video Super-Resolution Using Multi-Camera Video Triplets"

Reference-based Video Super-Resolution (RefVSR)
_{Official PyTorch Implementation of the CVPR 2022 Paper}
_{Project | arXiv | RealMCVSR Dataset}

This repo contains training and evaluation code for the following paper:

Reference-based Video Super-Resolution Using Multi-Camera Video Triplets
Junyong Lee, Myeonghee Lee, Sunghyun Cho, and Seungyong Lee
POSTECH
IEEE Computer Vision and Pattern Recognition (CVPR) 2022

Getting Started

Prerequisites

Tested environment

1. Environment setup

$ git clone https://github.com/codeslake/RefVSR.git
$ cd RefVSR

$ conda create -y name RefVSR python 3.8 && conda activate RefVSR

# Install pytorch
$ conda install pytorch==1.11.0 torchvision==0.12.0 torchaudio==0.11.0 cudatoolkit=11.3 -c pytorch

# Install requirements
$ ./install/install_cudnn113.sh

It is recommended to install PyTorch >= 1.10.0 with CUDA11.3 for running small models using Pytorch AMP, because PyTorch < 1.10.0 is known to have a problem in running amp with torch.nn.functional.grid_sample() needed for inter-frame alignment.

For the other models, PyTorch 1.8.0 is verified. To install requirements with PyTorch 1.8.0, run ./install/install_cudnn102.sh for CUDA10.2 or ./install/install_cudnn111.sh for CUDA11.1

2. Dataset

Download and unzip the proposed RealMCVSR dataset under [DATA_OFFSET]:

[DATA_OFFSET]
    └── RealMCVSR
        ├── train                       # a training set
        │   ├── HR                      # videos in original resolution 
        │   │   ├── T                   # telephoto videos
        │   │   │   ├── 0002            # a video clip 
        │   │   │   │   ├── 0000.png    # a video frame
        │   │   │   │   └── ...         
        │   │   │   └── ...            
        │   │   ├── UW                  # ultra-wide-angle videos
        │   │   └── W                   # wide-angle videos
        │   ├── LRx2                    # 2x downsampled videos
        │   └── LRx4                    # 4x downsampled videos
        ├── test                        # a testing set
        └── valid                       # a validation set

[DATA_OFFSET] can be modified with --data_offset option in the evaluation script.

3. Pre-trained models

Download pretrained weights (Google Drive | Dropbox) under ./ckpt/:

RefVSR
├── ...
├── ./ckpt
│   ├── edvr.pytorch                    # weights of EDVR modules used for training Ours-IR
│   ├── SPyNet.pytorch                  # weights of SpyNet used for inter-frame alignment
│   ├── RefVSR_small_L1.pytorch         # weights of Ours-small-L1
│   ├── RefVSR_small_MFID.pytorch       # weights of Ours-small
│   ├── RefVSR_small_MFID_8K.pytorch    # weights of Ours-small-8K
│   ├── RefVSR_L1.pytorch               # weights of Ours-L1
│   ├── RefVSR_MFID.pytorch             # weights of Ours
│   ├── RefVSR_MFID_8K.pytorch.pytorch  # weights of Ours-8K
│   ├── RefVSR_IR_MFID.pytorch          # weights of Ours-IR
│   └── RefVSR_IR_L1.pytorch            # weights of Ours-IR-L1
└── ...

For the testing and training of your own model, it is recommended to go through wiki pages for
logging and details of testing and training scripts before running the scripts.

Testing models of CVPR 2022

Evaluation script

CUDA_VISIBLE_DEVICES=0 python -B run.py \
    --mode _RefVSR_MFID_8K \                       # name of the model to evaluate
    --config config_RefVSR_MFID_8K \               # name of the configuration file in ./configs
    --data RealMCVSR \                             # name of the dataset
    --ckpt_abs_name ckpt/RefVSR_MFID_8K.pytorch \  # absolute path for the checkpoint
    --data_offset /data1/junyonglee \              # offset path for the dataset (e.g., [DATA_OFFSET]/RealMCVSR)
    --output_offset ./result                       # offset path for the outputs

Real-world 4x video super-resolution (HD to 8K resolution)

# Evaluating the model 'Ours' (Fig. 8 in the main paper).
$ ./scripts_eval/eval_RefVSR_MFID_8K.sh

# Evaluating the model 'Ours-small'.
$ ./scripts_eval/eval_amp_RefVSR_small_MFID_8K.sh

For the model Ours, we use Nvidia Quadro 8000 (48GB) in practice.

For the model Ours-small,

We use Nvidia GeForce RTX 3090 (24GB) in practice.

It is the model Ours-small in Table 2 further trained with the adaptation stage.

The model requires PyTorch >= 1.10.0 with CUDA 11.3 for using PyTorch AMP.

Quantitative evaluation (models trained with the pre-training stage)

## Table 2 in the main paper
# Ours
$ ./scripts_eval/eval_RefVSR_MFID.sh

# Ours-l1
$ ./scripts_eval/eval_RefVSR_L1.sh

# Ours-small
$ ./scripts_eval/eval_amp_RefVSR_small_MFID.sh

# Ours-small-l1
$ ./scripts_eval/eval_amp_RefVSR_small_L1.sh

# Ours-IR
$ ./scripts_eval/eval_RefVSR_IR_MFID.sh

# Ours-IR-l1
$ ./scripts_eval/eval_RefVSR_IR_L1.sh

For all models, we use Nvidia GeForce RTX 3090 (24GB) in practice.

To obtain quantitative results measured with the varying FoV ranges as shown in Table 3 of the main paper, modify the script and specify --eval_mode FOV.

Training models with the proposed two-stage training strategy

The pre-training stage (Sec. 4.1)

# To train the model 'Ours':
$ ./scripts_train/train_RefVSR_MFID.sh

# To train the model 'Ours-small':
$ ./scripts_train/train_amp_RefVSR_small_MFID.sh

For both models, we use Nvidia GeForce RTX 3090 (24GB) in practice.

Be sure to modify the script file and set proper GPU devices, number of GPUs, and batch size by modifying CUDA_VISIBLE_DEVICES, --nproc_per_node and -b options, respectively.

We use the total batch size of 4, the multiplication of numbers in options --nproc_per_node and -b.

The adaptation stage (Sec. 4.2)

Set the path of the checkpoint of a model trained with the pre-training stage.
For the model Ours-small, for example,
```
$ vim ./scripts_train/train_amp_RefVSR_small_MFID_8K.sh
```
```
#!/bin/bash

py3clean ./
CUDA_VISIBLE_DEVICES=0,1 ...
    ...
    -ra [LOG_OFFSET]/RefVSR_CVPR2022/amp_RefVSR_small_MFID/checkpoint/train/epoch/ckpt/amp_RefVSR_small_MFID_00xxx.pytorch
    ...
```
Checkpoint path is [LOG_OFFSET]/RefVSR_CVPR2022/[mode]/checkpoint/train/epoch/[mode]_00xxx.pytorch.
- PSNR is recorded in [LOG_OFFSET]/RefVSR_CVPR2022/[mode]/checkpoint/train/epoch/checkpoint.txt.
- [LOG_OFFSET] can be modified with config.log_offset in ./configs/config.py.
- [mode] is the name of the model assigned with --mode in the script used for the pre-training stage.
Start the adaptation stage.
```
# Training the model 'Ours'.
$ ./scripts_train/train_RefVSR_MFID_8K.sh

# Training the model 'Ours-small'.
$ ./scripts_train/train_amp_RefVSR_small_MFID_8K.sh
```
For the model Ours, we use Nvidia Quadro 8000 (48GB) in practice.

For the model Ours-small, we use Nvidia GeForce RTX 3090 (24GB) in practice.
Be sure to modify the script file to set proper GPU devices, number of GPUs, and batch size by modifying CUDA_VISIBLE_DEVICES, --nproc_per_node and -b options, respectively.
- We use the total batch size of 2, the multiplication of numbers in options --nproc_per_node and -b.

Training models with L1 loss

# To train the model 'Ours-l1':
$ ./scripts_train/train_RefVSR_L1.sh

# To train the model 'Ours-small-l1':
$ ./scripts_train/train_amp_RefVSR_small_L1.sh

# To train the model 'Ours-IR-l1':
$ ./scripts_train/train_amp_RefVSR_small_L1.sh

For all models, we use Nvidia GeForce RTX 3090 (24GB) in practice.

Be sure to modify the script file and set proper GPU devices, number of GPUs, and batch size by modifying CUDA_VISIBLE_DEVICES, --nproc_per_node and -b options, respectively.

We use the total batch size of 8, the multiplication of numbers in options --nproc_per_node and -b.

Wiki

Contact

Open an issue for any inquiries. You may also have contact with [email protected]

License

This software is being made available under the terms in the LICENSE file. Any exemptions to these terms require a license from the Pohang University of Science and Technology.

Acknowledgment

We thank the authors of BasicVSR and DCSR for sharing their code.

BibTeX

@InProceedings{Lee2022RefVSR,
    author    = {Junyong Lee and Myeonghee Lee and Sunghyun Cho and Seungyong Lee},
    title     = {Reference-based Video Super-Resolution Using Multi-Camera Video Triplets},
    booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    year      = {2022}
}

[CVPR 2022] Official PyTorch Implementation for "Reference-based Video Super-Resolution Using Multi-Camera Video Triplets"

Related tags

Overview

Reference-based Video Super-Resolution (RefVSR)Official PyTorch Implementation of the CVPR 2022 PaperProject | arXiv | RealMCVSR Dataset

Getting Started

Prerequisites

1. Environment setup

2. Dataset

3. Pre-trained models

Testing models of CVPR 2022

Evaluation script

Real-world 4x video super-resolution (HD to 8K resolution)

Quantitative evaluation (models trained with the pre-training stage)

Training models with the proposed two-stage training strategy

The pre-training stage (Sec. 4.1)

The adaptation stage (Sec. 4.2)

Training models with L1 loss

Wiki

Contact

License

Acknowledgment

BibTeX

Owner

Junyong Lee

A set of tools for Namebase and HNS

Imposter-detector-2022 - HackED 2022 Team 3IQ - 2022 Imposter Detector

[ACL-IJCNLP 2021] Improving Named Entity Recognition by External Context Retrieving and Cooperative Learning

SIMULEVAL A General Evaluation Toolkit for Simultaneous Translation

一个多模态内容理解算法框架，其中包含数据处理、预训练模型、常见模型以及模型加速等模块。

A full-fledged version of Pix2Seq

The Self-Supervised Learner can be used to train a classifier with fewer labeled examples needed using self-supervised learning.

Small little script to scrape, parse and check for active tor nodes. Can be used as proxies.

Individual Tree Crown classification on WorldView-2 Images using Autoencoder -- Group 9 Weak learners - Final Project (Machine Learning 2020 Course)

Convolutional Neural Networks

Densely Connected Search Space for More Flexible Neural Architecture Search (CVPR2020)

the official implementation of the paper "Isometric Multi-Shape Matching" (CVPR 2021)

Code implementing "Improving Deep Learning Interpretability by Saliency Guided Training"

A Unified Generative Framework for Various NER Subtasks.

使用yolov5训练自己数据集(详细过程)并通过flask部署

CoRe: Contrastive Recurrent State-Space Models

Kohei's 5th place solution for xview3 challenge

💃 VALSE: A Task-Independent Benchmark for Vision and Language Models Centered on Linguistic Phenomena

A Small and Easy approach to the BraTS2020 dataset (2D Segmentation)

NHS AI Lab Skunkworks project: Long Stayer Risk Stratification

Reference-based Video Super-Resolution (RefVSR)
_{Official PyTorch Implementation of the CVPR 2022 Paper}
_{Project | arXiv | RealMCVSR Dataset}