TRIQ implementation

Overview

TRIQ Implementation

TF-Keras implementation of TRIQ as described in Transformer for Image Quality Assessment.

Installation

  1. Clone this repository.
  2. Install required Python packages. The code is developed by PyCharm in Python 3.7. The requirements.txt document is generated by PyCharm, and the code should also be run in latest versions of the packages.

Training a model

An example of training TRIQ can be seen in train/train_triq.py. Argparser should be used, but the authors prefer to use dictionary with parameters being defined. It is easy to convert to take arguments. In principle, the following parameters can be defined:

args = {}
args['multi_gpu'] = 0 # gpu setting, set to 1 for using multiple GPUs
args['gpu'] = 0  # If having multiple GPUs, specify which GPU to use

args['result_folder'] = r'..\databases\experiments' # Define result path
args['n_quality_levels'] = 5  # Choose between 1 (MOS prediction) and 5 (distribution prediction)

args['transformer_params'] = [2, 32, 8, 64]

args['train_folders'] =  # Define folders containing training images
    [
    r'..\databases\train\koniq_normal',
    r'..\databases\train\koniq_small',
    r'..\databases\train\live'
    ]
args['val_folders'] =  # Define folders containing testing images
    [
    r'..\databases\val\koniq_normal',
    r'..\databases\val\koniq_small',
    r'..\databases\val\live'
    ]
args['koniq_mos_file'] = r'..\databases\koniq10k_images_scores.csv'  # MOS (distribution of scores) file for KonIQ database
args['live_mos_file'] = r'..\databases\live_mos.csv'   # MOS (standard distribution of scores) file for LIVE-wild database

args['backbone'] = 'resnet50' # Choose from ['resnet50', 'vgg16']
args['weights'] = r'...\pretrained_weights\resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5'  # Define the path of ImageNet pretrained weights
args['initial_epoch'] = 0  # Define initial epoch for use in fine-tune

args['lr_base'] = 1e-4 / 2  # Define the back learning rate in warmup and rate decay approach
args['lr_schedule'] = True  # Choose between True and False, indicating if learning rate schedule should be used or not
args['batch_size'] = 32  # Batch size, should choose to fit in the GPU memory
args['epochs'] = 120  # Maximal epoch number, can set early stop in the callback or not

args['image_aug'] = True # Choose between True and False, indicating if image augmentation should be used or not

Predict image quality using the trained model

After TRIQ has been trained, and the weights have been stored in h5 file, it can be used to predict image quality with arbitrary sizes,

    args = {}
    args['n_quality_levels'] = 5
    args['backbone'] = 'resnet50'
    args['weights'] = r'..\\TRIQ.h5'
    model = create_triq_model(n_quality_levels=args['n_quality_levels'],
                              backbone=args['backbone'],])
    model.load_weights(args['weights'])

And then use ModelEvaluation to predict quality of image set.

In the "examples" folder, an example script examples\image_quality_prediction.py is provided to use the trained weights to predict quality of example images. In the "train" folder, an example script train\validation.py is provided to use the trained weights to predict quality of images in folders.

A potential issue is image shape mismatch. For example, if an image is too large, then line 146 in transformer_iqa.py should be changed to increase the pooling size. For example, it can be changed to self.pooling_small = MaxPool2D(pool_size=(4, 4)) or even larger.

Prepare datasets for model training

This work uses two publicly available databases: KonIQ-10k KonIQ-10k: An ecologically valid database for deep learning of blind image quality assessment by V. Hosu, H. Lin, T. Sziranyi, and D. Saupe; and LIVE-wild Massive online crowdsourced study of subjective and objective picture quality by D. Ghadiyaram, and A.C. Bovik

  1. The two databases were merged, and then split to training and testing sets. Please see README in databases for details.

  2. Make MOS files (note: do NOT include head line):

    For database with score distribution available, the MOS file is like this (koniq format):

        image path, voter number of quality scale 1, voter number of quality scale 2, voter number of quality scale 3, voter number of quality scale 4, voter number of quality scale 5, MOS or Z-score
        10004473376.jpg,0,0,25,73,7,3.828571429
        10007357496.jpg,0,3,45,47,1,3.479166667
        10007903636.jpg,1,0,20,73,2,3.78125
        10009096245.jpg,0,0,21,75,13,3.926605505
    

    For database with standard deviation available, the MOS file is like this (live format):

        image path, standard deviation, MOS or Z-score
        t1.bmp,18.3762,63.9634
        t2.bmp,13.6514,25.3353
        t3.bmp,18.9246,48.9366
        t4.bmp,18.2414,35.8863
    

    The format of MOS file ('koniq' or 'live') and the format of MOS or Z-score ('mos' or 'z_score') should also be specified in misc/imageset_handler/get_image_scores.

  3. In the train script in train/train_triq.py the folders containing training and testing images are provided.

  4. Pretrained ImageNet weights can be downloaded (see README in.\pretrained_weights) and pointed to in the train script.

Trained TRIQ weights

TRIQ has been trained on KonIQ-10k and LIVE-wild databases, and the weights file can be downloaded here.

State-of-the-art models

Other three models are also included in the work. The original implementations of metrics are employed, and they can be found below.

Koncept512 KonIQ-10k: An ecologically valid database for deep learning of blind image quality assessment

SGDNet SGDNet: An end-to-end saliency-guided deep neural network for no-reference image quality assessment

CaHDC End-to-end blind image quality prediction with cascaded deep neural network

Comparison results

We have conducted several experiments to evaluate the performance of TRIQ, please see results.pdf for detailed results.

Error report

In case errors/exceptions are encountered, please first check all the paths. After fixing the path isse, please report any errors in Issues.

FAQ

  • To be added

ViT (Vision Transformer) for IQA

This work is heavily inspired by ViT An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. The module vit_iqa contains implementation of ViT for IQA, and mainly followed the implementation of ViT-PyTorch. Pretrained ViT weights can be downloaded here.

Owner
Junyong You
Junyong You
Example for AUAV 2022 with obstacle avoidance.

AUAV 2022 Sample This is a sample PX4 based quadrotor path planning framework based on Ubuntu 20.04 and ROS noetic for the IEEE Autonomous UAS 2022 co

James Goppert 11 Sep 16, 2022
Unet network with mean teacher for altrasound image segmentation

Unet network with mean teacher for altrasound image segmentation

5 Nov 21, 2022
SmoothGrad implementation in PyTorch

SmoothGrad implementation in PyTorch PyTorch implementation of SmoothGrad: removing noise by adding noise. Vanilla Gradients SmoothGrad Guided backpro

SSKH 143 Jan 05, 2023
A Model for Natural Language Attack on Text Classification and Inference

TextFooler A Model for Natural Language Attack on Text Classification and Inference This is the source code for the paper: Jin, Di, et al. "Is BERT Re

Di Jin 418 Dec 16, 2022
Microsoft Cognitive Toolkit (CNTK), an open source deep-learning toolkit

CNTK Chat Windows build status Linux build status The Microsoft Cognitive Toolkit (https://cntk.ai) is a unified deep learning toolkit that describes

Microsoft 17.3k Dec 29, 2022
Multitask Learning Strengthens Adversarial Robustness

Multitask Learning Strengthens Adversarial Robustness

Columbia University 15 Jun 10, 2022
A Python library for Deep Probabilistic Modeling

Abstract DeeProb-kit is a Python library that implements deep probabilistic models such as various kinds of Sum-Product Networks, Normalizing Flows an

DeeProb-org 46 Dec 26, 2022
Code for "Long-tailed Distribution Adaptation"

Long-tailed Distribution Adaptation (Accepted in ACM MM2021) This project is built upon BBN. Installation pip install -r requirements.txt Usage Traini

Zhiliang Peng 10 May 18, 2022
KUIELAB-MDX-Net got the 2nd place on the Leaderboard A and the 3rd place on the Leaderboard B in the MDX-Challenge ISMIR 2021

KUIELAB-MDX-Net got the 2nd place on the Leaderboard A and the 3rd place on the Leaderboard B in the MDX-Challenge ISMIR 2021

IELab@ Korea University 74 Dec 28, 2022
Official repository for MixFaceNets: Extremely Efficient Face Recognition Networks

MixFaceNets This is the official repository of the paper: MixFaceNets: Extremely Efficient Face Recognition Networks. (Accepted in IJCB2021) https://i

Fadi Boutros 51 Dec 13, 2022
This is the official PyTorch implementation of the CVPR 2020 paper "TransMoMo: Invariance-Driven Unsupervised Video Motion Retargeting".

TransMoMo: Invariance-Driven Unsupervised Video Motion Retargeting Project Page | YouTube | Paper This is the official PyTorch implementation of the C

Zhuoqian Yang 330 Dec 11, 2022
MetaTTE: a Meta-Learning Based Travel Time Estimation Model for Multi-city Scenarios

MetaTTE: a Meta-Learning Based Travel Time Estimation Model for Multi-city Scenarios This is the official TensorFlow implementation of MetaTTE in the

morningstarwang 4 Dec 14, 2022
Python project to take sound as input and output as RGB + Brightness values suitable for DMX

sound-to-light Python project to take sound as input and output as RGB + Brightness values suitable for DMX Current goals: Get one pixel working: Vary

Bobby Cox 1 Nov 17, 2021
EMNLP 2021 Findings' paper, SCICAP: Generating Captions for Scientific Figures

SCICAP: Scientific Figures Dataset This is the Github repo of the EMNLP 2021 Findings' paper, SCICAP: Generating Captions for Scientific Figures (Hsu

Edward 26 Nov 21, 2022
[SIGGRAPH'22] StyleGAN-XL: Scaling StyleGAN to Large Diverse Datasets

[Project] [PDF] This repository contains code for our SIGGRAPH'22 paper "StyleGAN-XL: Scaling StyleGAN to Large Diverse Datasets" by Axel Sauer, Katja

742 Jan 04, 2023
Julia package for contraction of tensor networks, based on the sweep line algorithm outlined in the paper General tensor network decoding of 2D Pauli codes

Julia package for contraction of tensor networks, based on the sweep line algorithm outlined in the paper General tensor network decoding of 2D Pauli codes

Christopher T. Chubb 35 Dec 21, 2022
This repository contains the code for the paper 'PARM: Paragraph Aggregation Retrieval Model for Dense Document-to-Document Retrieval' published at ECIR'22.

Paragraph Aggregation Retrieval Model (PARM) for Dense Document-to-Document Retrieval This repository contains the code for the paper PARM: A Paragrap

Sophia Althammer 33 Aug 26, 2022
A PaddlePaddle version image model zoo.

Paddle-Image-Models English | 简体中文 A PaddlePaddle version image model zoo. Install Package Install by pip: $ pip install ppim Install by wheel package

AgentMaker 131 Dec 07, 2022
Code for LIGA-Stereo Detector, ICCV'21

LIGA-Stereo Introduction This is the official implementation of the paper LIGA-Stereo: Learning LiDAR Geometry Aware Representations for Stereo-based

Xiaoyang Guo 75 Dec 09, 2022
Code for CMaskTrack R-CNN (proposed in Occluded Video Instance Segmentation)

CMaskTrack R-CNN for OVIS This repo serves as the official code release of the CMaskTrack R-CNN model on the Occluded Video Instance Segmentation data

Q . J . Y 61 Nov 25, 2022