source code of “Visual Saliency Transformer” (ICCV2021)

Related tags

Deep LearningVST
Overview

Visual Saliency Transformer (VST)

source code for our ICCV 2021 paper “Visual Saliency Transformer” by Nian Liu, Ni Zhang, Kaiyuan Wan, Junwei Han, and Ling Shao.

created by Ni Zhang, email: [email protected]

avatar

Requirement

  1. Pytorch 1.6.0
  2. Torchvison 0.7.0

RGB VST for RGB Salient Object Detection

Data Preparation

Training Set

We use the training set of DUTS to train our VST for RGB SOD. Besides, we follow Egnet to generate contour maps of DUTS trainset for training. You can directly download the generated contour maps DUTS-TR-Contour from [baidu pan fetch code: ow76 | Google drive] and put it into RGB_VST/Data folder.

Testing Set

We use the testing set of DUTS, ECSSD, HKU-IS, PASCAL-S, DUT-O, and SOD to test our VST. After Downloading, put them into RGB_VST/Data folder.

Your RGB_VST/Data folder should look like this:

-- Data
   |-- DUTS
   |   |-- DUTS-TR
   |   |-- | DUTS-TR-Image
   |   |-- | DUTS-TR-Mask
   |   |-- | DUTS-TR-Contour
   |   |-- DUTS-TE
   |   |-- | DUTS-TE-Image
   |   |-- | DUTS-TE-Mask
   |-- ECSSD
   |   |--images
   |   |--GT
   ...

Training, Testing, and Evaluation

  1. cd RGB_VST
  2. Download the pretrained T2T-ViT_t-14 model [baidu pan fetch code: 2u34 | Google drive] and put it into pretrained_model/ folder.
  3. Run python train_test_eval.py --Training True --Testing True --Evaluation True for training, testing, and evaluation. The predictions will be in preds/ folder and the evaluation results will be in result.txt file.

Testing on Our Pretrained RGB VST Model

  1. cd RGB_VST
  2. Download our pretrained RGB_VST.pth[baidu pan fetch code: pe54 | Google drive] and then put it in checkpoint/ folder.
  3. Run python train_test_eval.py --Testing True --Evaluation True for testing and evaluation. The predictions will be in preds/ folder and the evaluation results will be in result.txt file.

Our saliency maps can be downloaded from [baidu pan fetch code: 92t0 | Google drive].

SOTA Saliency Maps for Comparison

The saliency maps of the state-of-the-art methods in our paper can be downloaded from [baidu pan fetch code: de4k | Google drive].

RGB-D VST for RGB-D Salient Object Detection

Data Preparation

Training Set

We use 1,485 images from NJUD, 700 images from NLPR, and 800 images from DUTLF-Depth to train our VST for RGB-D SOD. Besides, we follow Egnet to generate corresponding contour maps for training. You can directly download the whole training set from here [baidu pan fetch code: 7vsw | Google drive] and put it into RGBD_VST/Data folder.

Testing Set

NJUD [baidu pan fetch code: 7mrn | Google drive]
NLPR [baidu pan fetch code: tqqm | Google drive]
DUTLF-Depth [baidu pan fetch code: 9jac | Google drive]
STERE [baidu pan fetch code: 93hl | Google drive]
LFSD [baidu pan fetch code: l2g4 | Google drive]
RGBD135 [baidu pan fetch code: apzb | Google drive]
SSD [baidu pan fetch code: j3v0 | Google drive]
SIP [baidu pan fetch code: q0j5 | Google drive]
ReDWeb-S

After Downloading, put them into RGBD_VST/Data folder.

Your RGBD_VST/Data folder should look like this:

-- Data
   |-- NJUD
   |   |-- trainset
   |   |-- | RGB
   |   |-- | depth
   |   |-- | GT
   |   |-- | contour
   |   |-- testset
   |   |-- | RGB
   |   |-- | depth
   |   |-- | GT
   |-- STERE
   |   |-- RGB
   |   |-- depth
   |   |-- GT
   ...

Training, Testing, and Evaluation

  1. cd RGBD_VST
  2. Download the pretrained T2T-ViT_t-14 model [baidu pan fetch code: 2u34 | Google drive] and put it into pretrained_model/ folder.
  3. Run python train_test_eval.py --Training True --Testing True --Evaluation True for training, testing, and evaluation. The predictions will be in preds/ folder and the evaluation results will be in result.txt file.

Testing on Our Pretrained RGB-D VST Model

  1. cd RGBD_VST
  2. Download our pretrained RGBD_VST.pth[baidu pan fetch code: zt0v | Google drive] and then put it in checkpoint/ folder.
  3. Run python train_test_eval.py --Testing True --Evaluation True for testing and evaluation. The predictions will be in preds/ folder and the evaluation results will be in result.txt file.

Our saliency maps can be downloaded from [baidu pan fetch code: jovk | Google drive].

SOTA Saliency Maps for Comparison

The saliency maps of the state-of-the-art methods in our paper can be downloaded from [baidu pan fetch code: i1we | Google drive].

Acknowledgement

We thank the authors of Egnet for providing codes of generating contour maps. We also thank Zhao Zhang for providing the efficient evaluation tool.

Citation

If you think our work is helpful, please cite

@inproceedings{liu2021VST, 
  title={Visual Saliency Transformer}, 
  author={Liu, Nian and Zhang, Ni and Han, Junwei and Shao, Ling},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  year={2021}
}
Owner
Ni Zhang PhD student
Nonuniform-to-Uniform Quantization: Towards Accurate Quantization via Generalized Straight-Through Estimation. In CVPR 2022.

Nonuniform-to-Uniform Quantization This repository contains the training code of N2UQ introduced in our CVPR 2022 paper: "Nonuniform-to-Uniform Quanti

Zechun Liu 60 Dec 28, 2022
AWS provides a Python SDK, "Boto3" ,which can be used to access the AWS-account from the local.

Boto3 - The AWS SDK for Python Boto3 is the Amazon Web Services (AWS) Software Development Kit (SDK) for Python, which allows Python developers to wri

Shreyas Srivastava 1 Oct 25, 2021
CMT: Convolutional Neural Networks Meet Vision Transformers

CMT: Convolutional Neural Networks Meet Vision Transformers [arxiv] 1. Introduction This repo is the CMT model which impelement with pytorch, no refer

FlyEgle 83 Dec 30, 2022
An offline deep reinforcement learning library

d3rlpy: An offline deep reinforcement learning library d3rlpy is an offline deep reinforcement learning library for practitioners and researchers. imp

Takuma Seno 817 Jan 02, 2023
nextPARS, a novel Illumina-based implementation of in-vitro parallel probing of RNA structures.

nextPARS, a novel Illumina-based implementation of in-vitro parallel probing of RNA structures. Here you will find the scripts necessary to produce th

Jesse Willis 0 Jan 20, 2022
CATE: Computation-aware Neural Architecture Encoding with Transformers

CATE: Computation-aware Neural Architecture Encoding with Transformers Code for paper: CATE: Computation-aware Neural Architecture Encoding with Trans

16 Dec 27, 2022
Cross-modal Deep Face Normals with Deactivable Skip Connections

Cross-modal Deep Face Normals with Deactivable Skip Connections Victoria Fernández Abrevaya*, Adnane Boukhayma*, Philip H. S. Torr, Edmond Boyer (*Equ

72 Nov 27, 2022
[ICLR 2021] Rank the Episodes: A Simple Approach for Exploration in Procedurally-Generated Environments.

[ICLR 2021] RAPID: A Simple Approach for Exploration in Reinforcement Learning This is the Tensorflow implementation of ICLR 2021 paper Rank the Episo

Daochen Zha 48 Nov 21, 2022
This repository contains the code for: RerrFact model for SciVer shared task

RerrFact This repository contains the code for: RerrFact model for SciVer shared task. Setup for Inference 1. Download SciFact database Download the S

Ashish Rana 1 May 22, 2022
Python implementation of "Elliptic Fourier Features of a Closed Contour"

PyEFD An Python/NumPy implementation of a method for approximating a contour with a Fourier series, as described in [1]. Installation pip install pyef

Henrik Blidh 71 Dec 09, 2022
NeoDTI: Neural integration of neighbor information from a heterogeneous network for discovering new drug-target interactions

NeoDTI NeoDTI: Neural integration of neighbor information from a heterogeneous network for discovering new drug-target interactions (Bioinformatics).

62 Nov 26, 2022
Frigate - NVR With Realtime Object Detection for IP Cameras

A complete and local NVR designed for HomeAssistant with AI object detection. Uses OpenCV and Tensorflow to perform realtime object detection locally for IP cameras.

Blake Blackshear 6.4k Dec 31, 2022
Experiment about Deep Person Re-identification with EfficientNet-v2

We evaluated the baseline with Resnet50 and Efficienet-v2 without using pretrained models. Also Resnet50-IBN-A and Efficientnet-v2 using pretrained on ImageNet. We used two datasets: Market-1501 and

lan.nguyen2k 77 Jan 03, 2023
Official implementation for paper Knowledge Bridging for Empathetic Dialogue Generation (AAAI 2021).

Knowledge Bridging for Empathetic Dialogue Generation This is the official implementation for paper Knowledge Bridging for Empathetic Dialogue Generat

Qintong Li 50 Dec 20, 2022
NeurIPS-2021: Neural Auto-Curricula in Two-Player Zero-Sum Games.

NAC Official PyTorch implementation of NAC from the paper: Neural Auto-Curricula in Two-Player Zero-Sum Games. We release code for: Gradient based ora

Xidong Feng 19 Nov 11, 2022
Official source code to CVPR'20 paper, "When2com: Multi-Agent Perception via Communication Graph Grouping"

When2com: Multi-Agent Perception via Communication Graph Grouping This is the PyTorch implementation of our paper: When2com: Multi-Agent Perception vi

34 Nov 09, 2022
Official Pytorch implementation of Meta Internal Learning

Official Pytorch implementation of Meta Internal Learning

10 Aug 24, 2022
Build tensorflow keras model pipelines in a single line of code. Created by Ram Seshadri. Collaborators welcome. Permission granted upon request.

deep_autoviml Build keras pipelines and models in a single line of code! Table of Contents Motivation How it works Technology Install Usage API Image

AutoViz and Auto_ViML 102 Dec 17, 2022
A computer vision pipeline to identify the "icons" in Christian paintings

Christian-Iconography A computer vision pipeline to identify the "icons" in Christian paintings. A bit about iconography. Iconography is related to id

Rishab Mudliar 3 Jul 30, 2022
Adaptive Dropblock Enhanced GenerativeAdversarial Networks for Hyperspectral Image Classification

This repo holds the codes of our paper: Adaptive Dropblock Enhanced GenerativeAdversarial Networks for Hyperspectral Image Classification, which is ac

Feng Gao 17 Dec 28, 2022