Pyramid Grafting Network for One-Stage High Resolution Saliency Detection. CVPR 2022

Overview

PGNet

Pyramid Grafting Network for One-Stage High Resolution Saliency Detection. CVPR 2022,
CVPR 2022 (arXiv 2204.05041)

Abstract

Recent salient object detection (SOD) methods based on deep neural network have achieved remarkable performance. However, most of existing SOD models designed for low-resolution input perform poorly on high-resolution images due to the contradiction between the sampling depth and the receptive field size. Aiming at resolving this contradiction, we propose a novel one-stage framework called Pyramid Grafting Network (PGNet), using transformer and CNN backbone to extract features from different resolution images independently and then graft the features from transformer branch to CNN branch. An attention-based Cross-Model Grafting Module (CMGM) is proposed to enable CNN branch to combine broken detailed information more holistically, guided by different source feature during decoding process. Moreover, we design an Attention Guided Loss (AGL) to explicitly supervise the attention matrix generated by CMGM to help the network better interact with the attention from different models. We contribute a new Ultra-High-Resolution Saliency Detection dataset UHRSD, containing 5,920 images at 4K-8K resolutions. To our knowledge, it is the largest dataset in both quantity and resolution for high-resolution SOD task, which can be used for training and testing in future research. Sufficient experiments on UHRSD and widely-used SOD datasets demonstrate that our method achieves superior performance compared to the state-of-the-art methods.

Ultra High-Resolution Saliency Detection Dataset

Visual display for sample in UHRSD dataset. Best viewd by clikcing and zooming in.

To relief the lack of high-resolution datasets for SOD, we contribute the Ultra High-Resolution for Saliency Detection (UHRSD) dataset with a total of 5,920 images in 4K(3840 × 2160) or higher resolution, including 4,932 images for training and 988 images for testing. A total of 5,920 images were manually selected from websites (e.g. Flickr Pixabay) with free copyright. Our dataset is diverse in terms of image scenes, with a balance of complex and simple salient objects of various size.

To our knowledge, it is the largest dataset in both quantity and resolution for high-resolution SOD task, which can be used for training and testing in future research.

  • Our UHRSD (Ultra High-Resolution Saliency Detection) Dataset:

We provide the original 4K version and the convenient 2K version of our UHRSD (Ultra High-Resolution Saliency Detection) Dataset for download: Google Drive

Usage

Requirements

  • Python 3.8
  • Pytorch 1.7.1
  • OpenCV
  • Numpy
  • Apex
  • Timm

Directory

The directory should be like this:

-- src 
-- model (saved model)
-- pre (pretrained model)
-- result (saliency maps)
-- data (train dataset and test dataset)
   |-- DUTS-TR+HR
   |   |-- image
   |   |-- mask
   |-- UHRSOD+HRSOD
   |   |--image
   |   |--mask
   ...
   

Train

cd src
./train.sh
  • We implement our method by PyTorch and conduct experiments on 2 NVIDIA 2080Ti GPUs.
  • We adopt pre-trained ResNet-18 and Swin-B-224 as backbone networks, which are saved in PRE folder.
  • We train our method on 3 settings : DUTS-TR, DUTS-TR+HRSOD and UHRSD_TR+HRSOD_TR.
  • After training, the trained models will be saved in MODEL folder.

Test

The trained model can be download here: Google Drive

cd src
python test.py
  • After testing, saliency maps will be saved in RESULT folder

Saliency Map

Trained on DUTS-TR:Google Drive

Trained on DUT+HRSOD:Google Drive

Trained on UHRSD+HRSOD:Google Drive

Citation

@inproceedings{xie2022pyramid,
    author    = {Xie, Chenxi and Xia, Changqun and Ma, Mingcan and Zhao, Zhirui and Chen, Xiaowu and Li, Jia},
    title     = {Pyramid Grafting Network for One-Stage High Resolution Saliency Detection},
    booktitle = {CVPR},
    year      = {2022}
}
Owner
CVTEAM
CVTEAM
Source code of the paper PatchGraph: In-hand tactile tracking with learned surface normals.

PatchGraph This repository contains the source code of the paper PatchGraph: In-hand tactile tracking with learned surface normals. Installation Creat

Paloma Sodhi 11 Dec 15, 2022
PyTorch 1.5 implementation for paper DECOR-GAN: 3D Shape Detailization by Conditional Refinement.

DECOR-GAN PyTorch 1.5 implementation for paper DECOR-GAN: 3D Shape Detailization by Conditional Refinement, Zhiqin Chen, Vladimir G. Kim, Matthew Fish

Zhiqin Chen 72 Dec 31, 2022
Generative Models as a Data Source for Multiview Representation Learning

GenRep Project Page | Paper Generative Models as a Data Source for Multiview Representation Learning Ali Jahanian, Xavier Puig, Yonglong Tian, Phillip

Ali 81 Dec 03, 2022
An AFL implementation with UnTracer (our coverage-guided tracer)

UnTracer-AFL This repository contains an implementation of our prototype coverage-guided tracing framework UnTracer in the popular coverage-guided fuz

113 Dec 17, 2022
Towards Multi-Camera 3D Human Pose Estimation in Wild Environment

PanopticStudio Toolbox This repository has a toolbox to download, process, and visualize the Panoptic Studio (Panoptic) data. Note: Sep-21-2020: Curre

335 Jan 09, 2023
Deep Learning agent of Starcraft2, similar to AlphaStar of DeepMind except size of network.

Introduction This repository is for Deep Learning agent of Starcraft2. It is very similar to AlphaStar of DeepMind except size of network. I only test

Dohyeong Kim 136 Jan 04, 2023
Deep Image Matting implementation in PyTorch

Deep Image Matting Deep Image Matting paper implementation in PyTorch. Differences "fc6" is dropped. Indices pooling. "fc6" is clumpy, over 100 millio

Yang Liu 724 Dec 27, 2022
A project for developing transformer-based models for clinical relation extraction

Clinical Relation Extration with Transformers Aim This package is developed for researchers easily to use state-of-the-art transformers models for ext

uf-hobi-informatics-lab 101 Dec 19, 2022
Complex Answer Generation For Conversational Search Systems.

Complex Answer Generation For Conversational Search Systems. Code for Does Structure Matter? Leveraging Data-to-Text Generation for Answering Complex

Hanane Djeddal 0 Dec 06, 2021
When Does Pretraining Help? Assessing Self-Supervised Learning for Law and the CaseHOLD Dataset of 53,000+ Legal Holdings

When Does Pretraining Help? Assessing Self-Supervised Learning for Law and the CaseHOLD Dataset of 53,000+ Legal Holdings This is the repository for t

RegLab 39 Jan 07, 2023
Code for approximate graph reduction techniques for cardinality-based DSFM, from paper

SparseCard Code for approximate graph reduction techniques for cardinality-based DSFM, from paper "Approximate Decomposable Submodular Function Minimi

Nate Veldt 1 Nov 25, 2022
CRISCE: Automatically Generating Critical Driving Scenarios From Car Accident Sketches

CRISCE: Automatically Generating Critical Driving Scenarios From Car Accident Sketches This document describes how to install and use CRISCE (CRItical

Chair of Software Engineering II, Uni Passau 2 Feb 09, 2022
「PyTorch Implementation of AnimeGANv2」を用いて、生成した顔画像を元の画像に上書きするデモ

AnimeGANv2-Face-Overlay-Demo PyTorch Implementation of AnimeGANv2を用いて、生成した顔画像を元の画像に上書きするデモです。

KazuhitoTakahashi 21 Oct 18, 2022
GNN-based Recommendation Benchmark

GRecX A Fair Benchmark for GNN-based Recommendation Homepage and Documentation Homepage: Documentation: Paper: GRecX: An Efficient and Unified Benchma

73 Oct 17, 2022
Demo project for real time anomaly detection using kafka and python

kafkaml-anomaly-detection Project for real time anomaly detection using kafka and python It's assumed that zookeeper and kafka are running in the loca

Rodrigo Arenas 36 Dec 12, 2022
Official Tensorflow implementation of U-GAT-IT: Unsupervised Generative Attentional Networks with Adaptive Layer-Instance Normalization for Image-to-Image Translation (ICLR 2020)

U-GAT-IT — Official TensorFlow Implementation (ICLR 2020) : Unsupervised Generative Attentional Networks with Adaptive Layer-Instance Normalization fo

Junho Kim 6.2k Jan 04, 2023
A self-supervised learning framework for audio-visual speech

AV-HuBERT (Audio-Visual Hidden Unit BERT) Learning Audio-Visual Speech Representation by Masked Multimodal Cluster Prediction Robust Self-Supervised A

Meta Research 431 Jan 07, 2023
Semantic Image Synthesis with SPADE

Semantic Image Synthesis with SPADE New implementation available at imaginaire repository We have a reimplementation of the SPADE method that is more

NVIDIA Research Projects 7.3k Jan 07, 2023
Deep RGB-D Saliency Detection with Depth-Sensitive Attention and Automatic Multi-Modal Fusion (CVPR'2021, Oral)

DSA^2 F: Deep RGB-D Saliency Detection with Depth-Sensitive Attention and Automatic Multi-Modal Fusion (CVPR'2021, Oral) This repo is the official imp

如今我已剑指天涯 46 Dec 21, 2022
This is the formal code implementation of the CVPR 2022 paper 'Federated Class Incremental Learning'.

Official Pytorch Implementation for GLFC [CVPR-2022] Federated Class-Incremental Learning This is the official implementation code of our paper "Feder

Race Wang 57 Dec 27, 2022