Pyramid Grafting Network for One-Stage High Resolution Saliency Detection. CVPR 2022

Overview

PGNet

Pyramid Grafting Network for One-Stage High Resolution Saliency Detection. CVPR 2022,
CVPR 2022 (arXiv 2204.05041)

Abstract

Recent salient object detection (SOD) methods based on deep neural network have achieved remarkable performance. However, most of existing SOD models designed for low-resolution input perform poorly on high-resolution images due to the contradiction between the sampling depth and the receptive field size. Aiming at resolving this contradiction, we propose a novel one-stage framework called Pyramid Grafting Network (PGNet), using transformer and CNN backbone to extract features from different resolution images independently and then graft the features from transformer branch to CNN branch. An attention-based Cross-Model Grafting Module (CMGM) is proposed to enable CNN branch to combine broken detailed information more holistically, guided by different source feature during decoding process. Moreover, we design an Attention Guided Loss (AGL) to explicitly supervise the attention matrix generated by CMGM to help the network better interact with the attention from different models. We contribute a new Ultra-High-Resolution Saliency Detection dataset UHRSD, containing 5,920 images at 4K-8K resolutions. To our knowledge, it is the largest dataset in both quantity and resolution for high-resolution SOD task, which can be used for training and testing in future research. Sufficient experiments on UHRSD and widely-used SOD datasets demonstrate that our method achieves superior performance compared to the state-of-the-art methods.

Ultra High-Resolution Saliency Detection Dataset

Visual display for sample in UHRSD dataset. Best viewd by clikcing and zooming in.

To relief the lack of high-resolution datasets for SOD, we contribute the Ultra High-Resolution for Saliency Detection (UHRSD) dataset with a total of 5,920 images in 4K(3840 × 2160) or higher resolution, including 4,932 images for training and 988 images for testing. A total of 5,920 images were manually selected from websites (e.g. Flickr Pixabay) with free copyright. Our dataset is diverse in terms of image scenes, with a balance of complex and simple salient objects of various size.

To our knowledge, it is the largest dataset in both quantity and resolution for high-resolution SOD task, which can be used for training and testing in future research.

  • Our UHRSD (Ultra High-Resolution Saliency Detection) Dataset:

We provide the original 4K version and the convenient 2K version of our UHRSD (Ultra High-Resolution Saliency Detection) Dataset for download: Google Drive

Usage

Requirements

  • Python 3.8
  • Pytorch 1.7.1
  • OpenCV
  • Numpy
  • Apex
  • Timm

Directory

The directory should be like this:

-- src 
-- model (saved model)
-- pre (pretrained model)
-- result (saliency maps)
-- data (train dataset and test dataset)
   |-- DUTS-TR+HR
   |   |-- image
   |   |-- mask
   |-- UHRSOD+HRSOD
   |   |--image
   |   |--mask
   ...
   

Train

cd src
./train.sh
  • We implement our method by PyTorch and conduct experiments on 2 NVIDIA 2080Ti GPUs.
  • We adopt pre-trained ResNet-18 and Swin-B-224 as backbone networks, which are saved in PRE folder.
  • We train our method on 3 settings : DUTS-TR, DUTS-TR+HRSOD and UHRSD_TR+HRSOD_TR.
  • After training, the trained models will be saved in MODEL folder.

Test

The trained model can be download here: Google Drive

cd src
python test.py
  • After testing, saliency maps will be saved in RESULT folder

Saliency Map

Trained on DUTS-TR:Google Drive

Trained on DUT+HRSOD:Google Drive

Trained on UHRSD+HRSOD:Google Drive

Citation

@inproceedings{xie2022pyramid,
    author    = {Xie, Chenxi and Xia, Changqun and Ma, Mingcan and Zhao, Zhirui and Chen, Xiaowu and Li, Jia},
    title     = {Pyramid Grafting Network for One-Stage High Resolution Saliency Detection},
    booktitle = {CVPR},
    year      = {2022}
}
Owner
CVTEAM
CVTEAM
PyTorch implementation of Super SloMo by Jiang et al.

Super-SloMo PyTorch implementation of "Super SloMo: High Quality Estimation of Multiple Intermediate Frames for Video Interpolation" by Jiang H., Sun

Avinash Paliwal 2.9k Jan 03, 2023
Posterior temperature optimized Bayesian models for inverse problems in medical imaging

Posterior temperature optimized Bayesian models for inverse problems in medical imaging Max-Heinrich Laves*, Malte Tölle*, Alexander Schlaefer, Sandy

Artificial Intelligence in Cardiovascular Medicine (AICM) 6 Sep 19, 2022
ChineseBERT: Chinese Pretraining Enhanced by Glyph and Pinyin Information

ChineseBERT: Chinese Pretraining Enhanced by Glyph and Pinyin Information This repository contains code, model, dataset for ChineseBERT at ACL2021. Ch

413 Dec 01, 2022
Empirical Study of Transformers for Source Code & A Simple Approach for Handling Out-of-Vocabulary Identifiers in Deep Learning for Source Code

Transformers for variable misuse, function naming and code completion tasks The official PyTorch implementation of: Empirical Study of Transformers fo

Bayesian Methods Research Group 56 Nov 15, 2022
Building Ellee — A GPT-3 and Computer Vision Powered Talking Robotic Teddy Bear With Human Level Conversation Intelligence

Using an object detection and facial recognition system built on MobileNetSSDV2 and Dlib and running on an NVIDIA Jetson Nano, a GPT-3 model, Google Speech Recognition, Amazon Polly and servo motors,

24 Oct 26, 2022
Minecraft Hack Detection With Python

Minecraft Hack Detection An attempt to try and use crowd sourced replays to find

Kuleen Sasse 3 Mar 26, 2022
You Only Look Once for Panopitic Driving Perception

You Only 👀 Once for Panoptic 🚗 Perception You Only Look at Once for Panoptic driving Perception by Dong Wu, Manwen Liao, Weitian Zhang, Xinggang Wan

Hust Visual Learning Team 1.4k Jan 04, 2023
IEEE-CIS Technical Challenge on Predict+Optimize for Renewable Energy Scheduling

IEEE-CIS Technical Challenge on Predict+Optimize for Renewable Energy Scheduling This is my code, data and approach for the IEEE-CIS Technical Challen

3 Sep 18, 2022
This is the formal code implementation of the CVPR 2022 paper 'Federated Class Incremental Learning'.

Official Pytorch Implementation for GLFC [CVPR-2022] Federated Class-Incremental Learning This is the official implementation code of our paper "Feder

Race Wang 57 Dec 27, 2022
UMich 500-Level Mobile Robotics Course

MOBILE ROBOTICS: METHODS & ALGORITHMS - WINTER 2022 University of Michigan - NA 568/EECS 568/ROB 530 For slides, lecture notes, and example codes, see

393 Dec 29, 2022
Convolutional Neural Network to detect deforestation in the Amazon Rainforest

Convolutional Neural Network to detect deforestation in the Amazon Rainforest This project is part of my final work as an Aerospace Engineering studen

5 Feb 17, 2022
Styled text-to-drawing synthesis method. Featured at the 2021 NeurIPS Workshop on Machine Learning for Creativity and Design

Styled text-to-drawing synthesis method. Featured at the 2021 NeurIPS Workshop on Machine Learning for Creativity and Design

Peter Schaldenbrand 247 Dec 23, 2022
Joint project of the duo Hacker Ninjas

Project Smoothie Společný projekt dua Hacker Ninjas. První pokus o hříčku po třech týdnech učení se programování. Jakub Kolář e:\

Jakub Kolář 2 Jan 07, 2022
Code for Paper Predicting Osteoarthritis Progression via Unsupervised Adversarial Representation Learning

Predicting Osteoarthritis Progression via Unsupervised Adversarial Representation Learning (c) Tianyu Han and Daniel Truhn, RWTH Aachen University, 20

Tianyu Han 7 Nov 22, 2022
WSDM2022 "A Simple but Effective Bidirectional Extraction Framework for Relational Triple Extraction"

BiRTE WSDM2022 "A Simple but Effective Bidirectional Extraction Framework for Relational Triple Extraction" Requirements The main requirements are: py

9 Dec 27, 2022
Official implementation of the paper "Light Field Networks: Neural Scene Representations with Single-Evaluation Rendering"

Light Field Networks Project Page | Paper | Data | Pretrained Models Vincent Sitzmann*, Semon Rezchikov*, William Freeman, Joshua Tenenbaum, Frédo Dur

Vincent Sitzmann 130 Dec 29, 2022
[ICML 2021] A fast algorithm for fitting robust decision trees.

GROOT: Growing Robust Trees Growing Robust Trees (GROOT) is an algorithm that fits binary classification decision trees such that they are robust agai

Cyber Analytics Lab 17 Nov 21, 2022
[3DV 2020] PeeledHuman: Robust Shape Representation for Textured 3D Human Body Reconstruction

PeeledHuman: Robust Shape Representation for Textured 3D Human Body Reconstruction International Conference on 3D Vision, 2020 Sai Sagar Jinka1, Rohan

Rohan Chacko 39 Oct 12, 2022
Pure python implementations of popular ML algorithms.

Minimal ML algorithms This repo includes minimal implementations of popular ML algorithms using pure python and numpy. The purpose of these notebooks

Alexis Gidiotis 3 Jan 10, 2022
MoCoPnet - Deformable 3D Convolution for Video Super-Resolution

Deformable 3D Convolution for Video Super-Resolution Pytorch implementation of l

Xinyi Ying 28 Dec 15, 2022