Deep RGB-D Saliency Detection with Depth-Sensitive Attention and Automatic Multi-Modal Fusion (CVPR'2021, Oral)

Last update: Dec 21, 2022

Overview

This repo is the official implementation of "DSA^2F: Deep RGB-D Saliency Detection with Depth-Sensitive Attention and Automatic Multi-Modal Fusion" by Peng Sun, Wenhu Zhang, Huanyu Wang, Songyuan Li, and Xi Li.

Prerequisites

Ubuntu 18
PyTorch 1.7.0
CUDA 10.1
cuDNN 7.5.1
Python 3.7
NumPy 1.17.3
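
Before launching anything, it can help to confirm that the local environment matches the versions listed above. The following is a minimal sketch, not part of the repository: the helper names `version_matches` and `report` are hypothetical, and the check only compares the leading version components, ignoring build suffixes such as `+cu101`.

```python
import sys

# Versions taken from the prerequisites list above.
REQUIRED = {"python": "3.7", "torch": "1.7.0", "numpy": "1.17.3"}


def version_matches(installed: str, required: str) -> bool:
    """Return True when `installed` begins with the components of `required`.

    A local build suffix such as "+cu101" is dropped before comparing.
    """
    installed_parts = installed.split("+")[0].split(".")
    required_parts = required.split(".")
    return installed_parts[: len(required_parts)] == required_parts


def report(installed: dict) -> dict:
    """Map each required package name to True/False depending on the match."""
    return {
        name: version_matches(installed.get(name, ""), required)
        for name, required in REQUIRED.items()
    }


if __name__ == "__main__":
    found = {"python": "%d.%d" % sys.version_info[:2]}
    try:
        import numpy
        found["numpy"] = numpy.__version__
    except ImportError:
        pass  # reported as a mismatch below
    try:
        import torch
        found["torch"] = torch.__version__
    except ImportError:
        pass  # reported as a mismatch below
    for name, ok in report(found).items():
        print(f"{name}: {'ok' if ok else 'mismatch or missing'}")
```

A mismatch does not necessarily mean the code will fail; the list above only records the environment the authors state they used.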

Training

Please see launch_pretrain.sh and launch_train.sh for ImageNet pretraining and SOD training, respectively.

Testing

Please see launch_test.sh for testing on the SOD benchmarks.

Main Results

Dataset	E_r	S_λ^mean	F_β^mean	M
DUT-RGBD	0.950	0.921	0.926	0.030
NJUD	0.923	0.903	0.901	0.039
NLPR	0.950	0.918	0.897	0.024
SSD	0.904	0.876	0.852	0.045
STEREO	0.933	0.904	0.898	0.036
LFSD	0.923	0.882	0.882	0.054
RGBD135	0.962	0.920	0.896	0.021

Saliency maps and Evaluation

All of the saliency maps mentioned in the paper are available on Google Drive or BaiduYun (code: juc2).

You can use the toolbox provided by jiwei0921 for evaluation.
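
For a quick sanity check on downloaded saliency maps, two of the reported quantities can be approximated in a few lines of NumPy. This is a rough sketch under stated assumptions, not the toolbox's implementation: it assumes that M in the tables denotes mean absolute error, that the F-measure uses the conventional β² = 0.3, and that maps and ground truth are already loaded as arrays scaled to [0, 1]. The function names and the fixed threshold are illustrative; the reported F_β^mean averages over thresholds, so numbers from this sketch will not match the tables exactly.

```python
import numpy as np


def mae(pred, gt):
    """Mean absolute error between a saliency map and its ground truth."""
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    return float(np.mean(np.abs(pred - gt)))


def f_measure(pred, gt, threshold=0.5, beta2=0.3):
    """F-measure at a single binarization threshold (assumed beta^2 = 0.3)."""
    pred_bin = np.asarray(pred) >= threshold
    gt_bin = np.asarray(gt) > 0.5
    true_pos = np.logical_and(pred_bin, gt_bin).sum()
    # Guard against empty predictions or empty ground truth.
    precision = true_pos / max(pred_bin.sum(), 1)
    recall = true_pos / max(gt_bin.sum(), 1)
    denom = beta2 * precision + recall
    if denom == 0:
        return 0.0
    return float((1 + beta2) * precision * recall / denom)


if __name__ == "__main__":
    # Toy 2x2 example: the prediction ranks every pixel correctly.
    pred = [[0.9, 0.1], [0.8, 0.2]]
    gt = [[1, 0], [1, 0]]
    print("MAE:", mae(pred, gt))
    print("F-measure:", f_measure(pred, gt))
```

For numbers comparable to the tables, use the evaluation toolbox mentioned above.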

Additionally, we provide the saliency maps of the STERE-1000 and SIP datasets on BaiduYun (code: qxfw) for easy comparison.

Dataset	E_r	S_λ^mean	F_β^mean	M
STERE-1000	0.928	0.897	0.895	0.038
SIP	0.908	0.861	0.868	0.057

Citation

@inproceedings{Sun2021DeepRS,
  title={Deep RGB-D Saliency Detection with Depth-Sensitive Attention and Automatic Multi-Modal Fusion},
  author={Peng Sun and Wenhu Zhang and Huanyu Wang and Songyuan Li and Xi Li},
  booktitle={IEEE Conf. Comput. Vis. Pattern Recog.},
  year={2021}
}

License

The code is released under the MIT License (see LICENSE file for details).
