This project provides the code and datasets for 'CapSal: Leveraging Captioning to Boost Semantics for Salient Object Detection', CVPR 2019.

Last update: Aug 19, 2022

Overview

Code-and-Dataset-for-CapSal


Our code is built on the Mask R-CNN implementation in TensorFlow and Keras. Before running anything, install Mask R-CNN by following its instructions or INSTALL.md.

COCO-CapSal Dataset

The COCO-CapSal dataset provides saliency ground truth and image captions for each image. It contains 5265 images for training and 1459 for validation. The annotations can be downloaded from BaiduYun or GoogleDrive. The folder 'capsal' contains the images, the ground truth maps, and the captions (a JSON file) for both the training and validation sets.
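After downloading, it can help to confirm that each split holds the number of images stated above. The following is a minimal sketch, not part of the repository: the function names, the set of image extensions, and any folder path passed in are assumptions for illustration, and the caption loader deliberately makes no assumption about the JSON schema.

```python
import json
from pathlib import Path

# Split sizes as stated in the dataset description.
EXPECTED_COUNTS = {"train": 5265, "val": 1459}

# Assumed image extensions; adjust to match the downloaded files.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def count_images(folder):
    """Count image files directly inside `folder` (non-recursive)."""
    return sum(
        1
        for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )


def load_captions(json_path):
    """Load a caption annotation file and return the parsed JSON as-is."""
    with open(json_path, "r", encoding="utf-8") as f:
        return json.load(f)


def check_split(folder, split):
    """Return (found, expected, matches) for one split ('train' or 'val')."""
    found = count_images(folder)
    expected = EXPECTED_COUNTS[split]
    return found, expected, found == expected
```

For example, `check_split(some_folder, "train")` returns the number of images found alongside the expected 5265, where `some_folder` is wherever the training images sit inside 'capsal' after extraction.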

Evaluation

To test the CapSal model, first download the trained model from BaiduYun or Google and put it under ./model. Then run test_capsal.py to obtain the saliency maps for the different datasets. The saliency maps are also available from Google or BaiduYun.
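A quick pre-flight check can catch a missing model before the test script starts. This is a hedged sketch rather than anything shipped with the repository: the helper names are hypothetical, no file extension is assumed for the trained model, and no command-line flags are assumed for test_capsal.py.

```python
import sys
from pathlib import Path


def list_model_files(model_dir="./model"):
    """Return sorted names of regular files in `model_dir` (empty if it is missing)."""
    directory = Path(model_dir)
    if not directory.is_dir():
        return []
    return sorted(p.name for p in directory.iterdir() if p.is_file())


def build_test_command(script="test_capsal.py"):
    """Build the command that runs the test script with the current interpreter."""
    # No extra flags are added, since the README does not document any.
    return [sys.executable, script]


def preflight(model_dir="./model", script="test_capsal.py"):
    """Collect human-readable problems that would stop the test run early."""
    problems = []
    if not list_model_files(model_dir):
        problems.append(f"no model files found under {model_dir}")
    if not Path(script).is_file():
        problems.append(f"{script} not found")
    return problems
```

If `preflight()` returns an empty list, the list from `build_test_command()` can be passed to `subprocess.run` to start the test.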

Train

To train the model, run 'train.py'.

Citation

    @InProceedings{Zhang_2019_CVPR,
        author = {Zhang, Lu and Zhang, Jianming and Lin, Zhe and Lu, Huchuan and He, You},
        title = {CapSal: Leveraging Captioning to Boost Semantics for Salient Object Detection},
        booktitle = {CVPR},
        year = {2019}
    }


Owner

lu zhang
