Code for Referring Image Segmentation via Cross-Modal Progressive Comprehension, CVPR 2020.

Overview

CMPC-Refseg

Code of our CVPR 2020 paper Referring Image Segmentation via Cross-Modal Progressive Comprehension.

Shaofei Huang*, Tianrui Hui*, Si Liu, Guanbin Li, Yunchao Wei, Jizhong Han, Luoqi Liu, Bo Li (* Equal contribution)

Interpretation of CMPC.

  • (a) Input referring expression and image.

  • (b) The model first perceives all the entities described in the expression based on entity words and attribute words, e.g., “man” and “white frisbee” (orange masks and blue outline).

  • (c) After finding all the candidate entities that may match the input expression, the relational word “holding” is further exploited to highlight the entity involved in the relationship (green arrow) and to suppress the others that are not involved.

  • (d) Benefiting from the relation-aware reasoning process, the referred entity is found as the final prediction (purple mask).

Experimental Results

We modified the way features are concatenated at the end of the CMPC module and achieve higher performance than the results reported in our paper. The new experimental results are summarized in the table below. You can download our trained checkpoints to test on the four datasets. The link to the checkpoints is: Baidu Drive, password: jjsf.

| Method | UNC val | UNC testA | UNC testB | UNC+ val | UNC+ testA | UNC+ testB | G-Ref val | ReferIt test |
|---|---|---|---|---|---|---|---|---|
| STEP-ICCV19 [1] | 60.04 | 63.46 | 57.97 | 48.19 | 52.33 | 40.41 | 46.40 | 64.13 |
| Ours-CVPR20 | 61.36 | 64.53 | 59.64 | 49.56 | 53.44 | 43.23 | 49.05 | 65.53 |
| Ours-Updated | 62.47 | 65.08 | 60.82 | 50.25 | 54.04 | 43.47 | 49.89 | 65.58 |
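
To make the size of the update concrete, the per-split gain of Ours-Updated over Ours-CVPR20 can be computed directly from the table. The snippet below is only a small arithmetic sketch; the numbers are copied from the two rows above and the variable names are illustrative.

```python
from __future__ import print_function

# Scores copied from the results table, in column order.
columns = ["UNC val", "UNC testA", "UNC testB",
           "UNC+ val", "UNC+ testA", "UNC+ testB",
           "G-Ref val", "ReferIt test"]
ours_cvpr20 = [61.36, 64.53, 59.64, 49.56, 53.44, 43.23, 49.05, 65.53]
ours_updated = [62.47, 65.08, 60.82, 50.25, 54.04, 43.47, 49.89, 65.58]

# Absolute gain per split, rounded to the table's two-decimal precision.
gains = [round(u - c, 2) for u, c in zip(ours_updated, ours_cvpr20)]

for name, gain in zip(columns, gains):
    print("%-12s +%.2f" % (name, gain))
```

The gain is positive on every split, ranging from 0.05 on ReferIt test to 1.18 on UNC testB.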

Setup

We recommend the following dependencies.

  • Python 2.7
  • TensorFlow 1.5
  • Numpy
  • pydensecrf

This code is derived from RRN [2]. Please refer to it for more setup details.

Data Preparation

  • Dataset Preprocessing

We conduct experiments on four referring image segmentation datasets: UNC, UNC+, Gref and ReferIt. After downloading these datasets, run the following commands to prepare the data:

python build_batches.py -d Gref -t train
python build_batches.py -d Gref -t val
python build_batches.py -d unc -t train
python build_batches.py -d unc -t val
python build_batches.py -d unc -t testA
python build_batches.py -d unc -t testB
python build_batches.py -d unc+ -t train
python build_batches.py -d unc+ -t val
python build_batches.py -d unc+ -t testA
python build_batches.py -d unc+ -t testB
python build_batches.py -d referit -t trainval
python build_batches.py -d referit -t test
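
The twelve commands above can also be generated from a dataset-to-split table, which may be convenient when only some datasets have been downloaded. This is a minimal sketch, not part of the repository: the `SPLITS` table and `build_commands` helper are illustrative names, and the actual execution is left commented out because it needs the datasets and the repository's `build_batches.py`.

```python
from __future__ import print_function
import subprocess  # only needed if the check_call line below is enabled

# Dataset -> splits, transcribed from the command list above.
SPLITS = [
    ("Gref", ["train", "val"]),
    ("unc", ["train", "val", "testA", "testB"]),
    ("unc+", ["train", "val", "testA", "testB"]),
    ("referit", ["trainval", "test"]),
]


def build_commands(splits=SPLITS):
    """Return one build_batches.py argument list per (dataset, split) pair."""
    return [["python", "build_batches.py", "-d", dataset, "-t", split]
            for dataset, names in splits
            for split in names]


commands = build_commands()
for cmd in commands:
    print(" ".join(cmd))
    # Uncomment to run the preprocessing from the repository root:
    # subprocess.check_call(cmd)
```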
  • GloVe Embedding

Download Gref_emb.npy and referit_emb.npy and put them in data/. The download link for the GloVe embeddings is: Baidu Drive, password: 2m28.
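
Before training, it can help to confirm that both embedding files are where the `-emb` option expects them. The check below is a hedged sketch that relies only on the file names and the data/ directory mentioned above; `missing_embeddings` is a hypothetical helper, not a function from this repository.

```python
from __future__ import print_function
import os

# File names taken from the instructions above.
EMBEDDING_FILES = ["Gref_emb.npy", "referit_emb.npy"]


def missing_embeddings(data_dir="data"):
    """Return the embedding files that are not present in data_dir."""
    return [name for name in EMBEDDING_FILES
            if not os.path.isfile(os.path.join(data_dir, name))]


missing = missing_embeddings()
if missing:
    print("Missing from data/: " + ", ".join(missing))
else:
    print("All GloVe embedding files found.")
```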

Training

Train on UNC training set with:

python -u trainval_model.py -m train -d unc -t train -n CMPC_model -emb -f ckpts/unc/cmpc_model

Testing

Test on UNC validation set with:

python -u trainval_model.py -m test -d unc -t val -n CMPC_model -i 700000 -c -emb -f ckpts/unc/cmpc_model
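
To evaluate the remaining UNC splits, the same command can be assembled per split. The sketch below assumes that testA and testB take exactly the same flags and checkpoint iteration as the val command above, which this README does not state explicitly; `make_test_command` is an illustrative helper name.

```python
from __future__ import print_function


def make_test_command(split, dataset="unc", iteration=700000,
                      ckpt_dir="ckpts/unc/cmpc_model"):
    """Build the trainval_model.py test invocation for one split."""
    return ["python", "-u", "trainval_model.py",
            "-m", "test", "-d", dataset, "-t", split,
            "-n", "CMPC_model", "-i", str(iteration),
            "-c", "-emb", "-f", ckpt_dir]


# Assumption: the UNC test splits mirror those used in data preparation.
for split in ["val", "testA", "testB"]:
    print(" ".join(make_test_command(split)))
```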

CMPC for video referring segmentation

We release the video version of the CMPC code, for the A2D dataset, under CMPC_video/.

Reference

[1] Chen, Ding-Jie, et al. "See-through-text grouping for referring image segmentation." Proceedings of the IEEE International Conference on Computer Vision. 2019.

[2] Li, Ruiyu, et al. "Referring image segmentation via recurrent refinement networks." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018.

Citation

If our CMPC is useful to your research, please consider citing:

@inproceedings{huang2020referring,
  title={Referring Image Segmentation via Cross-Modal Progressive Comprehension},
  author={Huang, Shaofei and Hui, Tianrui and Liu, Si and Li, Guanbin and Wei, Yunchao and Han, Jizhong and Liu, Luoqi and Li, Bo},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={10488--10497},
  year={2020}
}