Joint Versus Independent Multiview Hashing for Cross-View Retrieval[J] (IEEE TCYB 2021, PyTorch Code)

Related tags

Deep LearningDCHN
Overview

2021-IEEE TCYB-DCHN

Peng Hu, Xi Peng, Hongyuan Zhu, Jie Lin, Liangli Zhen, Dezhong Peng, Joint Versus Independent Multiview Hashing for Cross-View Retrieval[J]. IEEE Transactions on Cybernetics, vol. 51, no. 10, pp. 4982-4993, Oct. 2021. (PyTorch Code)

Abstract

Thanks to the low storage cost and high query speed, cross-view hashing (CVH) has been successfully used for similarity search in multimedia retrieval. However, most existing CVH methods use all views to learn a common Hamming space, thus making it difficult to handle the data with increasing views or a large number of views. To overcome these difficulties, we propose a decoupled CVH network (DCHN) approach which consists of a semantic hashing autoencoder module (SHAM) and multiple multiview hashing networks (MHNs). To be specific, SHAM adopts a hashing encoder and decoder to learn a discriminative Hamming space using either a few labels or the number of classes, that is, the so-called flexible inputs. After that, MHN independently projects all samples into the discriminative Hamming space that is treated as an alternative ground truth. In brief, the Hamming space is learned from the semantic space induced from the flexible inputs, which is further used to guide view-specific hashing in an independent fashion. Thanks to such an independent/decoupled paradigm, our method could enjoy high computational efficiency and the capacity of handling the increasing number of views by only using a few labels or the number of classes. For a newly coming view, we only need to add a view-specific network into our model and avoid retraining the entire model using the new and previous views. Extensive experiments are carried out on five widely used multiview databases compared with 15 state-of-the-art approaches. The results show that the proposed independent hashing paradigm is superior to the common joint ones while enjoying high efficiency and the capacity of handling newly coming views.

Framework

DCHN

Figure 1. Framework of the proposed DCHN method. g is the output of the corresponding view (i.e., image, text, video, etc.). o is the semantic hash code that is computed by the corresponding label y and semantic hashing transformation W. W is computed by the proposed semantic hashing autoencoder module (SHAM). sgn is an elementwise sign function. ℒR and ℒH are hash reconstruction and semantic hashing functions, respectively. In the training stage, first, W is used to recast the label y as a ground-truth hash code o. Then, the obtained hash code is used to guide view-specific networks with a semantic hashing reconstruction regularizer. Such a learning scheme makes the v view-specific neural networks (one network for each view) can be trained separately since they are decoupled and do not share any trainable parameters. Therefore, our DCHN can be easy to scale to a large number of views. In the inference stage, each trained view-specific network fk(xk, Θk) is used to compute the hash code of the sample xk.

SHAM

Figure 1. Proposed SHAM utilizes the semantic information (e.g., labels or classes) to learn an encoder W and a decoder WT by mutually converting the semantic and Hamming spaces. SHAM is one key component of our independent hashing paradigm.

Usage

First, to train SHAM wtih 64 bits on MIRFLICKR-25K, just run trainSHAM.py as follows:

python trainSHAM.py --datasets mirflickr25k --output_shape 64 --gama 1 --available_num 100

Then, to train a model for image modality wtih 64 bits on MIRFLICKR-25K, just run main_DCHN.py as follows:

python main_DCHN.py --mode train --epochs 100 --view 0 --datasets mirflickr25k --output_shape 64 --alpha 0.02 --gama 1 --available_num 100 --gpu_id 0

For text modality:

python main_DCHN.py --mode train --epochs 100 --view 1 --datasets mirflickr25k --output_shape 64 --alpha 0.02 --gama 1 --available_num 100 --gpu_id 1

To evaluate the trained models, you could run main_DCHN.py as follows:

python main_DCHN.py --mode eval --view -1 --datasets mirflickr25k --output_shape 64 --alpha 0.02 --gama 1 --available_num 100 --num_workers 0

Comparison with the State-of-the-Art

Table 1: Performance comparison in terms of MAP scores on the MIRFLICKR-25K and IAPR TC-12 datasets. The highest MAP score is shown in bold.

   Method    MIRFLICKR-25K IAPR TC-12
Image → Text Text → Image Image → Text Text → Image
16 32 64 128 16 32 64 128 16 32 64 128 16 32 64 128
Baseline 0.581 0.520 0.553 0.573 0.578 0.544 0.556 0.579 0.329 0.292 0.309 0.298 0.332 0.295 0.311 0.304
SePH [21] 0.729 0.738 0.744 0.750 0.753 0.762 0.764 0.769 0.467 0.476 0.486 0.493 0.463 0.475 0.485 0.492
SePHlr [12] 0.729 0.746 0.754 0.763 0.760 0.780 0.785 0.793 0.410 0.434 0.448 0.463 0.461 0.495 0.515 0.525
RoPH [34] 0.733 0.744 0.749 0.756 0.757 0.759 0.768 0.771 0.457 0.481 0.493 0.500 0.451 0.478 0.488 0.495
LSRH [22] 0.756 0.780 0.788 0.800 0.772 0.786 0.791 0.802 0.474 0.490 0.512 0.522 0.474 0.492 0.511 0.526
KDLFH [23] 0.734 0.755 0.770 0.771 0.764 0.780 0.794 0.797 0.306 0.314 0.351 0.357 0.307 0.315 0.350 0.356
DLFH [23] 0.721 0.743 0.760 0.767 0.761 0.788 0.805 0.810 0.306 0.314 0.326 0.340 0.305 0.315 0.333 0.353
MTFH [13] 0.581 0.571 0.645 0.543 0.584 0.556 0.633 0.531 0.303 0.303 0.307 0.300 0.303 0.303 0.308 0.302
DJSRH [14] 0.620 0.630 0.645 0.660 0.620 0.626 0.645 0.649 0.368 0.396 0.419 0.439 0.370 0.400 0.423 0.437
DCMH [9] 0.737 0.754 0.763 0.771 0.753 0.760 0.763 0.770 0.423 0.439 0.456 0.463 0.449 0.464 0.476 0.481
SSAH [20] 0.797 0.809 0.810 0.802 0.782 0.797 0.799 0.790 0.501 0.503 0.496 0.479 0.504 0.530 0.554 0.565
DCHN0 0.806 0.823 0.836 0.842 0.797 0.808 0.823 0.827 0.487 0.492 0.550 0.573 0.481 0.488 0.543 0.567
DCHN100 0.813 0.816 0.823 0.840 0.808 0.803 0.814 0.830 0.533 0.558 0.582 0.596 0.527 0.557 0.582 0.595

Table 2: Performance comparison in terms of MAP scores on the NUS-WIDE and MS-COCO datasets. The highest MAP score is shown in bold.

   Method    NUS-WIDE MS-COCO
Image → Text Text → Image Image → Text Text → Image
16 32 64 128 16 32 64 128 16 32 64 128 16 32 64 128
Baseline 0.281 0.337 0.263 0.341 0.299 0.339 0.276 0.346 0.362 0.336 0.332 0.373 0.348 0.341 0.347 0.359
SePH [21] 0.644 0.652 0.661 0.664 0.654 0.662 0.670 0.673 0.586 0.598 0.620 0.628 0.587 0.594 0.618 0.625
SePHlr [12] 0.607 0.624 0.644 0.651 0.630 0.649 0.665 0.672 0.527 0.571 0.592 0.600 0.555 0.596 0.618 0.621
RoPH [34] 0.638 0.656 0.662 0.669 0.645 0.665 0.671 0.677 0.592 0.634 0.649 0.657 0.587 0.628 0.643 0.652
LSRH [22] 0.622 0.650 0.659 0.690 0.600 0.662 0.685 0.692 0.580 0.563 0.561 0.567 0.580 0.611 0.615 0.632
KDLFH [23] 0.323 0.367 0.364 0.403 0.325 0.365 0.368 0.408 0.373 0.403 0.451 0.542 0.370 0.400 0.449 0.542
DLFH [23] 0.316 0.367 0.381 0.404 0.319 0.379 0.386 0.415 0.352 0.398 0.455 0.443 0.359 0.393 0.456 0.442
MTFH [13] 0.265 0.473 0.434 0.445 0.243 0.418 0.414 0.485 0.288 0.264 0.311 0.413 0.301 0.284 0.310 0.406
DJSRH [14] 0.433 0.453 0.467 0.442 0.457 0.468 0.468 0.501 0.478 0.520 0.544 0.566 0.462 0.525 0.550 0.567
DCMH [9] 0.569 0.595 0.612 0.621 0.548 0.573 0.585 0.592 0.548 0.575 0.607 0.625 0.568 0.595 0.643 0.664
SSAH [20] 0.636 0.636 0.637 0.510 0.653 0.676 0.683 0.682 0.550 0.577 0.576 0.581 0.552 0.578 0.578 0.669
DCHN0 0.648 0.660 0.669 0.683 0.662 0.677 0.685 0.697 0.602 0.658 0.682 0.706 0.591 0.652 0.669 0.696
DCHN100 0.654 0.671 0.681 0.691 0.668 0.683 0.697 0.707 0.662 0.701 0.703 0.720 0.650 0.689 0.693 0.714

Citation

If you find DCHN useful in your research, please consider citing:

@article{hu2021joint,
  author={Hu, Peng and Peng, Xi and Zhu, Hongyuan and Lin, Jie and Zhen, Liangli and Peng, Dezhong},
  journal={IEEE Transactions on Cybernetics}, 
  title={Joint Versus Independent Multiview Hashing for Cross-View Retrieval}, 
  year={2021},
  volume={51},
  number={10},
  pages={4982-4993},
  doi={10.1109/TCYB.2020.3027614}}
}
Owner
https://penghu-cs.github.io/
Amazing-Python-Scripts - 🚀 Curated collection of Amazing Python scripts from Basics to Advance with automation task scripts.

📑 Introduction A curated collection of Amazing Python scripts from Basics to Advance with automation task scripts. This is your Personal space to fin

Avinash Ranjan 1.1k Dec 29, 2022
Code for the paper One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation, CVPR 2021.

One Thing One Click One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation (CVPR2021) Code for the paper One Thi

44 Dec 12, 2022
pytorchのスライス代入操作をonnxに変換する際にScatterNDならないようにするサンプル

pytorch_remove_ScatterND pytorchのスライス代入操作をonnxに変換する際にScatterNDならないようにするサンプル。 スライスしたtensorにそのまま代入してしまうとScatterNDになるため、計算結果をcatで新しいtensorにする。 python ver

2 Dec 01, 2022
The materials used in the SaxonJS tutorial presented at Declarative Amsterdam, 2021

SaxonJS-Tutorial-2021, version 1.0.4 Last updated on 4 November, 2021. Table of contents Background Prerequisites Starting a web server Running a Java

Saxonica 11 Oct 23, 2022
Code to produce syntactic representations that can be used to study syntax processing in the human brain

Can fMRI reveal the representation of syntactic structure in the brain? The code base for our paper on understanding syntactic representations in the

Aniketh Janardhan Reddy 4 Dec 18, 2022
Simple image captioning model - CLIP prefix captioning.

Simple image captioning model - CLIP prefix captioning.

688 Jan 04, 2023
An exploration of log domain "alternative floating point" for hardware ML/AI accelerators.

This repository contains the SystemVerilog RTL, C++, HLS (Intel FPGA OpenCL to wrap RTL code) and Python needed to reproduce the numerical results in

Facebook Research 373 Dec 31, 2022
Unofficial Implement PU-Transformer

PU-Transformer-pytorch Pytorch unofficial implementation of PU-Transformer (PU-Transformer: Point Cloud Upsampling Transformer) https://arxiv.org/abs/

Lee Hyung Jun 7 Sep 21, 2022
Blind Video Temporal Consistency via Deep Video Prior

deep-video-prior (DVP) Code for NeurIPS 2020 paper: Blind Video Temporal Consistency via Deep Video Prior PyTorch implementation | paper | project web

Chenyang LEI 272 Dec 21, 2022
Office source code of paper UniFuse: Unidirectional Fusion for 360$^\circ$ Panorama Depth Estimation

UniFuse (RAL+ICRA2021) Office source code of paper UniFuse: Unidirectional Fusion for 360$^\circ$ Panorama Depth Estimation, arXiv, Demo Preparation I

Alibaba 47 Dec 26, 2022
Code for our SIGCOMM'21 paper "Network Planning with Deep Reinforcement Learning".

0. Introduction This repository contains the source code for our SIGCOMM'21 paper "Network Planning with Deep Reinforcement Learning". Notes The netwo

NetX Group 68 Nov 24, 2022
The code for MM2021 paper "Multi-Level Counterfactual Contrast for Visual Commonsense Reasoning"

The Code for MM2021 paper "Multi-Level Counterfactual Contrast for Visual Commonsense Reasoning" Setting up and using the repo Get the dataset. Follow

4 Apr 20, 2022
Generative vs Discriminative: Rethinking The Meta-Continual Learning (NeurIPS 2021)

Generative vs Discriminative: Rethinking The Meta-Continual Learning (NeurIPS 2021) In this repository we provide PyTorch implementations for GeMCL; a

4 Apr 15, 2022
Code for all the Advent of Code'21 challenges mostly written in python

Advent of Code 21 Code for all the Advent of Code'21 challenges mostly written in python. They are not necessarily the best or fastest solutions but j

4 May 26, 2022
Dataset Condensation with Contrastive Signals

Dataset Condensation with Contrastive Signals This repository is the official implementation of Dataset Condensation with Contrastive Signals (DCC). T

3 May 19, 2022
Dense Prediction Transformers

Vision Transformers for Dense Prediction This repository contains code and models for our paper: Vision Transformers for Dense Prediction René Ranftl,

Intel ISL (Intel Intelligent Systems Lab) 1.3k Dec 28, 2022
TDN: Temporal Difference Networks for Efficient Action Recognition

TDN: Temporal Difference Networks for Efficient Action Recognition Overview We release the PyTorch code of the TDN(Temporal Difference Networks).

Multimedia Computing Group, Nanjing University 326 Dec 13, 2022
SpinalNet: Deep Neural Network with Gradual Input

SpinalNet: Deep Neural Network with Gradual Input This repository contains scripts for training different variations of the SpinalNet and its counterp

H M Dipu Kabir 142 Dec 30, 2022
Tools for computational pathology

A toolkit for computational pathology and machine learning. View documentation Please cite our paper Installation There are several ways to install Pa

254 Dec 12, 2022
一个免费开源一键搭建的通用验证码识别平台,大部分常见的中英数验证码识别都没啥问题。

captcha_server 一个免费开源一键搭建的通用验证码识别平台,大部分常见的中英数验证码识别都没啥问题。 使用方法 python = 3.8 以上环境 pip install -r requirements.txt -i https://pypi.douban.com/simple gun

Sml2h3 189 Dec 02, 2022