The official implementation of CVPR 2021 Paper: Improving Weakly Supervised Visual Grounding by Contrastive Knowledge Distillation.

Overview

Improving Weakly Supervised Visual Grounding by Contrastive Knowledge Distillation

This repository is the official implementation of CVPR 2021 paper: Improving Weakly Supervised Visual Grounding by Contrastive Knowledge Distillation.

Requirements

  • Tensorflow-1-15

Training

To train the NCE model(s) in the paper, run this command:

python train_nce_distill_model.py \
  --region_feat_path=region_features.hdf5 \
  --phrase_feat_path=phrase_features.hdf5 \
  --glove_path=glove.hdf5

To train the NCE+Distill model(s) in the paper, run this command:

python train_nce_distill_model.py \
  --region_feat_path=region_features.hdf5 \
  --phrase_feat_path=phrase_features.hdf5 \
  --glove_path=glove.hdf5 \
  --phrase_to_label_json=phrase_to_label.json

Evaluation

To evaluate the model on Flickr30K, run:

python eval_model.py \
  --region_feat_path=region_features_test.hdf5 \
  --phrase_feat_path=phrase_features_test.hdf5 \
  --glove_path=glove.hdf5 \
  --restore_path=checkpoint.meta

Pre-trained Models

You can download pretrained models using Res101 VG features here:

You can also find the features on Flickr30K test split here.

The pretrained models achieve the following performance on Flickr30K test split:

Model Name [email protected] [email protected] [email protected]
NCE+Distill 0.5310 0.7394 0.7875
NCE 0.5135 0.7338 0.7833

Citation

If you use our implementation in your research or wish to refer to the results published in our paper, please use the following BibTeX entry.

@InProceedings{Wang_2021_CVPR,
    author    = {Wang, Liwei and Huang, Jing and Li, Yin and Xu, Kun and Yang, Zhengyuan and Yu, Dong},
    title     = {Improving Weakly Supervised Visual Grounding by Contrastive Knowledge Distillation},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2021},
    pages     = {14090-14100}
}

Neighborhood Contrastive Learning for Novel Class Discovery

Neighborhood Contrastive Learning for Novel Class Discovery This repository contains the official implementation of our paper: Neighborhood Contrastiv

Zhun Zhong 56 Dec 09, 2022
YOLOX is a high-performance anchor-free YOLO, exceeding yolov3~v5 with ONNX, TensorRT, ncnn, and OpenVINO supported.

Introduction YOLOX is an anchor-free version of YOLO, with a simpler design but better performance! It aims to bridge the gap between research and ind

7.7k Jan 03, 2023
A graphical Semi-automatic annotation tool based on labelImg and Yolov5

💕YOLOV5 semi-automatic annotation tool (Based on labelImg)

EricFang 247 Jan 05, 2023
Learning 3D Part Assembly from a Single Image

Learning 3D Part Assembly from a Single Image This repository contains a PyTorch implementation of the paper: Learning 3D Part Assembly from A Single

18 Dec 21, 2022
A static analysis library for computing graph representations of Python programs suitable for use with graph neural networks.

python_graphs This package is for computing graph representations of Python programs for machine learning applications. It includes the following modu

Google Research 258 Dec 29, 2022
Official PyTorch implementation for Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers, a novel method to visualize any Transformer-based network. Including examples for DETR, VQA.

PyTorch Implementation of Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers 1 Using Colab Please notic

Hila Chefer 489 Jan 07, 2023
TACTO: A Fast, Flexible and Open-source Simulator for High-Resolution Vision-based Tactile Sensors

TACTO: A Fast, Flexible and Open-source Simulator for High-Resolution Vision-based Tactile Sensors This package provides a simulator for vision-based

Facebook Research 255 Dec 27, 2022
Capstone-Project-2 - A game program written in the Python language

Capstone-Project-2 My Pygame Game Information: Description This Pygame project i

Nhlakanipho Khulekani Hlophe 1 Jan 04, 2022
Official implementation of the network presented in the paper "M4Depth: A motion-based approach for monocular depth estimation on video sequences"

M4Depth This is the reference TensorFlow implementation for training and testing depth estimation models using the method described in M4Depth: A moti

Michaël Fonder 76 Jan 03, 2023
State-of-the-art language models can match human performance on many tasks

Status: Archive (code is provided as-is, no updates expected) Grade School Math [Blog Post] [Paper] State-of-the-art language models can match human p

OpenAI 259 Jan 08, 2023
Tutel MoE: An Optimized Mixture-of-Experts Implementation

Project Tutel Tutel MoE: An Optimized Mixture-of-Experts Implementation. Supported Framework: Pytorch Supported GPUs: CUDA(fp32 + fp16), ROCm(fp32) Ho

Microsoft 344 Dec 29, 2022
Is RobustBench/AutoAttack a suitable Benchmark for Adversarial Robustness?

Adversrial Machine Learning Benchmarks This code belongs to the papers: Is RobustBench/AutoAttack a suitable Benchmark for Adversarial Robustness? Det

Adversarial Machine Learning 9 Nov 27, 2022
PyTorch implementation for Convolutional Networks with Adaptive Inference Graphs

Convolutional Networks with Adaptive Inference Graphs (ConvNet-AIG) This repository contains a PyTorch implementation of the paper Convolutional Netwo

Andreas Veit 176 Dec 07, 2022
QAHOI: Query-Based Anchors for Human-Object Interaction Detection (paper)

QAHOI QAHOI: Query-Based Anchors for Human-Object Interaction Detection (paper) Requirements PyTorch = 1.5.1 torchvision = 0.6.1 pip install -r requ

38 Dec 29, 2022
A implemetation of the LRCN in mxnet

A implemetation of the LRCN in mxnet ##Abstract LRCN is a combination of CNN and RNN ##Installation Download UCF101 dataset ./avi2jpg.sh to split the

44 Aug 25, 2022
Using machine learning to predict and analyze high and low reader engagement for New York Times articles posted to Facebook.

How The New York Times can increase Engagement on Facebook Using machine learning to understand characteristics of news content that garners "high" Fa

Jessica Miles 0 Sep 16, 2021
Harmonious Textual Layout Generation over Natural Images via Deep Aesthetics Learning

Harmonious Textual Layout Generation over Natural Images via Deep Aesthetics Learning Code for the paper Harmonious Textual Layout Generation over Nat

7 Aug 09, 2022
codebase for "A Theory of the Inductive Bias and Generalization of Kernel Regression and Wide Neural Networks"

Eigenlearning This repo contains code for replicating the experiments of the paper A Theory of the Inductive Bias and Generalization of Kernel Regress

Jamie Simon 45 Dec 02, 2022
The Noise Contrastive Estimation for softmax output written in Pytorch

An NCE implementation in pytorch About NCE Noise Contrastive Estimation (NCE) is an approximation method that is used to work around the huge computat

Kaiyu Shi 287 Nov 25, 2022
An Implementation of Transformer in Transformer in TensorFlow for image classification, attention inside local patches

Transformer-in-Transformer An Implementation of the Transformer in Transformer paper by Han et al. for image classification, attention inside local pa

Rishit Dagli 40 Jul 25, 2022