The official implementation of CVPR 2021 Paper: Improving Weakly Supervised Visual Grounding by Contrastive Knowledge Distillation.

Overview

Improving Weakly Supervised Visual Grounding by Contrastive Knowledge Distillation

This repository is the official implementation of CVPR 2021 paper: Improving Weakly Supervised Visual Grounding by Contrastive Knowledge Distillation.

Requirements

  • Tensorflow-1-15

Training

To train the NCE model(s) in the paper, run this command:

python train_nce_distill_model.py \
  --region_feat_path=region_features.hdf5 \
  --phrase_feat_path=phrase_features.hdf5 \
  --glove_path=glove.hdf5

To train the NCE+Distill model(s) in the paper, run this command:

python train_nce_distill_model.py \
  --region_feat_path=region_features.hdf5 \
  --phrase_feat_path=phrase_features.hdf5 \
  --glove_path=glove.hdf5 \
  --phrase_to_label_json=phrase_to_label.json

Evaluation

To evaluate the model on Flickr30K, run:

python eval_model.py \
  --region_feat_path=region_features_test.hdf5 \
  --phrase_feat_path=phrase_features_test.hdf5 \
  --glove_path=glove.hdf5 \
  --restore_path=checkpoint.meta

Pre-trained Models

You can download pretrained models using Res101 VG features here:

You can also find the features on Flickr30K test split here.

The pretrained models achieve the following performance on Flickr30K test split:

Model Name [email protected] [email protected] [email protected]
NCE+Distill 0.5310 0.7394 0.7875
NCE 0.5135 0.7338 0.7833

Citation

If you use our implementation in your research or wish to refer to the results published in our paper, please use the following BibTeX entry.

@InProceedings{Wang_2021_CVPR,
    author    = {Wang, Liwei and Huang, Jing and Li, Yin and Xu, Kun and Yang, Zhengyuan and Yu, Dong},
    title     = {Improving Weakly Supervised Visual Grounding by Contrastive Knowledge Distillation},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2021},
    pages     = {14090-14100}
}

SatelliteSfM - A library for solving the satellite structure from motion problem

Satellite Structure from Motion Maintained by Kai Zhang. Overview This is a libr

Kai Zhang 190 Dec 08, 2022
2D&3D human pose estimation

Human Pose Estimation Papers [CVPR 2016] - 201511 [IJCAI 2016] - 201602 Other Action Recognition with Joints-Pooled 3D Deep Convolutional Descriptors

133 Jan 02, 2023
A lane detection integrated Real-time Instance Segmentation based on YOLACT (You Only Look At CoefficienTs)

Real-time Instance Segmentation and Lane Detection This is a lane detection integrated Real-time Instance Segmentation based on YOLACT (You Only Look

Jin 4 Dec 30, 2022
Repository for tackling Kaggle Ultrasound Nerve Segmentation challenge using Torchnet.

Ultrasound Nerve Segmentation Challenge using Torchnet This repository acts as a starting point for someone who wants to start with the kaggle ultraso

Qure.ai 46 Jul 18, 2022
Back to the Feature: Learning Robust Camera Localization from Pixels to Pose (CVPR 2021)

Back to the Feature with PixLoc We introduce PixLoc, a neural network for end-to-end learning of camera localization from an image and a 3D model via

Computer Vision and Geometry Lab 610 Jan 05, 2023
This repo includes the supplementary of our paper "CEMENT: Incomplete Multi-View Weak-Label Learning with Long-Tailed Labels"

Supplementary Materials for CEMENT: Incomplete Multi-View Weak-Label Learning with Long-Tailed Labels This repository includes all supplementary mater

Zhiwei Li 0 Jan 05, 2022
For visualizing the dair-v2x-i dataset

3D Detection & Tracking Viewer The project is based on hailanyi/3D-Detection-Tracking-Viewer and is modified, you can find the original version of the

34 Dec 29, 2022
Code For TDEER: An Efficient Translating Decoding Schema for Joint Extraction of Entities and Relations (EMNLP2021)

TDEER (WIP) Code For TDEER: An Efficient Translating Decoding Schema for Joint Extraction of Entities and Relations (EMNLP2021) Overview TDEER is an e

Alipay 6 Dec 17, 2022
Library to enable Bayesian active learning in your research or labeling work.

Bayesian Active Learning (BaaL) BaaL is an active learning library developed at ElementAI. This repository contains techniques and reusable components

ElementAI 687 Dec 25, 2022
tensorflow code for inverse face rendering

InverseFaceRender This is tensorflow code for our project: Learning Inverse Rendering of Faces from Real-world Videos. (https://arxiv.org/abs/2003.120

Yuda Qiu 18 Nov 16, 2022
Python implementation of ADD: Frequency Attention and Multi-View based Knowledge Distillation to Detect Low-Quality Compressed Deepfake Images, AAAI2022.

ADD: Frequency Attention and Multi-View based Knowledge Distillation to Detect Low-Quality Compressed Deepfake Images Binh M. Le & Simon S. Woo, "ADD:

2 Oct 24, 2022
Elegy is a framework-agnostic Trainer interface for the Jax ecosystem.

Elegy Elegy is a framework-agnostic Trainer interface for the Jax ecosystem. Main Features Easy-to-use: Elegy provides a Keras-like high-level API tha

435 Dec 30, 2022
Integrated Semantic and Phonetic Post-correction for Chinese Speech Recognition

Integrated Semantic and Phonetic Post-correction for Chinese Speech Recognition | paper | dataset | pretrained detection model | Authors: Yi-Chang Che

Yi-Chang Chen 1 Aug 23, 2022
InterFaceGAN - Interpreting the Latent Space of GANs for Semantic Face Editing

InterFaceGAN - Interpreting the Latent Space of GANs for Semantic Face Editing Figure: High-quality facial attributes editing results with InterFaceGA

GenForce: May Generative Force Be with You 1.3k Jan 09, 2023
PyoMyo - Python Opensource Myo library

PyoMyo Python module for the Thalmic Labs Myo armband. Cross platform and multithreaded and works without the Myo SDK. pip install pyomyo Documentati

PerlinWarp 81 Jan 08, 2023
NeuralForecast is a Python library for time series forecasting with deep learning models

NeuralForecast is a Python library for time series forecasting with deep learning models. It includes benchmark datasets, data-loading utilities, evaluation functions, statistical tests, univariate m

Nixtla 1.1k Jan 03, 2023
Remote sensing change detection using PaddlePaddle

Change Detection Laboratory Developing and benchmarking deep learning-based remo

Lin Manhui 15 Sep 23, 2022
Intent parsing and slot filling in PyTorch with seq2seq + attention

PyTorch Seq2Seq Intent Parsing Reframing intent parsing as a human - machine translation task. Work in progress successor to torch-seq2seq-intent-pars

Sean Robertson 160 Jan 07, 2023
Attempt at implementation of a simple GAN using Keras

Simple GAN This is my attempt to make a wrapper class for a GAN in keras which can be used to abstract the whole architecture process. Simple GAN Over

Deven96 7 May 23, 2019
Implementation for Shape from Polarization for Complex Scenes in the Wild

sfp-wild Implementation for Shape from Polarization for Complex Scenes in the Wild project website | paper Code and dataset will be released soon. Int

Chenyang LEI 41 Dec 23, 2022