Code for Greedy Gradient Ensemble for Visual Question Answering (ICCV 2021, Oral)

Last update: Jun 29, 2022

Greedy Gradient Ensemble for De-biased VQA

Code release for "Greedy Gradient Ensemble for Robust Visual Question Answering" (ICCV 2021, Oral). GGE can also be extended to other tasks that suffer from dataset biases.

@inproceedings{han2015greedy,
	title={Greedy Gradient Ensemble for Robust Visual Question Answering},
	author={Han, Xinzhe and Wang, Shuhui and Su, Chi and Huang, Qingming and Tian, Qi},
	booktitle={Proceedings of the IEEE international conference on computer vision},
	year={2021}
}

Prerequisites

We use Anaconda to manage our dependencies. Follow these steps to install them:

Edit the prefix variable in requirements.yml so that it points to the path of your conda environment.
Install all dependencies with: conda env create -f requirements.yml
Activate the new environment, which is named bias (for example, with conda activate bias).
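The first step above (editing prefix in requirements.yml) can also be scripted. The following is a minimal sketch, not part of the repository: set_conda_prefix is a hypothetical helper, and the file contents and paths in the example are made up for illustration.

```python
import re


def set_conda_prefix(yml_text: str, env_path: str) -> str:
    """Return yml_text with its top-level `prefix:` line pointing at env_path."""
    new_line = f"prefix: {env_path}"
    pattern = r"^prefix:.*$"
    if re.search(pattern, yml_text, flags=re.MULTILINE):
        # Replace only the existing prefix line; a callable replacement keeps
        # backslashes in the path from being read as regex escapes.
        return re.sub(pattern, lambda m: new_line, yml_text,
                      count=1, flags=re.MULTILINE)
    # No prefix line yet: append one at the end of the file.
    return yml_text.rstrip("\n") + "\n" + new_line + "\n"


# Illustrative file contents (assumed, not copied from the repository).
example = "name: bias\ndependencies:\n  - python\nprefix: /old/path/envs/bias\n"
updated = set_conda_prefix(example, "/home/me/anaconda3/envs/bias")
print(updated)
```

To apply it to the real file, read requirements.yml into a string, pass it through the helper, and write the result back before running conda env create.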

Data Setup

Download the UpDn features from Google Drive into the /data/detection_features folder.
Download the questions/answers for VQAv2 and VQA-CPv2 by executing bash tools/download.sh
Download the visual cues/hints provided in "A negative case analysis of visual grounding methods for VQA" into data/hints. Note that we use caption-based hints to reproduce the grounding-based methods and to compute CGR and CGW.
Preprocess the data with bash tools/process.sh
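Before running the preprocessing step, it can help to confirm that the downloads landed where the steps above expect them. This is a small sketch under assumptions: missing_dirs is a hypothetical helper, and it treats the feature folder as data/detection_features relative to the repository root, even though the step above writes it with a leading slash. Adjust the list if your layout differs.

```python
import tempfile
from pathlib import Path


def missing_dirs(root, expected):
    """Return the entries of `expected` that are not directories under `root`."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).is_dir()]


# Folder names taken from the data setup steps (relative layout is assumed).
EXPECTED = ["data/detection_features", "data/hints"]

# Demo on a throwaway directory that only contains data/hints.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "data" / "hints").mkdir(parents=True)
    demo_missing = missing_dirs(tmp, EXPECTED)
    print(demo_missing)
```

In practice you would call missing_dirs(".", EXPECTED) from the repository root and download anything that is reported as missing.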

Training GGE

Run

CUDA_VISIBLE_DEVICES=0 python main.py --dataset cpv2 --mode MODE --debias gradient --topq 1 --topv -1 --qvp 5 --output []

to train a model. The baseline is selected by the import in main.py: use import base_model for the UpDn baseline, import base_model_ban as base_model for the BAN baseline, or import base_model_block as base_model for the S-MRL baseline.
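If you switch baselines often, editing the import by hand can be replaced by a lookup. The sketch below is an assumption about how one might do this, not something main.py provides: the short keys (updn, ban, smrl) and both helper functions are hypothetical, and only the module names come from the paragraph above.

```python
import importlib

# Module names as listed above; the dictionary keys are hypothetical labels.
BACKBONE_MODULES = {
    "updn": "base_model",
    "ban": "base_model_ban",
    "smrl": "base_model_block",
}


def backbone_module_name(backbone: str) -> str:
    """Map a backbone label to the module that should act as base_model."""
    try:
        return BACKBONE_MODULES[backbone.lower()]
    except KeyError:
        choices = ", ".join(sorted(BACKBONE_MODULES))
        raise ValueError(
            f"unknown backbone {backbone!r}; choose from {choices}"
        ) from None


def load_backbone(backbone: str):
    """Import the chosen module; this only works from the repository root."""
    return importlib.import_module(backbone_module_name(backbone))


print(backbone_module_name("BAN"))
```

Inside main.py this would look like base_model = load_backbone("ban"), which has the same effect as import base_model_ban as base_model.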

Set MODE to gge_iter or gge_tog for our best-performing models, to gge_d_bias or gge_q_bias for the single-bias ablations, or to base for the baseline model.
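To run several of these modes in a row, the command above can be assembled programmatically. This is a rough sketch rather than a script shipped with the repository: build_train_command is a hypothetical helper, the output name gge_iter_run is made up, and the flag values simply mirror the command shown above.

```python
import shlex

# MODE values listed above for main.py with --debias gradient.
GGE_MODES = ("gge_iter", "gge_tog", "gge_d_bias", "gge_q_bias", "base")


def build_train_command(mode, output, debias="gradient", dataset="cpv2"):
    """Return the argv list for one training run of main.py."""
    if mode not in GGE_MODES:
        raise ValueError(f"unknown mode {mode!r}; expected one of {GGE_MODES}")
    return [
        "python", "main.py",
        "--dataset", dataset,
        "--mode", mode,
        "--debias", debias,
        "--topq", "1",
        "--topv", "-1",
        "--qvp", "5",
        "--output", output,
    ]


cmd = build_train_command("gge_iter", "gge_iter_run")
# Print the equivalent shell line, including the GPU selection.
print("CUDA_VISIBLE_DEVICES=0 " + shlex.join(cmd))
```

To launch the run from Python instead of printing it, pass cmd to subprocess.run with CUDA_VISIBLE_DEVICES set in the env argument.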

Training ablations in Sec. 3 and Sec. 5

For the models in Sec. 3, use from train_ab import train and import base_model_ab as base_model in main.py. Run

CUDA_VISIBLE_DEVICES=0 python main.py --dataset cpv2 --mode MODE --debias METHODS --topq 1 --topv -1 --qvp 5 --output []

Use METHODS learned_mixin for LMH, MODE inv_sup for the inv_sup strategy, and v_inverse for the inverse hint. Note that the results for HINT$_{inv}$ were obtained by running the code from "A negative case analysis of visual grounding methods for VQA".

To test the v_only model, use import base_model_v_only as base_model in main.py.

To test RUBi and LMH+RUBi, run

CUDA_VISIBLE_DEVICES=0 python rubi_main.py --dataset cpv2 --mode MODE --output []

Set MODE to updn for RUBi or to lmh_rubi for LMH+RUBi.

Testing

At test time, we report the overall accuracy (Acc), CGR, CGW, and CGD at threshold 0.2. Change base_model to the corresponding model in sensitivity.py and run

CUDA_VISIBLE_DEVICES=0 python sensitivity.py --dataset cpv2 --debias METHOD --load_checkpoint_path logs/your_path --output your_path
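For readers unfamiliar with the grounding metrics named in this section, the sketch below shows one way they relate to each other. It rests on assumptions and is not the code in sensitivity.py: it takes CGR as the percentage of correctly answered questions that are visually grounded, CGW as the same percentage among wrongly answered questions, and CGD as CGR minus CGW. How a prediction is judged "grounded" (including the 0.2 threshold) is left out and supplied here as ready-made flags; grounding_metrics is a hypothetical name.

```python
def grounding_metrics(correct, grounded):
    """Compute CGR, CGW and CGD (in percent) from per-question boolean flags.

    correct[i]  -- the predicted answer for question i is right
    grounded[i] -- the prediction for question i counts as visually grounded
                   (the grounding test itself is assumed to be done elsewhere)
    """
    if len(correct) != len(grounded):
        raise ValueError("correct and grounded must have the same length")
    right = [g for c, g in zip(correct, grounded) if c]
    wrong = [g for c, g in zip(correct, grounded) if not c]
    # Guard against empty groups so the sketch never divides by zero.
    cgr = 100.0 * sum(right) / len(right) if right else 0.0
    cgw = 100.0 * sum(wrong) / len(wrong) if wrong else 0.0
    return {"CGR": cgr, "CGW": cgw, "CGD": cgr - cgw}


# Toy data: 4 right answers (3 grounded) and 2 wrong answers (1 grounded).
demo = grounding_metrics(
    correct=[True, True, True, True, False, False],
    grounded=[True, True, True, False, True, False],
)
print(demo)
```

Under these assumptions, a higher CGD means that visual grounding matters more for right answers than for wrong ones. Refer to the paper and sensitivity.py for the exact definitions used in the reported numbers.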

Visualization

We provide visualizations in visualization.ipynb. To generate visualizations for other examples yourself, download MS-COCO 2014 to data/images.

Acknowledgements

This repo uses features from "A negative case analysis of visual grounding methods for VQA". Some of the code is adapted from CSS and UpDn.
