Code and Experiments for ACL-IJCNLP 2021 Paper Mind Your Outliers! Investigating the Negative Impact of Outliers on Active Learning for Visual Question Answering.

Overview

Mind Your Outliers!

Mind Your Outliers! Investigating the Negative Impact of Outliers on Active Learning for Visual Question Answering
Siddharth Karamcheti, Ranjay Krishna, Li Fei-Fei, Christopher D. Manning
Annual Meeting for the Association of Computational Linguistics (ACL-IJCNLP) 2021.

Code & Experiments for training various models and performing active learning on a variety of different VQA datasets and splits. Additional code for creating and visualizing dataset maps, for qualitative analysis!

If there are any trained models you want access to that aren't easy for you to train, please let me know and I will do my best to get them to you. Unfortunately finding a hosting solution for 1.8TB of checkpoints hasn't been easy 😅 .


Quickstart

Clones vqa-outliers to the current working directory, then walks through dependency setup, mostly leveraging the environments/environment- files. Assumes conda is installed locally (and is on your path!). Follow the directions here to install conda (Anaconda or Miniconda) if not.

We provide two installation directions -- one set of instructions for CUDA-equipped machines running Linux w/ GPUs (for training), and another for CPU-only machines (e.g., MacOS, Linux) geared towards local development and in case GPUs are not available.

The existing GPU YAML File is geared for CUDA 11.0 -- if you have older GPUs, file an issue, and I'll create an appropriate conda configuration!

Setup Instructions

# Clone `vqa-outliers` Repository and run Conda Setup
git clone https://github.com/siddk/vqa-outliers.git
cd vqa-outliers

# Ensure you're using the appropriate hardware config!
conda env create -f environments/environment-{cpu, gpu}.yaml
conda activate vqa-outliers

Usage

The following section walks through downloading all the necessary data (be warned -- it's a lot!) and running both the various active learning strategies on the given VQA datasets, as well as the code for generating Dataset Maps over the full dataset, and visualizing active learning acquisitions relative to those maps.

Note: This is going to require several hundred GB of disk space -- for targeted experiments, feel free to file an issue and I can point you to what you need!

Downloading Data

We have dependencies on a few datasets, some pretrained word vectors (GloVe), and a pretrained multimodal model (LXMERT), though not the one commonly released in HuggingFace Transformers. To download all dependencies, use the following commands from the root of this repository (in general, run everything from repository root!).

# Note: All the following will create/write to the directory data/ in the current repository -- feel free to change!

# GloVe Vectors
./scripts/download/glove.sh

# Download LXMERT Checkpoint (no-QA Pretraining)
./scripts/download/lxmert.sh

# Download VQA-2 Dataset (Entire Thing -- Questions, Raw Images, BottomUp Object Features)!
./scripts/download/vqa2.sh

# Download GQA Dataset (Entire Thing -- Questions, Raw Images, BottomUp Object Features)!
./scripts/download/gqa.sh

Additional Preprocessing

Many of the models we evaluate in this work use the object-based BottomUp-TopDown Attention Features -- however, our Grid Logistic Regression and LSTM-CNN Baseline both use dense ResNet-101 Features of the images. We extract these from the raw images ourselves as follows (again, this will take a ton of disk space):

# Note: GPU Recommended for Faster Extraction

# Extract VQA-2 Grid Features
python scripts/extract.py --dataset vqa2 --images data/VQA-Images --spatial data/VQA-Spatials

# Extract GQA Grid Features
python scripts/extract.py --dataset gqa --images data/GQA-Images --spatial data/GQA-Spatials

Running Active Learning

Running Active Learning is a simple matter of using the script active.py in the root of this directory. This script is able to reproduce every experiment from the paper, and allows you to specify the following:

  • Dataset in < vqa2 | gqa >
  • Split in < all | sports | food > (for VQA-2) and all for GQA
  • Model (mode) in < glreg | olreg | cnn | butd | lxmert > (Both Logistic Regression Models, LSTM-CNN, BottomUp-TopDown, and LXMERT, respectively)
  • Active Learning Strategy in < baseline | least-conf | entropy | mc-entropy | mc-bald | coreset-{fused, language, vision} > following the paper.
  • Size of Seed Set (burn, for burn-in) in < p05 | p10 | p25 | p50 > where each denotes percentage of full-dataset to use as seed set.

For example, to run the BottomUp-TopDown Attention Model (butd) with the VQA-2 Sports Dataset, with Bayesian Active Learning by Disagreement, with a seed set that's 10% the size of the original dataset, use the following:

# Note: If GPU available (recommended), pass --gpus 1 as well!
python active.py --dataset vqa2 --split sports --mode butd --burn p10 --strategy mc-bald

File an issue if you run into trouble!

Creating Dataset Maps

Creating a Dataset Map entails training a model on an entire dataset, while maintaining statistics on a per-example basis, over the course of training. To train models and dump these statistics, use the top-level file cartograph.py as follows (again, for the BottomUp-TopDown Model, on VQA2-Sports):

python cartograph.py --dataset vqa2 --split sports --mode butd

Once you've trained a model and generated the necessary statistics, you can plot the corresponding map using the top-level file chart.py as follows:

# Note: `map` mode only generates the dataset map... to generate acquisition plots, see below!
python chart.py --mode map --dataset vqa2 --split sports --model butd

Note that Dataset Maps are generated per-dataset, per-model!

Visualizing Acquisitions

To visualize the acquisitions of a given active learning strategy relative to a given dataset map (the bar graphs from our paper), you can run the following (again, with our running example, but works for any combination):

python chart.py --mode acquisitions --dataset vqa2 --split sports --model butd --burn p10 --strategies mc-bald

Note that the script chart.py defaults to plotting acquisitions for all active learning strategies -- either make sure to run these out for the configuration you want, or provide the appropriate arguments!

Ablating Outliers

Finally, to run the Outlier Ablation experiments for a given model/active learning strategy, take the following steps:

  • Identify the different "frontiers" of examples (different difficulty classes) by using scripts/frontier.py
  • Once this file has been generated, run active.py with the special flag --dataset vqa2-frontier and the arbitrary strategies you care about.
  • Sit back, examine the results, and get excited!

Concretely, you can generate the frontier files for a BottomUp-TopDown Attention Model as follows:

python scripts/frontier.py --model butd

Any other model would also work -- just make sure you've generated the map via cartograph.py first!


Results

We present the full set of results from the paper (and the additional results from the supplement) in the visualizations/ directory. The sub-directory active-learning shows performance vs. samples for various splits of strategies (visualizing all on the same plot is a bit taxing), while the sub-directory acquisitions has both the dataset maps and corresponding acquisitions per strategy!


Start-Up (from Scratch)

Use these commands if you're starting a repository from scratch (this shouldn't be necessary to use/build off of this code, but I like to keep this in the README in case things break in the future). Generally, you should be fine with the "Usage" section above!

Linux w/ GPU & CUDA 11.0

# Create Python Environment (assumes Anaconda -- replace with package manager of choice!)
conda create --name vqa-outliers python=3.8
conda activate vqa-outliers
conda install pytorch torchvision torchaudio cudatoolkit=11.0 -c pytorch
conda install ipython jupyter
conda install pytorch-lightning -c conda-forge

pip install typed-argument-parser h5py opencv-python matplotlib annoy seaborn spacy scipy transformers scikit-learn

Mac OS & Linux (CPU)

# Create Python Environment (assumes Anaconda -- replace with package manager of choice!)
conda create --name vqa-outliers python=3.8
conda activate vqa-outliers
conda install pytorch torchvision torchaudio -c pytorch
conda install ipython jupyter
conda install pytorch-lightning -c conda-forge

pip install typed-argument-parser h5py opencv-python matplotlib annoy seaborn spacy scipy transformers scikit-learn

Note

We are committed to maintaining this repository for the community. We did port this code up to latest versions of PyTorch-Lightning and PyTorch, so there may be small incompatibilities we didn't catch when testing -- please feel free to open an issue if you run into problems, and I will respond within 24 hours. If urgent, please shoot me an email at [email protected] with "VQA-Outliers Code" in the Subject line and I'll be happy to help!

Owner
Sidd Karamcheti
PhD Student at Stanford & Research Intern at Hugging Face 🤗
Sidd Karamcheti
Robust fine-tuning of zero-shot models

Robust fine-tuning of zero-shot models This repository contains code for the paper Robust fine-tuning of zero-shot models by Mitchell Wortsman*, Gabri

224 Dec 29, 2022
Semi-Supervised Semantic Segmentation via Adaptive Equalization Learning, NeurIPS 2021 (Spotlight)

Semi-Supervised Semantic Segmentation via Adaptive Equalization Learning, NeurIPS 2021 (Spotlight) Abstract Due to the limited and even imbalanced dat

Hanzhe Hu 99 Dec 12, 2022
Versatile Generative Language Model

Versatile Generative Language Model This is the implementation of the paper: Exploring Versatile Generative Language Model Via Parameter-Efficient Tra

Zhaojiang Lin 17 Dec 02, 2022
用强化学习DQN算法,训练AI模型来玩合成大西瓜游戏,提供Keras版本和PARL(paddle)版本

用强化学习玩合成大西瓜 代码地址:https://github.com/Sharpiless/play-daxigua-using-Reinforcement-Learning 用强化学习DQN算法,训练AI模型来玩合成大西瓜游戏,提供Keras版本、PARL(paddle)版本和pytorch版本

72 Dec 17, 2022
Code for one-stage adaptive set-based HOI detector AS-Net.

AS-Net Code for one-stage adaptive set-based HOI detector AS-Net. Mingfei Chen*, Yue Liao*, Si Liu, Zhiyuan Chen, Fei Wang, Chen Qian. "Reformulating

Mingfei Chen 45 Dec 09, 2022
Expand human face editing via Global Direction of StyleCLIP, especially to maintain similarity during editing.

Oh-My-Face This project is based on StyleCLIP, RIFE, and encoder4editing, which aims to expand human face editing via Global Direction of StyleCLIP, e

AiLin Huang 51 Nov 17, 2022
Implementation of Sequence Generative Adversarial Nets with Policy Gradient

SeqGAN Requirements: Tensorflow r1.0.1 Python 2.7 CUDA 7.5+ (For GPU) Introduction Apply Generative Adversarial Nets to generating sequences of discre

Lantao Yu 2k Dec 29, 2022
[CVPR 2021] Region-aware Adaptive Instance Normalization for Image Harmonization

RainNet — Official Pytorch Implementation Region-aware Adaptive Instance Normalization for Image Harmonization Jun Ling, Han Xue, Li Song*, Rong Xie,

130 Dec 11, 2022
torchsummaryDynamic: support real FLOPs calculation of dynamic network or user-custom PyTorch ops

torchsummaryDynamic Improved tool of torchsummaryX. torchsummaryDynamic support real FLOPs calculation of dynamic network or user-custom PyTorch ops.

Bohong Chen 1 Jan 07, 2022
Deeprl - Standard DQN and dueling network for simple games

DeepRL This code implements the standard deep Q-learning and dueling network with experience replay (memory buffer) for playing simple games. DQN algo

Yao Zhou 6 Apr 12, 2020
HyperPose is a library for building high-performance custom pose estimation applications.

HyperPose is a library for building high-performance custom pose estimation applications.

TensorLayer Community 1.2k Jan 04, 2023
[CVPR'22] COAP: Learning Compositional Occupancy of People

COAP: Compositional Articulated Occupancy of People Paper | Video | Project Page This is the official implementation of the CVPR 2022 paper COAP: Lear

Marko Mihajlovic 111 Dec 11, 2022
A script depending on VASP output for calculating Fermi-Softness.

Fermi softness calculation for Vienna Ab initio Simulation Package (VASP) Update 1.1.0: Big update: Rewrote the code. Use Bader atomic division instea

qslin 11 Nov 08, 2022
Mmrotate - OpenMMLab Rotated Object Detection Benchmark

OpenMMLab website HOT OpenMMLab platform TRY IT OUT 📘 Documentation | 🛠️ Insta

OpenMMLab 1.2k Jan 04, 2023
CLIP2Video: Mastering Video-Text Retrieval via Image CLIP

CLIP2Video: Mastering Video-Text Retrieval via Image CLIP The implementation of paper CLIP2Video: Mastering Video-Text Retrieval via Image CLIP. CLIP2

168 Dec 29, 2022
Code for "Diversity can be Transferred: Output Diversification for White- and Black-box Attacks"

Output Diversified Sampling (ODS) This is the github repository for the NeurIPS 2020 paper "Diversity can be Transferred: Output Diversification for W

50 Dec 11, 2022
EmoTag helps you train emotion detection model for Chinese audios

emoTag emoTag helps you train emotion detection model for Chinese audios. Environment pip install -r requirement.txt Data We used Emotional Speech Dat

_zza 4 Sep 07, 2022
《Dual-Resolution Correspondence Network》(NeurIPS 2020)

Dual-Resolution Correspondence Network Dual-Resolution Correspondence Network, NeurIPS 2020 Dependency All dependencies are included in asset/dualrcne

Active Vision Laboratory 45 Nov 21, 2022
Research on controller area network Intrusion Detection Systems

Group members information Member 1: Lixue Liang Member 2: Yuet Lee Chan Member 3: Xinruo Zhang Member 4: Yifei Han User Manual Generate Attack Packets

Roche 4 Aug 30, 2022
An official implementation of "Background-Aware Pooling and Noise-Aware Loss for Weakly-Supervised Semantic Segmentation" (CVPR 2021) in PyTorch.

BANA This is the implementation of the paper "Background-Aware Pooling and Noise-Aware Loss for Weakly-Supervised Semantic Segmentation". For more inf

CV Lab @ Yonsei University 59 Dec 12, 2022