PyTorch implementation of the winner of the VQA Challenge Workshop at CVPR'17

Last update: Dec 11, 2022

Overview

2017 VQA Challenge Winner (CVPR'17 Workshop)

PyTorch implementation of "Tips and Tricks for Visual Question Answering: Learnings from the 2017 Challenge" by Teney et al.

Prerequisites

python 3.6+
numpy
pytorch 0.4
tqdm
nltk
pandas
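
A quick way to confirm the packages above are importable before running any script is sketched below. It is only an illustration, not part of this repo: `REQUIRED_MODULES` and `missing_modules` are hypothetical names, and the list assumes PyTorch is imported as `torch`. It checks for presence only, not for the exact versions.

```python
import importlib.util
import sys

# Import names for the packages listed above (assumption: PyTorch is "torch").
REQUIRED_MODULES = ["numpy", "torch", "tqdm", "nltk", "pandas"]


def missing_modules(names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # The README asks for Python 3.6 or newer.
    if sys.version_info < (3, 6):
        print("Python 3.6+ is required, found", sys.version.split()[0])
    missing = missing_modules(REQUIRED_MODULES)
    print("Missing packages:", ", ".join(missing) if missing else "none")
```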

Data

Preparation

To download and extract VQA v2, GloVe, and the pretrained visual features:
```
bash scripts/download_extract.sh
```
To prepare data for training:
```
python scripts/preproc.py
```

The structure of the data/ directory should look like this:

- data/
  - zips/
    - v2_XXX...zip
    - ...
    - glove...zip
    - trainval_36.zip
  - glove/
    - glove...txt
    - ...
  - v2_XXX.json
  - ...
  - trainval_resnet...tsv
  (The above are files created after executing scripts/download_extract.sh)
  - tokenizers/
    - ...
  - dict_ans.pkl
  - dict_q.pkl
  - glove_pretrained_300.npy
  - train_qa.pkl
  - val_qa.pkl
  - train_vfeats.pkl
  - val_vfeats.pkl
  (The above are files created after executing scripts/preproc.py)
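
To verify that preprocessing produced the files listed above, a small check along these lines may help. It is a minimal sketch, not part of this repo: `PREPROC_OUTPUTS` and `missing_outputs` are hypothetical names, the file names are copied from the listing, and `data` is assumed to be relative to the repository root.

```python
import os

# File names copied from the directory listing above (outputs of scripts/preproc.py).
PREPROC_OUTPUTS = [
    "dict_ans.pkl",
    "dict_q.pkl",
    "glove_pretrained_300.npy",
    "train_qa.pkl",
    "val_qa.pkl",
    "train_vfeats.pkl",
    "val_vfeats.pkl",
]


def missing_outputs(data_dir="data", names=PREPROC_OUTPUTS):
    """Return the expected files that are not present in data_dir."""
    return [n for n in names if not os.path.isfile(os.path.join(data_dir, n))]


if __name__ == "__main__":
    missing = missing_outputs()
    if missing:
        print("Missing preprocessed files:", ", ".join(missing))
    else:
        print("All preprocessed files found.")
```

The tokenizers/ directory is not checked here because its contents are not spelled out in the listing.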

Train

Use default parameters:

```
bash scripts/train.sh
```

Notes

Major refactor (especially of the data preprocessing), tested with PyTorch 0.4.1 and Python 3.6.
Training for 20 epochs reaches around 50% training accuracy (the model seems buggy in my implementation).
After all the preprocessing, the data/ directory may take up 38 GB or more.
Parts of preproc.py and utils.py are based on this repo
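
Given the size noted above, it can be worth checking free disk space before downloading and preprocessing. The sketch below is an illustration under assumptions: `REQUIRED_GB` and `has_enough_space` are hypothetical names, and the 38 GB figure is taken from the note above as a rough lower bound rather than a measured requirement.

```python
import shutil

# Rough lower bound taken from the note above; actual usage may be higher.
REQUIRED_GB = 38


def has_enough_space(path=".", required_gb=REQUIRED_GB):
    """Return True if the filesystem containing path has at least required_gb free."""
    free_gb = shutil.disk_usage(path).free / 1024 ** 3
    return free_gb >= required_gb


if __name__ == "__main__":
    if has_enough_space():
        print("Enough free space for data/.")
    else:
        print("Less than", REQUIRED_GB, "GB free; preprocessing may run out of disk space.")
```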

Owner

Mark Dong
