This repository contains the test-dev data of the paper "xGQA: Cross-lingual Visual Question Answering".

Last update: Dec 09, 2022

xGQA

xGQA builds on the original work of Hudson et al. (2019), "GQA: A New Dataset for Real-World Visual Reasoning and Compositional Question Answering". The training data can be downloaded here.

Overview

The repository is structured as follows:

data/zero_shot/ contains the xGQA test-dev files for all 8 languages
data/few_shot/ contains the new standard splits for few-shot learning. The number in the file name indicates how many distinct images the split includes. For example, train_10.json contains questions about 10 distinct images.
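
To check that a few-shot split matches the image count in its file name, the distinct images can be counted after loading. This is a minimal sketch, not the authors' tooling. It assumes the files follow the original GQA question format (a JSON object keyed by question ID, with an "imageId" field per entry); the path in the docstring and the helper names are hypothetical.

```python
import json
from pathlib import Path


def load_split(path: str) -> dict:
    """Load one split file, e.g. a train_10.json under data/few_shot/ (exact path is hypothetical)."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def count_distinct_images(questions: dict) -> int:
    """Count the distinct images referenced by a GQA-style question dict."""
    # Assumption: every entry carries an "imageId" field, as in the original GQA files.
    return len({entry["imageId"] for entry in questions.values()})


# Tiny in-memory stand-in for a split file; real splits are much larger.
example = {
    "q1": {"imageId": "img_a", "question": "What color is the car?", "answer": "red"},
    "q2": {"imageId": "img_a", "question": "Is the car parked?", "answer": "yes"},
    "q3": {"imageId": "img_b", "question": "Who is sitting?", "answer": "man"},
}
print(count_distinct_images(example))  # 2
```

With a real file, count_distinct_images(load_split(path)) should equal the number in the file name if the format assumption holds.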

Training Data

Please download the English training data of GQA (Hudson et al. 2019) here.

Zero-Shot Results

Zero-shot transfer results on xGQA when transferring from English GQA. Average accuracy is reported. The mean column averages over the seven target languages only; the source language (English) is excluded.

model	en	de	pt	ru	id	bn	ko	zh	mean
M3P	58.43	23.93	24.37	20.37	22.57	15.83	16.90	18.60	20.37
OSCAR+Emb	62.23	17.35	19.25	10.52	18.26	14.93	17.10	16.41	16.26
OSCAR+Ada	60.30	18.91	27.02	17.50	18.77	15.42	15.28	14.96	18.27
mBERTAda	56.25	29.76	30.37	24.42	19.15	15.12	19.09	24.86	23.25
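
The mean column can be reproduced from the per-language scores. The sketch below copies the numbers from the table above and averages every language except English; the function name target_mean is illustrative and not part of the repository.

```python
# Accuracies copied from the zero-shot table above.
results = {
    "M3P":       {"en": 58.43, "de": 23.93, "pt": 24.37, "ru": 20.37, "id": 22.57, "bn": 15.83, "ko": 16.90, "zh": 18.60},
    "OSCAR+Emb": {"en": 62.23, "de": 17.35, "pt": 19.25, "ru": 10.52, "id": 18.26, "bn": 14.93, "ko": 17.10, "zh": 16.41},
    "OSCAR+Ada": {"en": 60.30, "de": 18.91, "pt": 27.02, "ru": 17.50, "id": 18.77, "bn": 15.42, "ko": 15.28, "zh": 14.96},
    "mBERTAda":  {"en": 56.25, "de": 29.76, "pt": 30.37, "ru": 24.42, "id": 19.15, "bn": 15.12, "ko": 19.09, "zh": 24.86},
}


def target_mean(scores: dict, source: str = "en") -> float:
    """Average accuracy over all languages except the source language."""
    targets = [acc for lang, acc in scores.items() if lang != source]
    return round(sum(targets) / len(targets), 2)


for model, scores in results.items():
    print(f"{model}\t{target_mean(scores)}")
```

The printed values match the mean column: 20.37, 16.26, 18.27, and 23.25.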

Few-Shot

Few-shot dataset sizes. The GQA test-dev set is split into new development and test sets, plus training splits of different sizes. We maintain the distribution of structural types in each split.

Set	#Images	#Questions
Test	300	9666
Dev	50	1422
Train	1	27
Train	5	155
Train	10	317
Train	20	594
Train	25	704
Train	48	1490
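
As a quick sanity check on the sizes above, the average number of questions per image can be computed for each split. The sketch only uses the counts from the table; the split labels are illustrative and are not file names from the repository.

```python
# (images, questions) per split, copied from the table above.
split_sizes = {
    "test": (300, 9666),
    "dev": (50, 1422),
    "train (1 image)": (1, 27),
    "train (5 images)": (5, 155),
    "train (10 images)": (10, 317),
    "train (20 images)": (20, 594),
    "train (25 images)": (25, 704),
    "train (48 images)": (48, 1490),
}


def questions_per_image(images: int, questions: int) -> float:
    """Average number of questions asked about each image in a split."""
    return round(questions / images, 2)


for name, (images, questions) in split_sizes.items():
    print(f"{name}\t{questions_per_image(images, questions)}")
```

Every split comes out at roughly 27 to 32 questions per image, so the splits differ mainly in how many images they cover.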

Citation

If you find this repository helpful, please cite our paper "xGQA: Cross-lingual Visual Question Answering":

@article{pfeiffer-etal-2021-xGQA,
    title = {{xGQA: Cross-Lingual Visual Question Answering}},
    author = {Jonas Pfeiffer and Gregor Geigle and Aishwarya Kamath and Jan-Martin O. Steitz and Stefan Roth and Ivan Vuli{\'{c}} and Iryna Gurevych},
    journal = {arXiv preprint},
    year = {2021},
    url = {https://arxiv.org/pdf/2109.06082.pdf}
}

This work is licensed under a Creative Commons Attribution 4.0 International License.

Owner

AdapterHub
