This is the code for our paper "Iconary: A Pictionary-Based Game for Testing Multimodal Communication with Drawings and Text"

Related tags

Deep Learningiconary
Overview

Iconary

This is the code for our paper "Iconary: A Pictionary-Based Game for Testing Multimodal Communication with Drawings and Text". It includes the datasets, models we trained, and our training/evaluations scripts.

Install

Install python >= 3.6 and pytorch >= 1.7.0. This project has been tested with torch==1.7.1, but later versions might work.

Then install the extra requirements:

pip install -r requirements

Finally add the top-level directory to PYTHONPATH:

cd iconary
export PYTHONPATH=`pwd`

Data

Datasets will be downloaded and cached automatically as needed, file_paths.py shows where the files will be stored. By defaults, datasets are stored in ~/data/iconary.

If you want to download the data manually, the dataest can be downloaded here:

We release the complete datasets without held-out labels since computing the automatic metrics for both the Guesser and Drawer requires the entire game to be known. Models should only be trained on the train set and researchers should avoid looking/evaluating on the test sets as much as possible.

Models

We release the following models on S3:

Guesser:

  • TGuesser: s3://ai2-vision-iconary/public-models/tguesser-3b/
  • w/T5-Large: s3://ai2-vision-iconary/public-models/tguesser-large/
  • w/T5-Base: s3://ai2-vision-iconary/public-models/tguesser-base/

Drawer:

  • TDrawer: s3://ai2-vision-iconary/public-models/tdrawer-large/
  • w/T5-Base: s3://ai2-vision-iconary/public-models/tdrawer-base/

To use these models, download the entire directory. For example:

mkdir -p models
aws s3 cp --recursive s3://ai2-vision-iconary/public-models/tguesser-base models/tguesser-base

Train

Guesser

Train TGuesser with:

python iconary/experiments/train_guesser.py --pretrained_model t5-base --output_dir models/tguesser-base

Note our full model use --pretrained_model t5-b3, but that requries a >16GB RAM GPU to run.

Drawing

Train TDrawer with:

python iconary/experiments/train_drawer.py --pretrained_model t5-base --output_dir models/tdrawer-base --grad_accumulation 2

Note our full model use --pretrained_model t5-large, but that requires a >16GB RAM GPU to run.

Automatic Evaluation

These scripts generate drawings/guesses for games in human/human games, and computes automatic metrics from those drawings/guesses. Note our generation scripts will use all GPUs that they can find with torch.cuda.device_count(), to control where it runs use the CUDA_VISIBLE_DEVICES environment variable.

Guesser

To compute automatic metrics for the Guesser, first generate guesses as:

python iconary/experiments/generate_guesses.py path/to/model --dataset ood-valid --output_file guesses.json --unk_boost 2.0

Note that most of our evaluations are done using --unk_boost 2.0 which implements rare-word boosting.

This script will report our automatic metrics, but they can also be re-computed using:

python iconary/experiments/eval_guesses.py guesses.json

Drawer

Generate drawings with:

python iconary/experiments/generate_drawings.py path/to/model --dataset ood-valid --output_file drawings.json

This script will report our automatic metrics, but they can also be re-computed using:

python iconary/experiments/eval_drawings.py drawings.json

Human/AI Evaluation

Our code for running human/AI games is not currently released, if you are interested in running your own trials contact us and we can help you follow our human/AI setup.

Cite

If you use this work, please cite:

"Iconary: A Pictionary-Based Game for Testing MultimodalCommunication with Drawings and Text". Christopher Clark, Jordi Salvador, Dustin Schwenk, Derrick Bonafilia, Mark Yatskar, Eric Kolve, Alvaro Herrasti, Jonghyun Choi, Sachin Mehta, Sam Skjonsberg, Carissa Schoenick, Aaron Sarnat, Hannaneh Hajishirzi, Aniruddha Kembhavi, Oren Etzioni, Ali Farhadi. In EMNLP 2021.

Official implementation of the NeurIPS 2021 paper Online Learning Of Neural Computations From Sparse Temporal Feedback

Online Learning Of Neural Computations From Sparse Temporal Feedback This repository is the official implementation of the NeurIPS 2021 paper Online L

Lukas Braun 3 Dec 15, 2021
Code accompanying the paper "ProxyFL: Decentralized Federated Learning through Proxy Model Sharing"

ProxyFL Code accompanying the paper "ProxyFL: Decentralized Federated Learning through Proxy Model Sharing" Authors: Shivam Kalra*, Junfeng Wen*, Jess

Layer6 Labs 14 Dec 06, 2022
WaveFake: A Data Set to Facilitate Audio DeepFake Detection

WaveFake: A Data Set to Facilitate Audio DeepFake Detection This is the code repository for our NeurIPS 2021 (Track on Datasets and Benchmarks) paper

Chair for Sys­tems Se­cu­ri­ty 27 Dec 22, 2022
Este conversor criará a medida exata para sua receita de capuccino gelado da grandiosa Rafaella Ballerini!

ConversorDeMedidas_CapuccinoGelado Este conversor criará a medida exata para sua receita de capuccino gelado da grandiosa Rafaella Ballerini! Requirem

Arthur Ottoni Ribeiro 48 Nov 15, 2022
Code for the ECCV2020 paper "A Differentiable Recurrent Surface for Asynchronous Event-Based Data"

A Differentiable Recurrent Surface for Asynchronous Event-Based Data Code for the ECCV2020 paper "A Differentiable Recurrent Surface for Asynchronous

Marco Cannici 21 Oct 05, 2022
Official implementation of Self-supervised Image-to-text and Text-to-image Synthesis

Self-supervised Image-to-text and Text-to-image Synthesis This is the official implementation of Self-supervised Image-to-text and Text-to-image Synth

6 Jul 31, 2022
Yggdrasil - A simplistic bot designed to streamline your server experience

Ygggdrasil A simplistic bot designed to streamline your server experience. Desig

Sntx_ 1 Dec 14, 2022
Code for the paper: Sketch Your Own GAN

Sketch Your Own GAN Project | Paper | Youtube | Slides Our method takes in one or a few hand-drawn sketches and customizes an off-the-shelf GAN to mat

677 Dec 28, 2022
Code for paper "ASAP-Net: Attention and Structure Aware Point Cloud Sequence Segmentation"

ASAP-Net This project implements ASAP-Net of paper ASAP-Net: Attention and Structure Aware Point Cloud Sequence Segmentation (BMVC2020). Overview We i

Hanwen Cao 26 Aug 25, 2022
RLHive: a framework designed to facilitate research in reinforcement learning.

RLHive is a framework designed to facilitate research in reinforcement learning. It provides the components necessary to run a full RL experiment, for both single agent and multi agent environments.

88 Jan 05, 2023
functorch is a prototype of JAX-like composable function transforms for PyTorch.

functorch is a prototype of JAX-like composable function transforms for PyTorch.

Facebook Research 1.2k Jan 09, 2023
OMNIVORE is a single vision model for many different visual modalities

Omnivore: A Single Model for Many Visual Modalities [paper][website] OMNIVORE is a single vision model for many different visual modalities. It learns

Meta Research 451 Dec 27, 2022
In this project I played with mlflow, streamlit and fastapi to create a training and prediction app on digits

Fastapi + MLflow + streamlit Setup env. I hope I covered all. pip install -r requirements.txt Start app Go in the root dir and run these Streamlit str

76 Nov 23, 2022
Official PyTorch implementation of Segmenter: Transformer for Semantic Segmentation

Segmenter: Transformer for Semantic Segmentation Segmenter: Transformer for Semantic Segmentation by Robin Strudel*, Ricardo Garcia*, Ivan Laptev and

594 Jan 06, 2023
DARTS-: Robustly Stepping out of Performance Collapse Without Indicators

[ICLR'21] DARTS-: Robustly Stepping out of Performance Collapse Without Indicators [openreview] Authors: Xiangxiang Chu, Xiaoxing Wang, Bo Zhang, Shun

55 Nov 01, 2022
Learning hidden low dimensional dyanmics using a Generalized Onsager Principle and neural networks

OnsagerNet Learning hidden low dimensional dyanmics using a Generalized Onsager Principle and neural networks This is the original pyTorch implemenati

Haijun.Yu 3 Aug 24, 2022
The official code repo of "HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection"

Hierarchical Token Semantic Audio Transformer Introduction The Code Repository for "HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound

Knut(Ke) Chen 134 Jan 01, 2023
Multi-atlas segmentation (MAS) is a promising framework for medical image segmentation

Multi-atlas segmentation (MAS) is a promising framework for medical image segmentation. Generally, MAS methods register multiple atlases, i.e., medical images with corresponding labels, to a target i

NanYoMy 13 Oct 09, 2022
Physics-informed Neural Operator for Learning Partial Differential Equation

PINO Physics-informed Neural Operator for Learning Partial Differential Equation Abstract: Machine learning methods have recently shown promise in sol

107 Jan 02, 2023
PyKale is a PyTorch library for multimodal learning and transfer learning as well as deep learning and dimensionality reduction on graphs, images, texts, and videos

PyKale is a PyTorch library for multimodal learning and transfer learning as well as deep learning and dimensionality reduction on graphs, images, texts, and videos. By adopting a unified pipeline-ba

PyKale 370 Dec 27, 2022