Iconary

This is the code for our paper "Iconary: A Pictionary-Based Game for Testing Multimodal Communication with Drawings and Text". It includes the datasets, the models we trained, and our training and evaluation scripts.

Install

Install Python >= 3.6 and PyTorch >= 1.7.0. This project has been tested with torch==1.7.1, but later versions might work.

Then install the extra requirements:

pip install -r requirements

Finally add the top-level directory to PYTHONPATH:

cd iconary
export PYTHONPATH=`pwd`

Data

Datasets will be downloaded and cached automatically as needed; file_paths.py shows where the files will be stored. By default, datasets are stored in ~/data/iconary.

If you want to download the data manually, the dataset can be downloaded here:

We release the complete datasets without held-out labels, since computing the automatic metrics for both the Guesser and the Drawer requires the entire game to be known. Models should only be trained on the train set, and researchers should avoid looking at or evaluating on the test sets as much as possible.

Models

We release the following models on S3:

Guesser:

  • TGuesser: s3://ai2-vision-iconary/public-models/tguesser-3b/
  • w/T5-Large: s3://ai2-vision-iconary/public-models/tguesser-large/
  • w/T5-Base: s3://ai2-vision-iconary/public-models/tguesser-base/

Drawer:

  • TDrawer: s3://ai2-vision-iconary/public-models/tdrawer-large/
  • w/T5-Base: s3://ai2-vision-iconary/public-models/tdrawer-base/

To use these models, download the entire directory. For example:

mkdir -p models
aws s3 cp --recursive s3://ai2-vision-iconary/public-models/tguesser-base models/tguesser-base
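
To fetch several of the released models at once, the same command can be generated for each directory listed above. The following is a minimal sketch, not part of the repository: the helper name download_command and the local models/ layout are assumptions that mirror the example command, and the script only prints the commands so they can be reviewed before running.

```python
# S3 prefix and model directory names are copied from the list above.
S3_ROOT = "s3://ai2-vision-iconary/public-models"
MODEL_NAMES = [
    "tguesser-3b",
    "tguesser-large",
    "tguesser-base",
    "tdrawer-large",
    "tdrawer-base",
]


def download_command(name, local_root="models"):
    """Return the `aws s3 cp` argument list for one released model.

    Hypothetical helper: it only mirrors the example command in this README.
    """
    return [
        "aws", "s3", "cp", "--recursive",
        f"{S3_ROOT}/{name}",
        f"{local_root}/{name}",
    ]


if __name__ == "__main__":
    # Print the commands instead of executing them.
    for model_name in MODEL_NAMES:
        print(" ".join(download_command(model_name)))
```

The printed lines can be pasted into a shell after mkdir -p models, or each argument list can be passed to subprocess.run.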

Train

Guesser

Train TGuesser with:

python iconary/experiments/train_guesser.py --pretrained_model t5-base --output_dir models/tguesser-base

Note that our full model uses --pretrained_model t5-3b, but that requires a GPU with more than 16GB of memory to run.

Drawer

Train TDrawer with:

python iconary/experiments/train_drawer.py --pretrained_model t5-base --output_dir models/tdrawer-base --grad_accumulation 2

Note that our full model uses --pretrained_model t5-large, but that requires a GPU with more than 16GB of memory to run.

Automatic Evaluation

These scripts generate drawings/guesses for human/human games and compute automatic metrics from those drawings/guesses. Note that our generation scripts will use all the GPUs they can find with torch.cuda.device_count(); to control where they run, set the CUDA_VISIBLE_DEVICES environment variable.
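
As an illustration of restricting the GPUs, a generation script can be launched from Python with CUDA_VISIBLE_DEVICES set in the child process environment. This is a minimal sketch under assumptions: gpu_env and run_generation are hypothetical helpers, not part of the repository, and the script path and flags are the Guesser generation command given in this README.

```python
import os
import subprocess
import sys


def gpu_env(gpu_ids, base_env=None):
    """Copy an environment and restrict CUDA to the given device ids."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in gpu_ids)
    return env


def run_generation(model_dir, gpu_ids, output_file="guesses.json"):
    """Run generate_guesses.py so that it only sees the selected GPUs."""
    cmd = [
        sys.executable, "iconary/experiments/generate_guesses.py", model_dir,
        "--dataset", "ood-valid",
        "--output_file", output_file,
        "--unk_boost", "2.0",
    ]
    # check=True raises CalledProcessError if the script exits with an error.
    return subprocess.run(cmd, env=gpu_env(gpu_ids), check=True)
```

For example, run_generation("models/tguesser-base", [0, 1]) would expose only devices 0 and 1 to the script. The shell equivalent is to prefix the command with CUDA_VISIBLE_DEVICES=0,1.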

Guesser

To compute automatic metrics for the Guesser, first generate guesses as:

python iconary/experiments/generate_guesses.py path/to/model --dataset ood-valid --output_file guesses.json --unk_boost 2.0

Note that most of our evaluations are done using --unk_boost 2.0, which enables rare-word boosting.

This script will report our automatic metrics, but they can also be re-computed using:

python iconary/experiments/eval_guesses.py guesses.json

Drawer

Generate drawings with:

python iconary/experiments/generate_drawings.py path/to/model --dataset ood-valid --output_file drawings.json

This script will report our automatic metrics, but they can also be re-computed using:

python iconary/experiments/eval_drawings.py drawings.json

Human/AI Evaluation

Our code for running human/AI games is not currently released. If you are interested in running your own trials, contact us and we can help you follow our human/AI setup.

Cite

If you use this work, please cite:

"Iconary: A Pictionary-Based Game for Testing Multimodal Communication with Drawings and Text". Christopher Clark, Jordi Salvador, Dustin Schwenk, Derrick Bonafilia, Mark Yatskar, Eric Kolve, Alvaro Herrasti, Jonghyun Choi, Sachin Mehta, Sam Skjonsberg, Carissa Schoenick, Aaron Sarnat, Hannaneh Hajishirzi, Aniruddha Kembhavi, Oren Etzioni, Ali Farhadi. In EMNLP 2021.
