The World of an Octopus: How Reporting Bias Influences a Language Model's Perception of Color

Last update: Nov 13, 2021

Overview

Code and dataset for The World of an Octopus: How Reporting Bias Influences a Language Model's Perception of Color.

This repository is roughly split into two parts:

probing: The probing implementations, including code for generating CoDa.
mturk-survey: Instruction pages used for crowdsourcing annotations.

How to use

Using CoDa

If you'd like to use CoDa, we highly recommend the version hosted on the Hugging Face Hub, as it requires no additional dependencies beyond the datasets library.

from datasets import load_dataset

ds = load_dataset('corypaik/coda')

You can find more details about how to use Hugging Face Datasets in its official documentation.
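Once loaded, a quick first step is to check which splits exist and what columns they carry. The following is a minimal sketch, not part of the repository: the helper names `split_sizes` and `describe_coda` are hypothetical, and it makes no assumption about CoDa's split names or column schema; it only prints whatever the Hub returns. Whether the load call succeeds may depend on the installed `datasets` version.

```python
def split_sizes(dataset_dict):
    """Return {split_name: number_of_rows} for a DatasetDict-like mapping."""
    return {name: len(split) for name, split in dataset_dict.items()}


def describe_coda(dataset_id="corypaik/coda"):
    """Load CoDa from the Hub and print its splits, sizes, and features.

    Requires network access and the `datasets` package, so the import is
    kept local to this function.
    """
    from datasets import load_dataset

    ds = load_dataset(dataset_id)
    for name, num_rows in split_sizes(ds).items():
        print(f"{name}: {num_rows} rows")
    # Inspect the schema of the first split instead of assuming column names.
    first_split = next(iter(ds.values()))
    print(first_split.features)
    return ds
```

Calling `describe_coda()` in an interactive session prints one line per split followed by the feature definitions, which is usually enough to decide which columns to probe.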

Running experiments

This repository is developed and tested on Linux systems and uses Bazel. If you are on another platform, consider running Bazel in a Docker container. If you'd like more guidance on this, please open an issue on GitHub.

First, clone the project:

# clone project
git clone https://github.com/nala-cub/coda

# go to project
cd coda

You can run individual tasks as follows:

# run zeroshot
bazel run //projects/coda/probing/zeroshot
# representation probing
bazel run //projects/coda/probing/representations
# ngrams
bazel run //projects/coda/probing/ngram_stats
# generate dataset from annotations (relative to workspace root)
bazel run //projects/coda/probing/dataset:create_dataset -- \
  --coda_ds_export_dir=<export_dir>

To see help for any of the commands, use:

bazel run <target> -- --help
# for example:
# bazel run //projects/coda/probing/zeroshot -- --help
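If you want to launch several of these tasks from a script, the commands above can be assembled programmatically. This is a minimal sketch under stated assumptions: the short task names and the helpers `bazel_run_command` and `run_task` are hypothetical conveniences, not part of the repository; only the Bazel targets themselves are taken from the commands listed above.

```python
import subprocess

# Bazel targets copied from the commands above; the short keys are
# illustrative names chosen for this sketch.
TARGETS = {
    "zeroshot": "//projects/coda/probing/zeroshot",
    "representations": "//projects/coda/probing/representations",
    "ngram_stats": "//projects/coda/probing/ngram_stats",
    "create_dataset": "//projects/coda/probing/dataset:create_dataset",
}


def bazel_run_command(task, *program_args):
    """Build the argument list for `bazel run <target> [-- args...]`."""
    cmd = ["bazel", "run", TARGETS[task]]
    if program_args:
        # Everything after `--` is passed to the program, not to Bazel.
        cmd += ["--", *program_args]
    return cmd


def run_task(task, *program_args, workspace="."):
    """Run a task from the workspace root; raises if Bazel exits non-zero."""
    return subprocess.run(
        bazel_run_command(task, *program_args), cwd=workspace, check=True
    )
```

For example, `run_task("zeroshot", "--help", workspace="coda")` would be equivalent to running `bazel run //projects/coda/probing/zeroshot -- --help` from the cloned project directory.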

Annotation Instructions

Annotations were collected using an Angular app on Firebase. The included files contain all instructions, but not the app itself. If you're interested in the latter, please open an issue on GitHub.

Citation

If this code was useful, please cite the paper:

@misc{paik2021world,
      title={The World of an Octopus: How Reporting Bias Influences a Language Model's Perception of Color},
      author={Cory Paik and Stéphane Aroca-Ouellette and Alessandro Roncone and Katharina Kann},
      year={2021},
      eprint={2110.08182},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

License

CoDa is licensed under the Apache 2.0 license. The text of the license can be found here.
