COPA-SSE contains crowdsourced explanations for the Balanced COPA dataset

COPA-SSE

Repository for COPA-SSE: Semi-Structured Explanations for Commonsense Reasoning.

Crowdsourcing protocol

COPA-SSE contains crowdsourced explanations for the Balanced COPA dataset, a variant of the Choice of Plausible Alternatives (COPA) benchmark. Each explanation is formatted as a set of triple-like commonsense statements that use ConceptNet relations but freely written concepts.

Data format

dev-explained.jsonl and test-explained.jsonl each contain Balanced COPA samples with added explanations, stored as one JSON object per line (.jsonl format). The question ids match those of the original development and test sets, respectively.

Each entry contains:

  • the original question (matching format and ids)
  • human-explanations: a list of explanations each containing:
    • expl-id: the explanation id
    • text: the explanation in plain text (full sentences)
    • worker-id: anonymized worker id (the author of the explanation)
    • worker-avg: the average score the author got for their explanations
    • all-ratings: all collected ratings for the explanation
    • filtered-ratings: ratings excluding those that failed the control
    • filtered-avg-rating: the average of the filtered ratings
    • triples: the triple-form explanation (a list of ConceptNet-like triples)

Example entry:

{"id": 1,
 "asks-for": "cause",
 "most-plausible-alternative": 1,
 "p": "My body cast a shadow over the grass.",
 "a1": "The sun was rising.",
 "a2": "The grass was cut.",
 "human-explanations": [
    {"expl-id": "f4d9b407-681b-4340-9be1-ac044f1c2230",
     "text": "Sunrise causes casted shadows.",
     "worker-id": "3a71407b-9431-49f9-b3ca-1641f7c05f3b",
     "worker-avg": 3.5832864694635025,
     "all-ratings": [1, 3, 3, 4, 3],
     "filtered-ratings": [3, 3, 4, 3],
     "filtered-avg-rating": 3.25,
     "triples": [["sunrise", "Causes", "casted shadows"]]
    }, ...]}
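
As a minimal sketch of reading this format, the snippet below parses JSON Lines text and picks the explanation with the highest mean filtered rating. The helper names read_jsonl and best_explanation are illustrative and not part of the repository, the file path in the comment is an assumption about where the data lives, and the sample line reuses the (truncated) example entry above.

```python
import json
from statistics import mean


def read_jsonl(lines):
    """Parse an iterable of JSON Lines strings into a list of dicts."""
    return [json.loads(line) for line in lines if line.strip()]


def best_explanation(entry):
    """Return the explanation with the highest mean filtered rating, or None."""
    rated = [
        expl
        for expl in entry.get("human-explanations", [])
        if expl.get("filtered-ratings")  # skip explanations with no surviving ratings
    ]
    if not rated:
        return None
    return max(rated, key=lambda expl: mean(expl["filtered-ratings"]))


# With a file on disk (the path is an assumption):
# with open("dev-explained.jsonl", encoding="utf-8") as f:
#     entries = read_jsonl(f)

# In-memory stand-in built from the example entry shown above.
sample_line = json.dumps({
    "id": 1,
    "asks-for": "cause",
    "most-plausible-alternative": 1,
    "p": "My body cast a shadow over the grass.",
    "a1": "The sun was rising.",
    "a2": "The grass was cut.",
    "human-explanations": [{
        "expl-id": "f4d9b407-681b-4340-9be1-ac044f1c2230",
        "text": "Sunrise causes casted shadows.",
        "all-ratings": [1, 3, 3, 4, 3],
        "filtered-ratings": [3, 3, 4, 3],
        "triples": [["sunrise", "Causes", "casted shadows"]],
    }],
})

entries = read_jsonl([sample_line])
best = best_explanation(entries[0])
print(best["text"], mean(best["filtered-ratings"]))
```

Ranking by the mean of filtered-ratings is only one possible choice; the stored filtered-avg-rating field could be used directly where it is present.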

Aggregated versions

graphs.pkl contains aggregated versions of the triples for each question, stored as a dictionary keyed by COPA question id.

Each entry is a list of edges, each a tuple of (u, v, {'rel': relation, 'weight': weight}). Similar nodes were either merged or connected with a relatedto edge, depending on the cosine similarity between their SentenceTransformer embeddings. The weight of an edge is the average score of the explanation it originated from (summed if it originated from multiple explanations), or 1.0 if the edge was automatically generated.
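
The weighting rule can be sketched as follows. This is not the authors' aggregation code: the function names are hypothetical, the normalization (lowercasing and replacing spaces with underscores) is inferred from the example entries, and the similarity-based merging and relatedto edges are left out because their thresholds are not stated here.

```python
from collections import defaultdict


def normalize(concept):
    """Lowercase and join words with underscores (assumed from the examples)."""
    return "_".join(concept.lower().split())


def aggregate_edges(explanations):
    """Sum each explanation's average rating over the triples it contributes."""
    weights = defaultdict(float)
    for expl in explanations:
        score = expl["filtered-avg-rating"]
        for head, rel, tail in expl["triples"]:
            key = (normalize(head), normalize(tail), rel.lower())
            weights[key] += score  # summed if several explanations share the edge
    return [
        (u, v, {"rel": rel, "weight": weight})
        for (u, v, rel), weight in weights.items()
    ]


# One explanation from the example entry plus a made-up second one
# (the 2.0 score is invented purely to show the summing behaviour).
explanations = [
    {"filtered-avg-rating": 3.25,
     "triples": [["sunrise", "Causes", "casted shadows"]]},
    {"filtered-avg-rating": 2.0,
     "triples": [["Sunrise", "Causes", "casted shadows"]]},
]

single = aggregate_edges(explanations[:1])
combined = aggregate_edges(explanations)
print(single)
print(combined)
```

With only the first explanation, the resulting edge carries weight 3.25, which is consistent with the first edge in the example entry.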

  • Note: not all graphs are (weakly) connected.

Example entry:

1: [('sunrise', 'casted_shadows', {'rel': 'causes', 'weight': 3.25}),
  ('sunrise', 'sun', {'rel': 'relatedto', 'weight': 1.0}),
  ('casted_shadows', 'the_shadow', {'rel': 'relatedto', 'weight': 1.0}),
  ('sun_rising', 'bringing_light', {'rel': 'hasproperty', 'weight': 4.25}),
  ('sun_rising', 'a_sun_raising', {'rel': 'relatedto', 'weight': 1.0}),
 ...
]
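
Since not all graphs are weakly connected, it can be useful to split a graph into its weakly connected components before further processing. The sketch below uses only the standard library; load_graphs and weak_components are illustrative names, the file path is an assumption, and the edge list contains only the five edges visible in the example above (the elided edges may connect the two groups).

```python
import pickle
from collections import defaultdict, deque


def load_graphs(path):
    """Load the question-id -> edge-list dictionary (only unpickle trusted files)."""
    with open(path, "rb") as f:
        return pickle.load(f)


def weak_components(edges):
    """Return the weakly connected components of a list of (u, v, attrs) edges."""
    adjacency = defaultdict(set)
    for u, v, _attrs in edges:
        adjacency[u].add(v)  # ignore edge direction
        adjacency[v].add(u)
    seen, components = set(), []
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        component, queue = {start}, deque([start])
        while queue:  # breadth-first search from the unseen node
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    component.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return components


# graphs = load_graphs("graphs.pkl")  # path is an assumption
example_edges = [
    ("sunrise", "casted_shadows", {"rel": "causes", "weight": 3.25}),
    ("sunrise", "sun", {"rel": "relatedto", "weight": 1.0}),
    ("casted_shadows", "the_shadow", {"rel": "relatedto", "weight": 1.0}),
    ("sun_rising", "bringing_light", {"rel": "hasproperty", "weight": 4.25}),
    ("sun_rising", "a_sun_raising", {"rel": "relatedto", "weight": 1.0}),
]

components = weak_components(example_edges)
print(len(components), [sorted(c) for c in components])
```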

Citation

Thank you for your interest in our dataset! If you use it in your research, please cite:

@misc{brassard2022copasse,
    title={COPA-SSE: Semi-structured Explanations for Commonsense Reasoning},
    author={Ana Brassard and Benjamin Heinzerling and Pride Kavumba and Kentaro Inui},
    year={2022},
    eprint={2201.06777},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}