Repository for the Bias Benchmark for QA dataset.

Related tags

Deep LearningBBQ
Overview

BBQ

Repository for the Bias Benchmark for QA dataset.

Authors: Alicia Parrish, Angelica Chen, Nikita Nangia, Vishakh Padmakumar, Jason Phang, Jana Thompson, Phu Mon Htut, and Samuel R. Bowman.

About BBQ

It is well documented that NLP models learnsocial biases present in the world, but littlework has been done to show how these biasesmanifest in actual model outputs for appliedtasks like question answering (QA). We introduce the Bias Benchmark for QA (BBQ), adataset consisting of question-sets constructedby the authors that highlightattestedsocialbiases against people belonging to protectedclasses along nine different social dimensionsrelevant for U.S. English-speaking contexts.Our task evaluates model responses at two distinct levels: (i) given an under-informative context, test how strongly model answers reflectsocial biases, and (ii) given an adequately informative context, test whether the model’s biases still override a correct answer choice. Wefind that models strongly rely on stereotypeswhen the context is ambiguous, meaning thatthe model’s outputs consistently reproduceharmful biases in this setting. Though modelsare much more accurate when the context provides an unambiguous answer, they still relyon stereotyped information and achieve an accuracy 2.5 percentage points higher on examples where the correct answer aligns with a social bias, with this accuracy difference widening to over 5 points for examples targeting gender.

The paper

You can read our paper "BBQ: A Hand-Built Bias Benchmark for Question Answering" here.

File structure

  • data
    • Description: This folder contains each set of generated examples for BBQ. This is the folder you would use to test BBQ.
    • Contents: 11 jsonl files, each containing all templated examples. Each category is a separate file.
  • results
    • Description: This folder contains our results after running BBQ on UnifiedQA
    • Contents: 11 jsonl files, each containing all templated examples and three sets of results for each example line:
      • Predictions using ARC-format
      • Predictions using RACE-format
      • Predictions using a question-only baseline
  • supplemental
    • Description: Additional files used in validation and selecting names for the vocabulary
    • Contents:
      • MTurk_validation contains the HIT templates, scripts, input data, and results from our MTurk validations
      • name_job_data contains files downloaded that contain name & demographic information or occupation prestige scores for developing these portions of the vocabulary
  • templates
    • Description: This folder contains all the templates and vocabulary used to create BBQ
    • Contents: 11 csv files that contain the templates used in BBQ, 1 csv file listing all filler items used in the validation, 2 csv files for the BBQ vocabulary.
Owner
ML² AT CILVR
The Machine Learning for Language Group at NYU CILVR
ML² AT CILVR
It's final year project of Diploma Engineering. This project is based on Computer Vision.

Face-Recognition-Based-Attendance-System It's final year project of Diploma Engineering. This project is based on Computer Vision. Brief idea about ou

Neel 10 Nov 02, 2022
This is an official implementation of "Polarized Self-Attention: Towards High-quality Pixel-wise Regression"

Polarized Self-Attention: Towards High-quality Pixel-wise Regression This is an official implementation of: Huajun Liu, Fuqiang Liu, Xinyi Fan and Don

DeLightCMU 212 Jan 08, 2023
Python script that analyses the given datasets and comes up with the best polynomial regression representation with the smallest polynomial degree possible

Python script that analyses the given datasets and comes up with the best polynomial regression representation with the smallest polynomial degree possible, to be the most reliable with the least com

Nikolas B Virionis 2 Aug 01, 2022
Supporting code for "Autoregressive neural-network wavefunctions for ab initio quantum chemistry".

naqs-for-quantum-chemistry This repository contains the codebase developed for the paper Autoregressive neural-network wavefunctions for ab initio qua

Tom Barrett 24 Dec 23, 2022
This is the source code for our ICLR2021 paper: Adaptive Universal Generalized PageRank Graph Neural Network.

GPRGNN This is the source code for our ICLR2021 paper: Adaptive Universal Generalized PageRank Graph Neural Network. Hidden state feature extraction i

Jianhao 92 Jan 03, 2023
3D-Reconstruction 基于深度学习方法的单目多视图三维重建

基于深度学习方法的单目多视图三维重建 Part I 三维重建 代码:Part1 技术文档:[Markdown] [PDF] 原始图像:Original Images 点云结果:Point Cloud Results-1

HMT_Curo 19 Dec 26, 2022
Code for the paper "Attention Approximates Sparse Distributed Memory"

Attention Approximates Sparse Distributed Memory - Codebase This is all of the code used to run analyses in the paper "Attention Approximates Sparse D

Trenton Bricken 14 Dec 05, 2022
Open-CyKG: An Open Cyber Threat Intelligence Knowledge Graph

Open-CyKG: An Open Cyber Threat Intelligence Knowledge Graph Model Description Open-CyKG is a framework that is constructed using an attenti

Injy Sarhan 34 Jan 05, 2023
Graph Posterior Network: Bayesian Predictive Uncertainty for Node Classification (NeurIPS 2021)

Graph Posterior Network This is the official code repository to the paper Graph Posterior Network: Bayesian Predictive Uncertainty for Node Classifica

Maximilian Stadler 30 Dec 05, 2022
Image Segmentation with U-Net Algorithm on Carvana Dataset using AWS Sagemaker

Image Segmentation with U-Net Algorithm on Carvana Dataset using AWS Sagemaker This is a full project of image segmentation using the model built with

Htin Aung Lu 1 Jan 04, 2022
Multiwavelets-based operator model

Multiwavelet model for Operator maps Gaurav Gupta, Xiongye Xiao, and Paul Bogdan Multiwavelet-based Operator Learning for Differential Equations In Ne

Gaurav 33 Dec 04, 2022
Enigma-Plus - Python based Enigma machine simulator with some extra features

Enigma-Plus Python based Enigma machine simulator with some extra features Examp

1 Jan 05, 2022
Does Oversizing Improve Prosumer Profitability in a Flexibility Market? - A Sensitivity Analysis using PV-battery System

Does Oversizing Improve Prosumer Profitability in a Flexibility Market? - A Sensitivity Analysis using PV-battery System The possibilities to involve

Babu Kumaran Nalini 0 Nov 19, 2021
Xi Dongbo 78 Nov 29, 2022
Official PyTorch implementation of Segmenter: Transformer for Semantic Segmentation

Segmenter: Transformer for Semantic Segmentation Segmenter: Transformer for Semantic Segmentation by Robin Strudel*, Ricardo Garcia*, Ivan Laptev and

594 Jan 06, 2023
Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context Code in both PyTorch and TensorFlow

Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context This repository contains the code in both PyTorch and TensorFlow for our paper

Zhilin Yang 3.3k Jan 06, 2023
Vehicles Counting using YOLOv4 + DeepSORT + Flask + Ngrok

A project for counting vehicles using YOLOv4 + DeepSORT + Flask + Ngrok

Duong Tran Thanh 37 Dec 16, 2022
Neural Module Network for VQA in Pytorch

Neural Module Network (NMN) for VQA in Pytorch Note: This is NOT an official repository for Neural Module Networks. NMN is a network that is assembled

Harsh Trivedi 111 Nov 24, 2022
A keras implementation of ENet (abandoned for the foreseeable future)

ENet-keras This is an implementation of ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation, ported from ENet-training (lua-t

Pavlos 115 Nov 23, 2021
Framework web SnakeServer.

SnakeServer - Framework Web 🐍 Documentação oficial do framework SnakeServer. Conteúdo Sobre Como contribuir Enviar relatórios de segurança Pull reque

Jaedson Silva 0 Jul 21, 2022