Measuring and Improving Consistency in Pretrained Language Models

Related tags

Deep Learningpararel
Overview

ParaRel 🀘

This repository contains the code and data for the paper:

Measuring and Improving Consistency in Pretrained Language Models

as well as the resource: ParaRel 🀘

Since this work required running a lot of experiments, it is structured by scripts that automatically runs many sub-experiments, on parallel servers, and tracking using an experiment tracking website: wandb, which are then aggregated using a jupyter notebook. To run all the experiments I used task spooler, a queue-based software that allows to run multiple commands in parallel (and store the rest in a queue)

It is also possible to run individual experiments, for which one can look for in the corresponding script.

For any question, query regarding the code, or paper, please reach out at [email protected]

ParaRel 🀘

If you're only interested in the data, you can find it under data. Each file contains the paraphrases patterns for a specific relation, in a json file.

Create environment

conda create -n pararel python=3.7 anaconda
conda activate pararel

pip install -r requirements.txt

add project to path:

export PYTHONPATH=${PYTHONPATH}:/path-to-project

Setup

In case you just want to start with the filtered data we used (filtering objects that consist more than a single word piece in the LMs we considered), you can find them here. Otherwise:

First, begin by downloading the trex dataset from here, alternatively, check out the LAMA github repo. Download it to the following folder so that the following folder would exist: data/trex/data/TREx along with the relevant files

Next, in case you want to rerun automatically some/all of the experiments, you will need to update the paths in the runs scripts with your folder path and virtual environment.

Run Scripts

Filter data from trex, to include only triplets that appear in the inspected LMs in this work: bert-base-cased, roberta-base, albert-base-v2 (as well as the larger versions, that contain the same vocabulary)

python runs/pararel/filter.py

A single run looks like the following:

python lm_meaning/lm_entail/filter_data.py \
       --in_data data/trex/data/TREx/P106.jsonl \
       --model_names bert-base-cased,bert-large-cased,bert-large-cased-whole-word-masking,roberta-base,roberta-large,albert-base-v2,albert-xxlarge-v2 \
       --out_file data/trex_lms_vocab/P106.jsonl

Evaluate consistency:

python runs/eval/run_lm_consistent.py

A single run looks like the following:

python pararel/consistency/encode_consistency_probe.py \
       --data_file data/trex_lms_vocab/P106.jsonl \
       --lm bert-base-cased \
       --graph data/pattern_data/graphs/P106.graph \
       --gpu 0 \
       --wandb \
       --use_targets

Encode the patterns along with the subjects, to save the representations:

python runs/pararel/encode_text.py

A single run looks like the following:

python lm_meaning/encode/encode_text.py \
       --patterns_file data/pattern_data/graphs_json/P106.jsonl \
       --data_file data/trex_lms_vocab/P106.jsonl \
       --lm bert-base-cased \
       --pred_file data/output/representations/P106_bert-base-cased.npy \
       --wandb

Improving Consistency with ParaRel

The code and README are available here

FAQ

Q: Why do you report 31 N-1 relations, whereas in the LAMA paper there are only 25?

A: Explanation

Citation:

If you find this work relevant to yours, please cite us:

@article{Elazar2021MeasuringAI,
  title={Measuring and Improving Consistency in Pretrained Language Models},
  author={Yanai Elazar and Nora Kassner and Shauli Ravfogel and Abhilasha Ravichander and Ed Hovy and Hinrich Schutze and Yoav Goldberg},
  journal={ArXiv},
  year={2021},
  volume={abs/2102.01017}
}
Owner
Yanai Elazar
PhD student at Bar-Ilan University, Israel
Yanai Elazar
ONNX-PackNet-SfM: Python scripts for performing monocular depth estimation using the PackNet-SfM model in ONNX

Python scripts for performing monocular depth estimation using the PackNet-SfM model in ONNX

Ibai Gorordo 14 Dec 09, 2022
Official repository of OFA. Paper: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework

Paper | Blog OFA is a unified multimodal pretrained model that unifies modalities (i.e., cross-modality, vision, language) and tasks (e.g., image gene

OFA Sys 1.4k Jan 08, 2023
Pytorch implementation for "Distribution-Balanced Loss for Multi-Label Classification in Long-Tailed Datasets" (ECCV 2020 Spotlight)

Distribution-Balanced Loss [Paper] The implementation of our paper Distribution-Balanced Loss for Multi-Label Classification in Long-Tailed Datasets (

Tong WU 304 Dec 22, 2022
Training and Evaluation Code for Neural Volumes

Neural Volumes This repository contains training and evaluation code for the paper Neural Volumes. The method learns a 3D volumetric representation of

Meta Research 370 Dec 08, 2022
Using NumPy to solve the equations of fluid mechanics together with Finite Differences, explicit time stepping and Chorin's Projection methods

Computational Fluid Dynamics in Python Using NumPy to solve the equations of fluid mechanics 🌊 🌊 🌊 together with Finite Differences, explicit time

Felix KΓΆhler 4 Nov 12, 2022
Official code for paper "ISNet: Costless and Implicit Image Segmentation for Deep Classifiers, with Application in COVID-19 Detection"

Official code for paper "ISNet: Costless and Implicit Image Segmentation for Deep Classifiers, with Application in COVID-19 Detection". LRPDenseNet.py

Pedro Ricardo Ariel Salvador Bassi 2 Sep 21, 2022
DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective.

DeepSpeed+Megatron trained the world's most powerful language model: MT-530B DeepSpeed is hiring, come join us! DeepSpeed is a deep learning optimizat

Microsoft 8.4k Dec 28, 2022
Efficient neural networks for analog audio effect modeling

micro-TCN Efficient neural networks for audio effect modeling

Christian Steinmetz 94 Dec 29, 2022
Activity tragle - Google is tracking everything, we just look at it

activity_tragle Google is tracking everything, we just look at it here. You need

BERNARD Guillaume 1 Feb 15, 2022
Official Code Release for Container : Context Aggregation Network

Container: Context Aggregation Network Official Code Release for Container : Context Aggregation Network Comparion between CNN, MLP-Mixer and Transfor

peng gao 42 Nov 17, 2021
LoFTR:Detector-Free Local Feature Matching with Transformers CVPR 2021

LoFTR-with-train-script LoFTR:Detector-Free Local Feature Matching with Transformers CVPR 2021 (with train script --- unofficial ---). About Megadepth

Nan Xiaohu 15 Nov 04, 2022
This is the repository for Learning to Generate Piano Music With Sustain Pedals

SusPedal-Gen This is the official repository of Learning to Generate Piano Music With Sustain Pedals Demo Page Dataset The dataset used in this projec

Joann Ching 12 Sep 02, 2022
[ICLR'19] Trellis Networks for Sequence Modeling

TrellisNet for Sequence Modeling This repository contains the experiments done in paper Trellis Networks for Sequence Modeling by Shaojie Bai, J. Zico

CMU Locus Lab 460 Oct 13, 2022
ExCon: Explanation-driven Supervised Contrastive Learning

ExCon: Explanation-driven Supervised Contrastive Learning Link to the paper: https://arxiv.org/pdf/2111.14271.pdf Contributors of this repo: Zhibo Zha

Zhibo (Darren) Zhang 18 Nov 01, 2022
A benchmark dataset for mesh multi-label-classification based on cube engravings introduced in MeshCNN

Double Cube Engravings This script creates a dataset for multi-label mesh clasification, with an intentionally difficult setup for point cloud classif

Yotam Erel 1 Nov 30, 2021
A general-purpose encoder-decoder framework for Tensorflow

READ THE DOCUMENTATION CONTRIBUTING A general-purpose encoder-decoder framework for Tensorflow that can be used for Machine Translation, Text Summariz

Google 5.5k Jan 07, 2023
Bayesian Meta-Learning Through Variational Gaussian Processes

vmgp This is the repository of Vivek Myers and Nikhil Sardana for our CS 330 final project, Bayesian Meta-Learning Through Variational Gaussian Proces

Vivek Myers 2 Nov 17, 2022
PyTorch deep learning projects made easy.

PyTorch Template Project PyTorch deep learning project made easy. PyTorch Template Project Requirements Features Folder Structure Usage Config file fo

Victor Huang 3.8k Jan 01, 2023
Animate molecular orbital transitions using Psi4 and Blender

Molecular Orbital Transitions (MOT) Animate molecular orbital transitions using Psi4 and Blender Author: Maximilian Paradiz Dominguez, University of A

3 Feb 01, 2022