LAnguage Model Analysis

Related tags

Deep LearningLAMA
Overview

LAMA: LAnguage Model Analysis

LAMA

LAMA is a probe for analyzing the factual and commonsense knowledge contained in pretrained language models.

The dataset for the LAMA probe is available at https://dl.fbaipublicfiles.com/LAMA/data.zip

LAMA contains a set of connectors to pretrained language models.
LAMA exposes a transparent and unique interface to use:

  • Transformer-XL (Dai et al., 2019)
  • BERT (Devlin et al., 2018)
  • ELMo (Peters et al., 2018)
  • GPT (Radford et al., 2018)
  • RoBERTa (Liu et al., 2019)

Actually, LAMA is also a beautiful animal.

Reference:

The LAMA probe is described in the following papers:

@inproceedings{petroni2019language,
  title={Language Models as Knowledge Bases?},
  author={F. Petroni, T. Rockt{\"{a}}schel, A. H. Miller, P. Lewis, A. Bakhtin, Y. Wu and S. Riedel},
  booktitle={In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019},
  year={2019}
}

@inproceedings{petroni2020how,
  title={How Context Affects Language Models' Factual Predictions},
  author={Fabio Petroni and Patrick Lewis and Aleksandra Piktus and Tim Rockt{\"a}schel and Yuxiang Wu and Alexander H. Miller and Sebastian Riedel},
  booktitle={Automated Knowledge Base Construction},
  year={2020},
  url={https://openreview.net/forum?id=025X0zPfn}
}

The LAMA probe

To reproduce our results:

1. Create conda environment and install requirements

(optional) It might be a good idea to use a separate conda environment. It can be created by running:

conda create -n lama37 -y python=3.7 && conda activate lama37
pip install -r requirements.txt

2. Download the data

wget https://dl.fbaipublicfiles.com/LAMA/data.zip
unzip data.zip
rm data.zip

3. Download the models

DISCLAIMER: ~55 GB on disk

Install spacy model

python3 -m spacy download en

Download the models

chmod +x download_models.sh
./download_models.sh

The script will create and populate a pre-trained_language_models folder. If you are interested in a particular model please edit the script.

4. Run the experiments

python scripts/run_experiments.py

results will be logged in output/ and last_results.csv.

Other versions of LAMA

LAMA-UHN

This repository also provides a script (scripts/create_lama_uhn.py) to create the data used in (Poerner et al., 2019).

Negated-LAMA

This repository also gives the option to evalute how pretrained language models handle negated probes (Kassner et al., 2019), set the flag use_negated_probes in scripts/run_experiments.py. Also, you should use this version of the LAMA probe https://dl.fbaipublicfiles.com/LAMA/negated_data.tar.gz

What else can you do with LAMA?

1. Encode a list of sentences

and use the vectors in your downstream task!

pip install -e git+https://github.com/facebookresearch/LAMA#egg=LAMA
import argparse
from lama.build_encoded_dataset import encode, load_encoded_dataset

PARAMETERS= {
        "lm": "bert",
        "bert_model_name": "bert-large-cased",
        "bert_model_dir":
        "pre-trained_language_models/bert/cased_L-24_H-1024_A-16",
        "bert_vocab_name": "vocab.txt",
        "batch_size": 32
        }

args = argparse.Namespace(**PARAMETERS)

sentences = [
        ["The cat is on the table ."],  # single-sentence instance
        ["The dog is sleeping on the sofa .", "He makes happy noises ."],  # two-sentence
        ]

encoded_dataset = encode(args, sentences)
print("Embedding shape: %s" % str(encoded_dataset[0].embedding.shape))
print("Tokens: %r" % encoded_dataset[0].tokens)

# save on disk the encoded dataset
encoded_dataset.save("test.pkl")

# load from disk the encoded dataset
new_encoded_dataset = load_encoded_dataset("test.pkl")
print("Embedding shape: %s" % str(new_encoded_dataset[0].embedding.shape))
print("Tokens: %r" % new_encoded_dataset[0].tokens)

2. Fill a sentence with a gap.

You should use the symbol [MASK] to specify the gap. Only single-token gap supported - i.e., a single [MASK].

python lama/eval_generation.py  \
--lm "bert"  \
--t "The cat is on the [MASK]."

cat_on_the_phone

cat_on_the_phone

source: https://commons.wikimedia.org/wiki/File:Bluebell_on_the_phone.jpg

Note that you could use this functionality to answer cloze-style questions, such as:

python lama/eval_generation.py  \
--lm "bert"  \
--t "The theory of relativity was developed by [MASK] ."

Install LAMA with pip

Clone the repo

git clone [email protected]:facebookresearch/LAMA.git && cd LAMA

Install as an editable package:

pip install --editable .

If you get an error in mac os x, please try running this instead

CFLAGS="-Wno-deprecated-declarations -std=c++11 -stdlib=libc++" pip install --editable .

Language Model(s) options

Option to indicate which language model(s) to use:

  • --language-models/--lm : comma separated list of language models (REQUIRED)

BERT

BERT pretrained models can be loaded both: (i) passing the name of the model and using huggingface cached versions or (ii) passing the folder containing the vocabulary and the PyTorch pretrained model (look at convert_tf_checkpoint_to_pytorch in here to convert the TensorFlow model to PyTorch).

  • --bert-model-dir/--bmd : directory that contains the BERT pre-trained model and the vocabulary
  • --bert-model-name/--bmn : name of the huggingface cached versions of the BERT pre-trained model (default = 'bert-base-cased')
  • --bert-vocab-name/--bvn : name of vocabulary used to pre-train the BERT model (default = 'vocab.txt')

RoBERTa

  • --roberta-model-dir/--rmd : directory that contains the RoBERTa pre-trained model and the vocabulary (REQUIRED)
  • --roberta-model-name/--rmn : name of the RoBERTa pre-trained model (default = 'model.pt')
  • --roberta-vocab-name/--rvn : name of vocabulary used to pre-train the RoBERTa model (default = 'dict.txt')

ELMo

  • --elmo-model-dir/--emd : directory that contains the ELMo pre-trained model and the vocabulary (REQUIRED)
  • --elmo-model-name/--emn : name of the ELMo pre-trained model (default = 'elmo_2x4096_512_2048cnn_2xhighway')
  • --elmo-vocab-name/--evn : name of vocabulary used to pre-train the ELMo model (default = 'vocab-2016-09-10.txt')

Transformer-XL

  • --transformerxl-model-dir/--tmd : directory that contains the pre-trained model and the vocabulary (REQUIRED)
  • --transformerxl-model-name/--tmn : name of the pre-trained model (default = 'transfo-xl-wt103')

GPT

  • --gpt-model-dir/--gmd : directory that contains the gpt pre-trained model and the vocabulary (REQUIRED)
  • --gpt-model-name/--gmn : name of the gpt pre-trained model (default = 'openai-gpt')

Evaluate Language Model(s) Generation

options:

  • --text/--t : text to compute the generation for
  • --i : interactive mode
    one of the two is required

example considering both BERT and ELMo:

python lama/eval_generation.py \
--lm "bert,elmo" \
--bmd "pre-trained_language_models/bert/cased_L-24_H-1024_A-16/" \
--emd "pre-trained_language_models/elmo/original/" \
--t "The cat is on the [MASK]."

example considering only BERT with the default pre-trained model, in an interactive fashion:

python lamas/eval_generation.py  \
--lm "bert"  \
--i

Get Contextual Embeddings

python lama/get_contextual_embeddings.py \
--lm "bert,elmo" \
--bmn bert-base-cased \
--emd "pre-trained_language_models/elmo/original/"

Unified vocabulary

The intersection of the vocabularies for all considered models

Troubleshooting

If the module cannot be found, preface the python command with PYTHONPATH=.

If the experiments fail on GPU memory allocation, try reducing batch size.

Acknowledgements

Other References

  • (Kassner et al., 2019) Nora Kassner, Hinrich Schütze. Negated LAMA: Birds cannot fly. arXiv preprint arXiv:1911.03343, 2019.

  • (Poerner et al., 2019) Nina Poerner, Ulli Waltinger, and Hinrich Schütze. BERT is Not a Knowledge Base (Yet): Factual Knowledge vs. Name-Based Reasoning in Unsupervised QA. arXiv preprint arXiv:1911.03681, 2019.

  • (Dai et al., 2019) Zihang Dai, Zhilin Yang, Yiming Yang, Jaime G. Carbonell, Quoc V. Le, and Ruslan Salakhutdi. Transformer-xl: Attentive language models beyond a fixed-length context. CoRR, abs/1901.02860.

  • (Peters et al., 2018) Matthew E. Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, and Luke Zettlemoyer. 2018. Deep contextualized word representations. NAACL-HLT 2018

  • (Devlin et al., 2018) Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. BERT: pre-training of deep bidirectional transformers for language understanding. CoRR, abs/1810.04805.

  • (Radford et al., 2018) Alec Radford, Karthik Narasimhan, Tim Salimans, and Ilya Sutskever. 2018. Improving language understanding by generative pre-training.

  • (Liu et al., 2019) Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, Veselin Stoyanov. 2019. RoBERTa: A Robustly Optimized BERT Pretraining Approach. arXiv preprint arXiv:1907.11692.

Licence

LAMA is licensed under the CC-BY-NC 4.0 license. The text of the license can be found here.

Owner
Meta Research
Meta Research
Code repo for "Towards Interpretable Deep Networks for Monocular Depth Estimation" paper.

InterpretableMDE A PyTorch implementation for "Towards Interpretable Deep Networks for Monocular Depth Estimation" paper. arXiv link: https://arxiv.or

Zunzhi You 16 Aug 12, 2022
Predict stock movement with Machine Learning and Deep Learning algorithms

Project Overview Stock market movement prediction using LSTM Deep Neural Networks and machine learning algorithms Software and Library Requirements Th

Naz Delam 46 Sep 13, 2022
pytorch implementation of "Contrastive Multiview Coding", "Momentum Contrast for Unsupervised Visual Representation Learning", and "Unsupervised Feature Learning via Non-Parametric Instance-level Discrimination"

Unofficial implementation: MoCo: Momentum Contrast for Unsupervised Visual Representation Learning (Paper) InsDis: Unsupervised Feature Learning via N

Zhiqiang Shen 16 Nov 04, 2020
Official code for our ICCV paper: "From Continuity to Editability: Inverting GANs with Consecutive Images"

GANInversion_with_ConsecutiveImgs Official code for our ICCV paper: "From Continuity to Editability: Inverting GANs with Consecutive Images" https://a

QingyangXu 38 Dec 07, 2022
AnimationKit: AI Upscaling & Interpolation using Real-ESRGAN+RIFE

ALPHA 2.5: Frostbite Revival (Released 12/23/21) Changelog: [ UI ] Chained design. All steps link to one another! Use the master override toggles to s

87 Nov 16, 2022
Weighing Counts: Sequential Crowd Counting by Reinforcement Learning

LibraNet This repository includes the official implementation of LibraNet for crowd counting, presented in our paper: Weighing Counts: Sequential Crow

Hao Lu 18 Nov 05, 2022
Implementation for the paper: Invertible Denoising Network: A Light Solution for Real Noise Removal (CVPR2021).

Invertible Image Denoising This is the PyTorch implementation of paper: Invertible Denoising Network: A Light Solution for Real Noise Removal (CVPR 20

157 Dec 25, 2022
TensorFlow implementation of ENet

TensorFlow-ENet TensorFlow implementation of ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation. This model was tested on th

Kwotsin 255 Oct 17, 2022
ICSS - Interactive Continual Semantic Segmentation

Presentation This repository contains the code of our paper: Weakly-supervised c

Alteia 9 Jul 23, 2022
[CVPR 2019 Oral] Multi-Channel Attention Selection GAN with Cascaded Semantic Guidance for Cross-View Image Translation

SelectionGAN for Guided Image-to-Image Translation CVPR Paper | Extended Paper | Guided-I2I-Translation-Papers Citation If you use this code for your

Hao Tang 424 Dec 02, 2022
The code release of paper Low-Light Image Enhancement with Normalizing Flow

[AAAI 2022] Low-Light Image Enhancement with Normalizing Flow Paper | Project Page Low-Light Image Enhancement with Normalizing Flow Yufei Wang, Renji

Yufei Wang 176 Jan 06, 2023
Speech Enhancement Generative Adversarial Network Based on Asymmetric AutoEncoder

ASEGAN: Speech Enhancement Generative Adversarial Network Based on Asymmetric AutoEncoder 中文版简介 Readme with English Version 介绍 基于SEGAN模型的改进版本,使用自主设计的非

Nitin 53 Nov 17, 2022
Backend code to use MCPI's python API to make infinite worlds with custom generation

inf-mcpi Backend code to use MCPI's python API to make infinite worlds with custom generation Does not save player-placed blocks! Generation is still

5 Oct 04, 2022
Generating Digital Painting Lighting Effects via RGB-space Geometry (SIGGRAPH2020/TOG2020)

Project PaintingLight PaintingLight is a project conducted by the Style2Paints team, aimed at finding a method to manipulate the illumination in digit

651 Dec 29, 2022
Message Passing on Cell Complexes

CW Networks This repository contains the code used for the papers Weisfeiler and Lehman Go Cellular: CW Networks (Under review) and Weisfeiler and Leh

Twitter Research 108 Jan 05, 2023
Code for “ACE-HGNN: Adaptive Curvature ExplorationHyperbolic Graph Neural Network”

ACE-HGNN: Adaptive Curvature Exploration Hyperbolic Graph Neural Network This repository is the implementation of ACE-HGNN in PyTorch. Environment pyt

9 Nov 28, 2022
A TensorFlow Implementation of "Deep Multi-Scale Video Prediction Beyond Mean Square Error" by Mathieu, Couprie & LeCun.

Adversarial Video Generation This project implements a generative adversarial network to predict future frames of video, as detailed in "Deep Multi-Sc

Matt Cooper 704 Nov 26, 2022
Godot RL Agents is a fully Open Source packages that allows video game creators

Godot RL Agents The Godot RL Agents is a fully Open Source packages that allows video game creators, AI researchers and hobbiest the opportunity to le

Edward Beeching 326 Dec 30, 2022
Implementation of "With a Little Help from my Temporal Context: Multimodal Egocentric Action Recognition, BMVC, 2021" in PyTorch

Multimodal Temporal Context Network (MTCN) This repository implements the model proposed in the paper: Evangelos Kazakos, Jaesung Huh, Arsha Nagrani,

Evangelos Kazakos 13 Nov 24, 2022