Code for the paper "Implicit Representations of Meaning in Neural Language Models"

Overview

Implicit Representations of Meaning in Neural Language Models

Preliminaries

Create and set up a conda environment as follows:

conda create -n state-probes python=3.7
conda activate state-probes
pip install -r requirements.txt

Install the appropriate torch 1.7.0 for your cuda version:

conda install pytorch==1.7.0 cudatoolkit=<cuda_version> -c pytorch

Before running any command below, run

export PYTHONPATH=.
export TOKENIZERS_PARALLELISM=true

Data

The Alchemy data is downloaded from their website.

wget https://nlp.stanford.edu/projects/scone/scone.zip
unzip scone.zip

The synthetic version of alchemy was generated by running:

echo 0 > id #the code requires a file called id with a number in it ...
python alchemy_artificial_generator.py --num_scenarios 3600 --output synth_alchemy_train
python alchemy_artificial_generator.py --num_scenarios 500 --output synth_alchemy_dev
python alchemy_artificial_generator.py --num_scenarios 900 --output synth_alchemy_test

You can also just download our generated data through:

wget http://web.mit.edu/bzl/www/synth_alchemy.tar.gz
tar -xzvf synth_alchemy.tar.gz

The Textworld data is under

wget http://web.mit.edu/bzl/www/tw_data.tar.gz
tar -xzvf tw_data.tar.gz

LM Training

To train a BART or T5 model on Alchemy data

python scripts/train_alchemy.py \
    --arch [t5|bart] [--no_pretrain] \
    [--synthetic] --encode_init_state NL

Saves model checkpoints under sconeModels/*.

To train a BART or T5 model on Textworld data

python scripts/train_textworld.py \
    --arch [t5/bart] [--no_pretrain] \
    --data tw_data/simple_traces --gamefile tw_games/simple_games

Saves model checkpoints nder twModels/*.

Probe Training & Evaluation

Alchemy

The main probe command is as follows:

python scripts/probe_alchemy.py \
    --arch [bart|t5] --lm_save_path <path_to_lm_checkpoint> [--no_pretrain] \
    --encode_init_state NL --nonsynthetic \
    --probe_target single_beaker_final.NL --localizer_type single_beaker_init_full \
    --probe_type linear --probe_agg_method avg \
    --encode_tgt_state NL.[bart|t5] --tgt_agg_method avg \
    --batchsize 128 --eval_batchsize 1024 --lr 1e-4

For evaluation, add --eval_only --probe_save_path <path_to_probe_checkpoint>. This will save model predictions to a .jsonl file under the same directory as the probe checkpoint.

Add --control_input for No LM experiment.

Change --probe_target to single_beaker_init.NL to decode initial state.

For localization experiments, set --localizer_type single_beaker_init_{$i}.offset{$off} for some token i in {article, pos.[R0|R1|R2], beaker.[R0|R1], verb, amount, color, end_punct} and some integer offset off between 0 and 6.

Saves probe checkpoints under probe_models_alchemy/*.

Intervention experiment results follow from running the script:

python scripts/intervention.py \
    --arch [bart|t5] \
    --encode_init_state NL \
    --create_type drain_1 \
    --lm_save_path <path_to_lm_checkpoint>

which creates two contexts and replaces a select few encoded tokens to modify the underlying belief state.

Textworld

Begin by creating the full set of encoded proposition representations

python scripts/get_all_tw_facts.py \
    --data tw_data/simple_traces --gamefile tw_data/simple_games \
    --state_model_arch [bart|t5] \
    --probe_target belief_facts_pair \
    --state_model_path [None|pretrain|<path_to_lm_checkpoint>] \
    --out_file <path_to_prop_encodings>

Run the probe with

python scripts/probe_textworld.py \
    --arch [bart|t5] --data tw_data/simple_traces --gamefile tw_data/simple_games \
    --probe_target final.full_belief_facts_pair --encode_tgt_state NL.[bart|t5] \
    --localizer_type belief_facts_pair_[first|last|all] --probe_type 3linear_classify \
    --probe_agg_method avg --tgt_agg_method avg \
    --lm_save_path <path_to_lm_checkpoint> [--no_pretrain] \
    --ents_to_states_file <path_to_prop_encodings> \
    --eval_batchsize 256 --batchsize 32

For evaluation, add --eval_only --probe_save_path <path_to_probe_checkpoint>. This will save model predictions to a .jsonl file under the same directory as the probe checkpoint.

Add --control_input for No LM experiment.

Change --probe_target to init.full_belief_facts_pair to decode initial state.

For remap experiments, change --probe_target to final.full_belief_facts_pair.control_with_rooms.

For decoding from just one side of propositions, replace any instance of belief_facts_pair (in --probe_target and --localizer_type) with belief_facts_single and rerun both commands (first get the full proposition encodings, then run the probe).

Saves probe checkpoints under probe_models_textworld/*.

Print Metrics

Print full metrics (state EM, entity EM, subdivided by relations vs. propositions, etc.) using scripts/print_metrics.py.

python scripts/print_metrics.py \
    --arch [bart|t5] --domain [alchemy|textworld] \
    --pred_files <path_to_model_predictions_1>,<path_to_model_predictions_2>,<path_to_model_predictions_3>,... \
    [--use_remap_domain --remap_fn <path_to_remap_model_predictions>] \
    [--single_side_probe]
Deep Image Search is an AI-based image search engine that includes deep transfor learning features Extraction and tree-based vectorized search.

Deep Image Search - AI-Based Image Search Engine Deep Image Search is an AI-based image search engine that includes deep transfer learning features Ex

139 Jan 01, 2023
ESL: Event-based Structured Light

ESL: Event-based Structured Light Video (click on the image) This is the code for the 2021 3DV paper ESL: Event-based Structured Light by Manasi Mugli

Robotics and Perception Group 29 Oct 24, 2022
C3DPO - Canonical 3D Pose Networks for Non-rigid Structure From Motion.

C3DPO: Canonical 3D Pose Networks for Non-Rigid Structure From Motion By: David Novotny, Nikhila Ravi, Benjamin Graham, Natalia Neverova, Andrea Vedal

Meta Research 309 Dec 16, 2022
FID calculation with proper image resizing and quantization steps

clean-fid: Fixing Inconsistencies in FID Project | Paper The FID calculation involves many steps that can produce inconsistencies in the final metric.

Gaurav Parmar 606 Jan 06, 2023
Python package provinding tools for artistic interactive applications using AI

Documentation redrawing Python package provinding tools for artistic interactive applications using AI Created by ReDrawing Campinas team for the Open

ReDrawing Campinas 1 Sep 30, 2021
A PyTorch implementation of "Graph Classification Using Structural Attention" (KDD 2018).

GAM ⠀⠀ A PyTorch implementation of Graph Classification Using Structural Attention (KDD 2018). Abstract Graph classification is a problem with practic

Benedek Rozemberczki 259 Dec 05, 2022
Points2Surf: Learning Implicit Surfaces from Point Clouds (ECCV 2020 Spotlight)

Points2Surf: Learning Implicit Surfaces from Point Clouds (ECCV 2020 Spotlight)

Philipp Erler 329 Jan 06, 2023
Using VapourSynth with super resolution models and speeding them up with TensorRT.

VSGAN-tensorrt-docker Using image super resolution models with vapoursynth and speeding them up with TensorRT. Using NVIDIA/Torch-TensorRT combined wi

111 Jan 05, 2023
Contenido del curso Bases de datos del DCC PUC versión 2021-2

IIC2413 - Bases de Datos Tabla de contenidos Equipo Profesores Ayudantes Contenidos Calendario Evaluaciones Resumen de notas Foro Política de integrid

54 Nov 23, 2022
A developer interface for creating Chat AIs for the Chai app.

ChaiPy A developer interface for creating Chat AIs for the Chai app. Usage Local development A quick start guide is available here, with a minimal exa

Chai 28 Dec 28, 2022
Control-Robot-Arm-using-PS4-Controller - A Robotic Arm based on Raspberry Pi and Arduino that controlled by PS4 Controller

Control-Robot-Arm-using-PS4-Controller You can see all details about this Robot

MohammadReza Sharifi 5 Jan 01, 2022
The official implementation of NeurIPS 2021 paper: Finding Optimal Tangent Points for Reducing Distortions of Hard-label Attacks

Introduction This repository includes the source code for "Finding Optimal Tangent Points for Reducing Distortions of Hard-label Attacks", which is pu

machen 11 Nov 27, 2022
Context Axial Reverse Attention Network for Small Medical Objects Segmentation

CaraNet: Context Axial Reverse Attention Network for Small Medical Objects Segmentation This repository contains the implementation of a novel attenti

401 Dec 23, 2022
(AAAI 2021) Progressive One-shot Human Parsing

End-to-end One-shot Human Parsing This is the official repository for our two papers: Progressive One-shot Human Parsing (AAAI 2021) End-to-end One-sh

54 Dec 30, 2022
Two-stage CenterNet

Probabilistic two-stage detection Two-stage object detectors that use class-agnostic one-stage detectors as the proposal network. Probabilistic two-st

Xingyi Zhou 1.1k Jan 03, 2023
Semi-supervised semantic segmentation needs strong, varied perturbations

Semi-supervised semantic segmentation using CutMix and Colour Augmentation Implementations of our papers: Semi-supervised semantic segmentation needs

146 Dec 20, 2022
Implementation of Shape and Electrostatic similarity metric in deepFMPO.

DeepFMPO v3D Code accompanying the paper "On the value of using 3D-shape and electrostatic similarities in deep generative methods". The paper can be

34 Nov 28, 2022
Pytorch implementation for reproducing StackGAN_v2 results in the paper StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks

StackGAN-v2 StackGAN-v1: Tensorflow implementation StackGAN-v1: Pytorch implementation Inception score evaluation Pytorch implementation for reproduci

Han Zhang 809 Dec 16, 2022
Karate Club: An API Oriented Open-source Python Framework for Unsupervised Learning on Graphs (CIKM 2020)

Karate Club is an unsupervised machine learning extension library for NetworkX. Please look at the Documentation, relevant Paper, Promo Video, and Ext

Benedek Rozemberczki 1.8k Jan 07, 2023
Planning from Pixels in Environments with Combinatorially Hard Search Spaces -- NeurIPS 2021

PPGS: Planning from Pixels in Environments with Combinatorially Hard Search Spaces Environment Setup We recommend pipenv for creating and managing vir

Autonomous Learning Group 11 Jun 26, 2022