MEND: Model Editing Networks using Gradient Decomposition

Setup

Environment

This codebase uses Python 3.7.9. Other versions may work as well.

Create a virtualenv (pyenv can help with this) and install the dependencies:

$ python -m venv env
$ source env/bin/activate
(env) $ pip install -r requirements.txt

Data

You can download the data needed for this project from this Google Drive link. Unzip each sub-directory into mend/data and you should be good to go.

Running the code

Run MEND training/evaluation for distilGPT-2 on the wikitext editing problem with:

(env) $ python -m run +alg=mend +experiment=gen +model=distilgpt2

Other valid algs include efk (KnowledgeEditor) and enn (Editable Neural Networks). Other valid experiments include fc (FEVER fact checking) and qa (zsRE question answering). Splits and rephrases for both come from De Cao et al. Check config/model for the available editable models. Note that not every model works with every experiment: GPT-style models only work with gen, seq2seq models only work with qa, and BERT only works with fc.
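The model/experiment pairing above can be written down as a small lookup. This is only a sketch: the helper and the family labels ("gpt", "seq2seq", "bert") are hypothetical names chosen for illustration and are not part of this codebase or its configs.

```python
# Hypothetical helper, NOT part of the MEND codebase. It only encodes the
# pairing rule stated in the README; the family labels are illustrative.
COMPATIBLE_EXPERIMENT = {
    "gpt": "gen",      # GPT-style models only work with gen
    "seq2seq": "qa",   # seq2seq models only work with qa
    "bert": "fc",      # BERT only works with fc
}


def check_pairing(model_family, experiment):
    """Return True if the model family can be used with the experiment."""
    return COMPATIBLE_EXPERIMENT.get(model_family) == experiment


print(check_pairing("gpt", "gen"))   # a supported pairing
print(check_pairing("bert", "qa"))   # an unsupported pairing
```

Running a check like this before launching a job is one way to catch an unsupported combination early, instead of finding out partway through training.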

Also note that in the paper, we sample locality data from different datasets depending on the model. By default, training uses Natural Questions data (not zsRE data) for computing drawdown in the qa experiment and OpenWebText data (not wikitext data) in the gen experiment. For models such as the distilgpt2 model we use (which was fine-tuned on wikitext) or the BART-base model, this behavior should be disabled with data.wiki_webtext=False or data.zsre_nq=False, respectively.
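As an illustration of how the override is passed, the sketch below assembles the run command from this README and appends data.wiki_webtext=False for the distilgpt2 case. The build_run_command helper is a hypothetical convenience, not something this repository provides. The flag names and values are taken from the text above.

```python
# Hypothetical helper, NOT part of the MEND codebase: it only assembles the
# command line shown in this README and appends key=value overrides.
def build_run_command(alg, experiment, model, overrides=None):
    """Return the argument list for a run, with optional config overrides."""
    args = [
        "python", "-m", "run",
        "+alg={}".format(alg),
        "+experiment={}".format(experiment),
        "+model={}".format(model),
    ]
    for key, value in (overrides or {}).items():
        args.append("{}={}".format(key, value))
    return args


# distilgpt2 was fine-tuned on wikitext, so the README says to disable
# the default OpenWebText locality sampling with data.wiki_webtext=False.
cmd = build_run_command("mend", "gen", "distilgpt2", {"data.wiki_webtext": False})
print(" ".join(cmd))
```

The printed line can be pasted into the activated virtualenv shell. For the BART-base model, the same pattern would apply with data.zsre_nq=False and the qa experiment, using whichever model name config/model defines.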

Citing the paper

If this code or paper was useful, please consider using the following citation:

@article{mitchell2021fast,
    title={Fast Model Editing at Scale},
    author={Mitchell, Eric and Lin, Charles and Bosselut, Antoine and Finn, Chelsea and Manning, Chris},
    year={2021}
}

Owner

Eric Mitchell
