Code for "Learning Structural Edits via Incremental Tree Transformations" (ICLR'21)

Last update: Dec 23, 2022

Related tags

Deep Learning incremental_tree_edit

Overview

Learning Structural Edits via Incremental Tree Transformations

Code for "Learning Structural Edits via Incremental Tree Transformations" (ICLR'21)

1. Prepare Environment

We recommend using conda to manage the environment:

conda env create -n "structural_edits" -f structural_edits.yml
conda activate structural_edits

Install the punkt tokenizer:

python
>>> import nltk
>>> nltk.download('punkt')
>>> <ctrl-D>

2. Data

Please extract the datasets and vocabulary files by:

cd source_data
tar -xzvf githubedits.tar.gz

All necessary source data has been included as the following:

| --source_data
|       |-- githubedits
|           |-- githubedits.{train|train_20p|dev|test}.jsonl
|           |-- csharp_fixers.jsonl
|           |-- vocab.from_repo.{080910.freq10|edit}.json
|           |-- Syntax.xml
|           |-- configs
|               |-- ...(model config json files)

A sample file containing 20% of the GitHubEdits training data is included as source_data/githubedits/githubedits.train_20p.jsonl for running small experiments.

We have generated and included the vocabulary files as well. To create your own vocabulary, see edit_components/vocab.py.

3. Experiments

See training and test scripts in scripts/githubedits/. Please configure the PYTHONPATH environment variable in line 6.

3.1 Training

For training, uncomment the desired setting in scripts/githubedits/train.sh and run:

bash scripts/githubedits/train.sh source_data/githubedits/configs/CONFIGURATION_FILE

where CONFIGURATION_FILE is the json file of your setting.

Supervised Learning

For example, if you want to train Graph2Edit + Sequence Edit Encoder on GitHubEdits's 20% sample data, please uncomment only line 21-25 in scripts/githubedits/train.sh and run:

bash scripts/githubedits/train.sh source_data/githubedits/configs/graph2iteredit.seq_edit_encoder.20p.json

(Note: when you run the experiment for the first time, you might need to wait for ~15 minutes for data preprocessing.)

Imitation Learning

To further train the model with PostRefine imitation learning, please replace FOLDER_OF_SUPERVISED_PRETRAINED_MODEL with your model dir in source_data/githubedits/configs/graph2iteredit.seq_edit_encoder.20p.postrefine.imitation.json. Uncomment only line 27-31 in scripts/githubedits/train.sh and run:

bash scripts/githubedits/train.sh source_data/githubedits/configs/graph2iteredit.seq_edit_encoder.20p.postrefine.imitation.json

3.2 Test

To test a trained model, first uncomment only the desired setting in scripts/githubedits/test.sh and replace work_dir with your model directory, and then run:

bash scripts/githubedits/test.sh

4. Reference

If you use our code and data, please cite our paper:

@inproceedings{yao2021learning,
    title={Learning Structural Edits via Incremental Tree Transformations},
    author={Ziyu Yao and Frank F. Xu and Pengcheng Yin and Huan Sun and Graham Neubig},
    booktitle={International Conference on Learning Representations},
    year={2021},
    url={https://openreview.net/forum?id=v9hAX77--cZ}
}

Our implementation is adapted from TranX and Graph2Tree. We are grateful to the two work!

@inproceedings{yin18emnlpdemo,
    title = {{TRANX}: A Transition-based Neural Abstract Syntax Parser for Semantic Parsing and Code Generation},
    author = {Pengcheng Yin and Graham Neubig},
    booktitle = {Conference on Empirical Methods in Natural Language Processing (EMNLP) Demo Track},
    year = {2018}
}
@inproceedings{yin2018learning,
    title={Learning to Represent Edits},
    author={Pengcheng Yin and Graham Neubig and Miltiadis Allamanis and Marc Brockschmidt and Alexander L. Gaunt},
    booktitle={International Conference on Learning Representations},
    year={2019},
    url={https://openreview.net/forum?id=BJl6AjC5F7},
}

Code for "Learning Structural Edits via Incremental Tree Transformations" (ICLR'21)

Related tags

Overview

Learning Structural Edits via Incremental Tree Transformations

1. Prepare Environment

2. Data

3. Experiments

3.1 Training

Supervised Learning

Imitation Learning

3.2 Test

4. Reference

Owner

NeuLab

MEND: Model Editing Networks using Gradient Decomposition

Matlab Python Heuristic Battery Opt - SMOP conversion and manual conversion

SAT: 2D Semantics Assisted Training for 3D Visual Grounding, ICCV 2021 (Oral)

A list of Machine Learning Art Colabs

An Implicit Function Theorem (IFT) optimizer for bi-level optimizations

Taichi Course Homework Template

PyTorch Implementation of Exploring Explicit Domain Supervision for Latent Space Disentanglement in Unpaired Image-to-Image Translation.

A modern pure-Python library for reading PDF files

Volumetric Correspondence Networks for Optical Flow, NeurIPS 2019.

Plenoxels: Radiance Fields without Neural Networks

Goal of the project : Detecting Temporal Boundaries in Sign Language videos

A simple baseline for 3d human pose estimation in tensorflow. Presented at ICCV 17.

Code for the paper: Fighting Fake News: Image Splice Detection via Learned Self-Consistency

TorchMetrics is a collection of 25+ PyTorch metrics implementations and an easy-to-use API to create custom metrics.

Implementation for paper: Self-Regulation for Semantic Segmentation

SMPL-X: A new joint 3D model of the human body, face and hands together

A real-time speech emotion recognition application using Scikit-learn and gradio

Learning with Subset Stacking

The object detection pipeline is based on Ultralytics YOLOv5

Artificial Neural network regression model to predict the energy output in a combined cycle power plant.