Multi-Stage Episodic Control for Strategic Exploration in Text Games

Last update: May 24, 2022

Overview

XTX: eXploit - Then - eXplore

Requirements

First clone this repo using git clone https://github.com/princeton-nlp/XTX.git

Please create two conda environments as follows:

conda env create -f yml_envs/jericho-wt.yml
a. conda activate jericho-wt
b. pip install git+https://github.com/jens321/[email protected]
conda env create -f yml_envs/jericho-no-wt.yml

The first set of commands will create a conda environment called jericho-wt which has added actions to the game grammar for specific games (see games with * in the paper). The second command will create another conda environment called jericho-no-wt which installs an unmodified version of the Jericho library.

Training

All code can be run from the root folder of this project. Please follow the commands below for each specific model:

XTX: sh scripts/run_xtx.sh
XTX (no-mix): sh scripts/run_xtx_no_mix.sh
XTX (uniform): sh scrtips/run_xtx_uniform.sh
XTX ($\lambda$ = 0, 0.5, or 1): sh scripts/run_xtx_ablation.sh
INV DY: sh scripts/run_inv_dy.sh
DRRN: sh scripts/run_drrn.sh

Notes

You can use analysis/sample_env.py for quickly playing around with a sample Jericho environment. Run it using python3 -m analysis.sample_env.
You can use analysis/augment_wt.py for generating the missing action candidates that can be added to the game grammar (games with * in the paper). Run it using python3 -m analysis.augment_wt.
Note that all models should finish within a day or two given 1 gpu and 8 cpus, except for games where Jericho's valid action handicap is slow (e.g. Library, Dragon). Since Jericho's valid action handicap heavily relies on parallelization, increasing the number of cpus also results in good speedups (e.g. 8 -> 16).

Acknowledgements

We used Weights & Biases for experiment tracking and visualizations to develop insights for this paper.

Some of the code borrows from the TDQN repo.

For any questions please contact Jens Tuyls ([email protected]).

Multi-Stage Episodic Control for Strategic Exploration in Text Games

Related tags

Overview

XTX: eXploit - Then - eXplore

Requirements

Training

Notes

Acknowledgements

Owner

Princeton Natural Language Processing

Testing and Estimation of structural breaks in Stata

Contrastive Loss Gradient Attack (CLGA)

Weakly Supervised Learning of Instance Segmentation with Inter-pixel Relations, CVPR 2019 (Oral)

Context-Aware Image Matting for Simultaneous Foreground and Alpha Estimation

Gated-Shape CNN for Semantic Segmentation (ICCV 2019)

Fast and robust certifiable relative pose estimation

A deep learning network built with TensorFlow and Keras to classify gender and estimate age.

Computer Vision Paper Reviews with Key Summary of paper, End to End Code Practice and Jupyter Notebook converted papers

Awesome Weak-Shot Learning

CL-Gym: Full-Featured PyTorch Library for Continual Learning

Official Pytorch implementation of 'GOCor: Bringing Globally Optimized Correspondence Volumes into Your Neural Network' (NeurIPS 2020)

Learning the Beauty in Songs: Neural Singing Voice Beautifier; ACL 2022 (Main conference); Official code

AI drive app that can help user become beautiful.

Robust Instance Segmentation through Reasoning about Multi-Object Occlusion [CVPR 2021]

FLAVR is a fast, flow-free frame interpolation method capable of single shot multi-frame prediction

This is the open-source reference implementation of the SIGGRAPH 2021 paper Intersection-free Rigid Body Dynamics.

Supplementary materials for ISMIR 2021 LBD paper "Evaluation of Latent Space Disentanglement in the Presence of Interdependent Attributes"

An efficient PyTorch library for Global Wheat Detection using YOLOv5. The project is based on this Kaggle competition Global Wheat Detection (2021).

CLASP - Contrastive Language-Aminoacid Sequence Pretraining

Pytorch code for "DPFM: Deep Partial Functional Maps" - 3DV 2021 (Oral)