Train an RL agent to execute natural language instructions in a 3D Environment (PyTorch)

Overview

Gated-Attention Architectures for Task-Oriented Language Grounding

This is a PyTorch implementation of the AAAI-18 paper:

Gated-Attention Architectures for Task-Oriented Language Grounding
Devendra Singh Chaplot, Kanthashree Mysore Sathyendra, Rama Kumar Pasumarthi, Dheeraj Rajagopal, Ruslan Salakhutdinov
Carnegie Mellon University

Project Website: https://sites.google.com/view/gated-attention

example

This repository contains:

  • Code for training an A3C-LSTM agent using Gated-Attention
  • Code for Doom-based language grounding environment

Dependencies

(We recommend using Anaconda)

Usage

Using the Environment

For running a random agent:

python env_test.py

To play in the environment:

python env_test.py --interactive 1

To change the difficulty of the environment (easy/medium/hard):

python env_test.py -d easy

Training Gated-Attention A3C-LSTM agent

For training a A3C-LSTM agent with 32 threads:

python a3c_main.py --num-processes 32 --evaluate 0

The code will save the best model at ./saved/model_best.

To the test the pre-trained model for Multitask Generalization:

python a3c_main.py --evaluate 1 --load saved/pretrained_model

To the test the pre-trained model for Zero-shot Task Generalization:

python a3c_main.py --evaluate 2 --load saved/pretrained_model

To the visualize the model while testing add '--visualize 1':

python a3c_main.py --evaluate 2 --load saved/pretrained_model --visualize 1

To test the trained model, use --load saved/model_best in the above commands.

All arguments for a3c_main.py:

  -h, --help            show this help message and exit
  -l MAX_EPISODE_LENGTH, --max-episode-length MAX_EPISODE_LENGTH
                        maximum length of an episode (default: 30)
  -d DIFFICULTY, --difficulty DIFFICULTY
                        Difficulty of the environment, "easy", "medium" or
                        "hard" (default: hard)
  --living-reward LIVING_REWARD
                        Default reward at each time step (default: 0, change
                        to -0.005 to encourage shorter paths)
  --frame-width FRAME_WIDTH
                        Frame width (default: 300)
  --frame-height FRAME_HEIGHT
                        Frame height (default: 168)
  -v VISUALIZE, --visualize VISUALIZE
                        Visualize the envrionment (default: 0, use 0 for
                        faster training)
  --sleep SLEEP         Sleep between frames for better visualization
                        (default: 0)
  --scenario-path SCENARIO_PATH
                        Doom scenario file to load (default: maps/room.wad)
  --interactive INTERACTIVE
                        Interactive mode enables human to play (default: 0)
  --all-instr-file ALL_INSTR_FILE
                        All instructions file (default:
                        data/instructions_all.json)
  --train-instr-file TRAIN_INSTR_FILE
                        Train instructions file (default:
                        data/instructions_train.json)
  --test-instr-file TEST_INSTR_FILE
                        Test instructions file (default:
                        data/instructions_test.json)
  --object-size-file OBJECT_SIZE_FILE
                        Object size file (default: data/object_sizes.txt)
  --lr LR               learning rate (default: 0.001)
  --gamma G             discount factor for rewards (default: 0.99)
  --tau T               parameter for GAE (default: 1.00)
  --seed S              random seed (default: 1)
  -n N, --num-processes N
                        how many training processes to use (default: 4)
  --num-steps NS        number of forward steps in A3C (default: 20)
  --load LOAD           model path to load, 0 to not reload (default: 0)
  -e EVALUATE, --evaluate EVALUATE
                        0:Train, 1:Evaluate MultiTask Generalization
                        2:Evaluate Zero-shot Generalization (default: 0)
  --dump-location DUMP_LOCATION
                        path to dump models and log (default: ./saved/)

Demostration videos:

Multitask Generalization video: https://www.youtube.com/watch?v=YJG8fwkv7gA

Zero-shot Task Generalization video: https://www.youtube.com/watch?v=JziCKsLrudE

Different stages of training: https://www.youtube.com/watch?v=o_G6was03N0

Cite as

Chaplot, D.S., Sathyendra, K.M., Pasumarthi, R.K., Rajagopal, D. and Salakhutdinov, R., 2017. Gated-Attention Architectures for Task-Oriented Language Grounding. arXiv preprint arXiv:1706.07230. (PDF)

Bibtex:

@article{chaplot2017gated,
  title={Gated-Attention Architectures for Task-Oriented Language Grounding},
  author={Chaplot, Devendra Singh and Sathyendra, Kanthashree Mysore and Pasumarthi, Rama Kumar and Rajagopal, Dheeraj and Salakhutdinov, Ruslan},
  journal={arXiv preprint arXiv:1706.07230},
  year={2017}
}

Acknowledgements

This repository uses ViZDoom API (https://github.com/mwydmuch/ViZDoom) and parts of the code from the API. The implementation of A3C is borrowed from https://github.com/ikostrikov/pytorch-a3c. The poisson-disc code is borrowed from https://github.com/IHautaI/poisson-disc.

Owner
Devendra Chaplot
Ph.D. student in Machine Learning Dept., School of Computer Science, CMU.
Devendra Chaplot
FinEAS: Financial Embedding Analysis of Sentiment 📈

FinEAS: Financial Embedding Analysis of Sentiment 📈 (SentenceBERT for Financial News Sentiment Regression) This repository contains the code for gene

LHF Labs 31 Dec 13, 2022
A Tensorflow implementation of the Text Conditioned Auxiliary Classifier Generative Adversarial Network for Generating Images from text descriptions

A Tensorflow implementation of the Text Conditioned Auxiliary Classifier Generative Adversarial Network for Generating Images from text descriptions

Ayushman Dash 93 Aug 04, 2022
Hough Transform and Hough Line Transform Using OpenCV

Hough transform is a feature extraction method for detecting simple shapes such as circles, lines, etc in an image. Hough Transform and Hough Line Transform is implemented in OpenCV with two methods;

Happy N. Monday 3 Feb 15, 2022
ESP32 python application to read data from a Tiltâ„¢ Hydrometer for homebrewing

TitlESP32 ESP32 MicroPython application to read and log data from a Tiltâ„¢ Hydrometer. Requirements A board with an ESP32 chip USB cable - USB A / micr

IoBeer 5 Dec 01, 2022
Unofficial implementation of MUSIQ (Multi-Scale Image Quality Transformer)

MUSIQ: Multi-Scale Image Quality Transformer Unofficial pytorch implementation of the paper "MUSIQ: Multi-Scale Image Quality Transformer" (paper link

41 Jan 02, 2023
nextPARS, a novel Illumina-based implementation of in-vitro parallel probing of RNA structures.

nextPARS, a novel Illumina-based implementation of in-vitro parallel probing of RNA structures. Here you will find the scripts necessary to produce th

Jesse Willis 0 Jan 20, 2022
An Empirical Investigation of Model-to-Model Distribution Shifts in Trained Convolutional Filters

CNN-Filter-DB An Empirical Investigation of Model-to-Model Distribution Shifts in Trained Convolutional Filters Paul Gavrikov, Janis Keuper Paper: htt

Paul Gavrikov 18 Dec 30, 2022
A chemical analysis of lipophilicities & molecule drawings including ML

A chemical analysis of lipophilicity & molecule drawings including a bit of ML analysis. This is a simple project that includes two Jupyter files (one

Aurimas A. NausÄ—das 7 Nov 22, 2022
Hamiltonian Dynamics with Non-Newtonian Momentum for Rapid Sampling

Hamiltonian Dynamics with Non-Newtonian Momentum for Rapid Sampling Code for the paper: Greg Ver Steeg and Aram Galstyan. "Hamiltonian Dynamics with N

Greg Ver Steeg 25 Mar 14, 2022
[CVPR 2021] "Multimodal Motion Prediction with Stacked Transformers": official code implementation and project page.

mmTransformer Introduction This repo is official implementation for mmTransformer in pytorch. Currently, the core code of mmTransformer is implemented

DeciForce: Crossroads of Machine Perception and Autonomy 232 Dec 31, 2022
A generalist algorithm for cell and nucleus segmentation.

Cellpose | A generalist algorithm for cell and nucleus segmentation. Cellpose was written by Carsen Stringer and Marius Pachitariu. To learn about Cel

MouseLand 733 Dec 29, 2022
Neural network chess engine trained on Gary Kasparov's games.

Neural Chess It's not the best chess engine, but it is a chess engine. Proof of concept neural network chess engine (feed-forward multi-layer perceptr

3 Jun 22, 2022
Official repository of ICCV21 paper "Viewpoint Invariant Dense Matching for Visual Geolocalization"

Viewpoint Invariant Dense Matching for Visual Geolocalization: PyTorch implementation This is the implementation of the ICCV21 paper: G Berton, C. Mas

Gabriele Berton 44 Jan 03, 2023
An elaborate and exhaustive paper list for Named Entity Recognition (NER)

Named-Entity-Recognition-NER-Papers by Pengfei Liu, Jinlan Fu and other contributors. An elaborate and exhaustive paper list for Named Entity Recognit

Pengfei Liu 388 Dec 18, 2022
Discretized Integrated Gradients for Explaining Language Models (EMNLP 2021)

Discretized Integrated Gradients for Explaining Language Models (EMNLP 2021) Overview of paths used in DIG and IG. w is the word being attributed. The

INK Lab @ USC 17 Oct 27, 2022
Unofficial implementation of the Involution operation from CVPR 2021

involution_pytorch Unofficial PyTorch implementation of "Involution: Inverting the Inherence of Convolution for Visual Recognition" by Li et al. prese

Rishabh Anand 46 Dec 07, 2022
Python Multi-Agent Reinforcement Learning framework

- Please pay attention to the version of SC2 you are using for your experiments. - Performance is *not* always comparable between versions. - The re

whirl 1.3k Jan 05, 2023
Codes for "Solving Long-tailed Recognition with Deep Realistic Taxonomic Classifier"

Deep-RTC [project page] This repository contains the source code accompanying our ECCV 2020 paper. Solving Long-tailed Recognition with Deep Realistic

Gina Wu 16 May 26, 2022