Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents

Overview

Open in Colab

Language Models as Zero-Shot Planners:
Extracting Actionable Knowledge for Embodied Agents

[Project Page] [Paper] [Video]

Wenlong Huang1, Pieter Abbeel1, Deepak Pathak*2, Igor Mordatch*3 (*equal advising)

1University of California, Berkeley, 2Carnegie Mellon University, 3Google Brain

This is the official demo code for our Language Models as Zero-Shot Planners paper. The code demonstrates how Large Language Models, such as GPT-3 and Codex, can generate action plans for complex human activities (e.g. "make breakfast"), even without any further training. The code can be used with any available language models from OpenAI API and Huggingface Transformers with a common interface.

If you find this work useful in your research, please cite using the following BibTeX:

@article{huang2022language,
      title={Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents},
      author={Huang, Wenlong and Abbeel, Pieter and Pathak, Deepak and Mordatch, Igor},
      journal={arXiv preprint arXiv:2201.07207},
      year={2022}
    }

Local Setup or Open in Colab

Requirements

  • Python=3.6.13
  • CUDA=11.3

Setup Instructions

git clone https://github.com/huangwl18/language-planner.git
cd language-planner/
conda create --name language-planner-env python=3.6.13
conda activate language-planner-env
pip install --upgrade pip
pip install -r requirements.txt

Running Code

See demo.ipynb (or Open in Colab) for a complete walk-through of our method. Feel free to experiment with any household tasks that you come up with (or any tasks beyond household domain if you provide necessary actions in available_actions.json)!

Note:

  • It is observed that best results can be obtained with larger language models. If you cannot run Huggingface Transformers models locally or on Google Colab due to memory constraint, it is recommended to register an OpenAI API account and use GPT-3 or Codex (As of 01/2022, $18 free credits are awarded to new accounts and Codex series are free after admitted from the waitlist).
  • Due to language models' high sensitivity to sampling hyperparameters, you may need to tune sampling hyperparameters for different models to obtain the best results.
  • The code uses the list of available actions supported in VirtualHome 1.0's Evolving Graph Simulator. The available actions are stored in available_actions.json. The actions should support a large variety of household tasks. However, you may modify or replace this file if you're interested in a different set of actions or a different domain of tasks (beyond household domain).
  • A subset of the manually-annotated examples originally collected by the VirtualHome paper is used as available examples in the prompt. They are transformed to natural language format and stored in available_examples.json. Feel free to change this file for a different set of available examples.
Owner
Wenlong Huang
Undergraduate Student @ UC Berkeley
Wenlong Huang
The Classical Language Toolkit

Notice: This Git branch (dev) contains the CLTK's upcoming major release (v. 1.0.0). See https://github.com/cltk/cltk/tree/master and https://docs.clt

Classical Language Toolkit 754 Jan 09, 2023
Code for the paper "VisualBERT: A Simple and Performant Baseline for Vision and Language"

This repository contains code for the following two papers: VisualBERT: A Simple and Performant Baseline for Vision and Language (arxiv) with a short

Natural Language Processing @UCLA 464 Jan 04, 2023
To create a deep learning model which can explain the content of an image in the form of speech through caption generation with attention mechanism on Flickr8K dataset.

To create a deep learning model which can explain the content of an image in the form of speech through caption generation with attention mechanism on Flickr8K dataset.

Ragesh Hajela 0 Feb 08, 2022
Text vectorization tool to outperform TFIDF for classification tasks

WHAT: Supervised text vectorization tool Textvec is a text vectorization tool, with the aim to implement all the "classic" text vectorization NLP meth

186 Dec 29, 2022
APEACH: Attacking Pejorative Expressions with Analysis on Crowd-generated Hate Speech Evaluation Datasets

APEACH - Korean Hate Speech Evaluation Datasets APEACH is the first crowd-generated Korean evaluation dataset for hate speech detection. Sentences of

Kevin-Yang 70 Dec 06, 2022
Semantic search for quotes.

squote A semantic search engine that takes some input text and returns some (questionably) relevant (questionably) famous quotes. Built with: bert-as-

cjwallace 11 Jun 25, 2022
A python script to prefab your scripts/text files, and re create them with ease and not have to open your browser to copy code or write code yourself

Scriptfab - What is it? A python script to prefab your scripts/text files, and re create them with ease and not have to open your browser to copy code

DevNugget 3 Jul 28, 2021
A relatively simple python program to generate one of those reddit text to speech videos dominating youtube.

Reddit text to speech generator A basic reddit tts video generator Current functionality Generate videos for subs based on comments,(askreddit) so rea

Aadvik 17 Dec 19, 2022
A simple tool to update bib entries with their official information (e.g., DBLP or the ACL anthology).

Rebiber: A tool for normalizing bibtex with official info. We often cite papers using their arXiv versions without noting that they are already PUBLIS

(Bill) Yuchen Lin 2k Jan 01, 2023
Simple and efficient RevNet-Library with DeepSpeed support

RevLib Simple and efficient RevNet-Library with DeepSpeed support Features Half the constant memory usage and faster than RevNet libraries Less memory

Lucas Nestler 112 Dec 05, 2022
Text editor on python tkinter to convert english text to other languages with the help of ployglot.

Transliterator Text Editor This is a simple transliteration program which is used to convert english word to phonetically matching word in another lan

Merin Rose Tom 1 Jan 16, 2022
ETM - R package for Topic Modelling in Embedding Spaces

ETM - R package for Topic Modelling in Embedding Spaces This repository contains an R package called topicmodels.etm which is an implementation of ETM

bnosac 37 Nov 06, 2022
This repository has a implementations of data augmentation for NLP for Japanese.

daaja This repository has a implementations of data augmentation for NLP for Japanese: EDA: Easy Data Augmentation Techniques for Boosting Performance

Koga Kobayashi 60 Nov 11, 2022
Model for recasing and repunctuating ASR transcripts

Recasing and punctuation model based on Bert Benoit Favre 2021 This system converts a sequence of lowercase tokens without punctuation to a sequence o

Benoit Favre 88 Dec 29, 2022
Shared code for training sentence embeddings with Flax / JAX

flax-sentence-embeddings This repository will be used to share code for the Flax / JAX community event to train sentence embeddings on 1B+ training pa

Nils Reimers 23 Dec 30, 2022
Repository for fine-tuning Transformers 🤗 based seq2seq speech models in JAX/Flax.

Seq2Seq Speech in JAX A JAX/Flax repository for combining a pre-trained speech encoder model (e.g. Wav2Vec2, HuBERT, WavLM) with a pre-trained text de

Sanchit Gandhi 21 Dec 14, 2022
EMNLP 2021 paper "Pre-train or Annotate? Domain Adaptation with a Constrained Budget".

Pre-train or Annotate? Domain Adaptation with a Constrained Budget This repo contains code and data associated with EMNLP 2021 paper "Pre-train or Ann

Fan Bai 8 Dec 17, 2021
pyMorfologik MorfologikpyMorfologik - Python binding for Morfologik.

Python binding for Morfologik Morfologik is Polish morphological analyzer. For more information see http://github.com/morfologik/morfologik-stemming/

Damian Mirecki 18 Dec 29, 2021
Simple python code to fix your combo list by removing any text after a separator or removing duplicate combos

Combo List Fixer A simple python code to fix your combo list by removing any text after a separator or removing duplicate combos Removing any text aft

Hamidreza Dehghan 3 Dec 05, 2022