Train robotic agents to learn pick and place with deep learning for vision-based manipulation in PyBullet.

Overview

Ravens - Transporter Networks

Ravens is a collection of simulated tasks in PyBullet for learning vision-based robotic manipulation, with emphasis on pick and place. It features a Gym-like API with 10 tabletop rearrangement tasks, each with (i) a scripted oracle that provides expert demonstrations (for imitation learning), and (ii) reward functions that provide partial credit (for reinforcement learning).


(a) block-insertion: pick up the L-shaped red block and place it into the L-shaped fixture.
(b) place-red-in-green: pick up the red blocks and place them into the green bowls amidst other objects.
(c) towers-of-hanoi: sequentially move disks from one tower to another—only smaller disks can be on top of larger ones.
(d) align-box-corner: pick up the randomly sized box and align one of its corners to the L-shaped marker on the tabletop.
(e) stack-block-pyramid: sequentially stack 6 blocks into a pyramid of 3-2-1 with rainbow colored ordering.
(f) palletizing-boxes: pick up homogeneous fixed-sized boxes and stack them in transposed layers on the pallet.
(g) assembling-kits: pick up different objects and arrange them on a board marked with corresponding silhouettes.
(h) packing-boxes: pick up randomly sized boxes and place them tightly into a container.
(i) manipulating-rope: rearrange a deformable rope such that it connects the two endpoints of a 3-sided square.
(j) sweeping-piles: push piles of small objects into a target goal zone marked on the tabletop.

Some tasks require generalizing to unseen objects (d,g,h), or multi-step sequencing with closed-loop feedback (c,e,f,h,i,j).

Team: this repository is developed and maintained by Andy Zeng, Pete Florence, Daniel Seita, Jonathan Tompson, and Ayzaan Wahid. This is the reference repository for the paper:

Transporter Networks: Rearranging the Visual World for Robotic Manipulation

Project Website  •  PDF  •  Conference on Robot Learning (CoRL) 2020

Andy Zeng, Pete Florence, Jonathan Tompson, Stefan Welker, Jonathan Chien, Maria Attarian, Travis Armstrong,
Ivan Krasin, Dan Duong, Vikas Sindhwani, Johnny Lee

Abstract. Robotic manipulation can be formulated as inducing a sequence of spatial displacements: where the space being moved can encompass an object, part of an object, or end effector. In this work, we propose the Transporter Network, a simple model architecture that rearranges deep features to infer spatial displacements from visual input—which can parameterize robot actions. It makes no assumptions of objectness (e.g. canonical poses, models, or keypoints), it exploits spatial symmetries, and is orders of magnitude more sample efficient than our benchmarked alternatives in learning vision-based manipulation tasks: from stacking a pyramid of blocks, to assembling kits with unseen objects; from manipulating deformable ropes, to pushing piles of small objects with closed-loop feedback. Our method can represent complex multi-modal policy distributions and generalizes to multi-step sequential tasks, as well as 6DoF pick-and-place. Experiments on 10 simulated tasks show that it learns faster and generalizes better than a variety of end-to-end baselines, including policies that use ground-truth object poses. We validate our methods with hardware in the real world.

Installation

Step 1. Recommended: install Miniconda with Python 3.7.

curl -O https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh -b -u
echo $'\nexport PATH=~/miniconda3/bin:"${PATH}"\n' >> ~/.profile  # Add Conda to PATH.
source ~/.profile
conda init

Step 2. Create and activate Conda environment, then install GCC and Python packages.

cd ~/ravens
conda create --name ravens python=3.7 -y
conda activate ravens
sudo apt-get update
sudo apt-get -y install gcc libgl1-mesa-dev
pip install -r requirements.txt
python setup.py install --user

Step 3. Recommended: install GPU acceleration with NVIDIA CUDA 10.1 and cuDNN 7.6.5 for Tensorflow.

./oss_scipts/install_cuda.sh  #  For Ubuntu 16.04 and 18.04.
conda install cudatoolkit==10.1.243 -y
conda install cudnn==7.6.5 -y

Alternative: Pure pip

As an example for Ubuntu 18.04:

./oss_scipts/install_cuda.sh  #  For Ubuntu 16.04 and 18.04.
sudo apt install gcc libgl1-mesa-dev python3.8-venv
python3.8 -m venv ./venv
source ./venv/bin/activate
pip install -U pip
pip install scikit-build
pip install -r ./requirements.txt
export PYTHONPATH=${PWD}

Getting Started

Step 1. Generate training and testing data (saved locally). Note: remove --disp for headless mode.

python ravens/demos.py --assets_root=./ravens/environments/assets/ --disp=True --task=block-insertion --mode=train --n=10
python ravens/demos.py --assets_root=./ravens/environments/assets/ --disp=True --task=block-insertion --mode=test --n=100

To run with shared memory, open a separate terminal window and run python3 -m pybullet_utils.runServer. Then add --shared_memory flag to the command above.

Step 2. Train a model e.g., Transporter Networks model. Model checkpoints are saved to the checkpoints directory. Optional: you may exit training prematurely after 1000 iterations to skip to the next step.

python ravens/train.py --task=block-insertion --agent=transporter --n_demos=10

Step 3. Evaluate a Transporter Networks agent using the model trained for 1000 iterations. Results are saved locally into .pkl files.

python ravens/test.py --assets_root=./ravens/environments/assets/ --disp=True --task=block-insertion --agent=transporter --n_demos=10 --n_steps=1000

Step 4. Plot and print results.

python ravens/plot.py --disp=True --task=block-insertion --agent=transporter --n_demos=10

Optional. Track training and validation losses with Tensorboard.

python -m tensorboard.main --logdir=logs  # Open the browser to where it tells you to.

Datasets and Pre-Trained Models

Download our generated train and test datasets and pre-trained models.

wget https://storage.googleapis.com/ravens-assets/checkpoints.zip
wget https://storage.googleapis.com/ravens-assets/block-insertion.zip
wget https://storage.googleapis.com/ravens-assets/place-red-in-green.zip
wget https://storage.googleapis.com/ravens-assets/towers-of-hanoi.zip
wget https://storage.googleapis.com/ravens-assets/align-box-corner.zip
wget https://storage.googleapis.com/ravens-assets/stack-block-pyramid.zip
wget https://storage.googleapis.com/ravens-assets/palletizing-boxes.zip
wget https://storage.googleapis.com/ravens-assets/assembling-kits.zip
wget https://storage.googleapis.com/ravens-assets/packing-boxes.zip
wget https://storage.googleapis.com/ravens-assets/manipulating-rope.zip
wget https://storage.googleapis.com/ravens-assets/sweeping-piles.zip

The MDP formulation for each task uses transitions with the following structure:

Observations: raw RGB-D images and camera parameters (pose and intrinsics).

Actions: a primitive function (to be called by the robot) and parameters.

Rewards: total sum of rewards for a successful episode should be =1.

Info: 6D poses, sizes, and colors of objects.

functorch is a prototype of JAX-like composable function transforms for PyTorch.

functorch is a prototype of JAX-like composable function transforms for PyTorch.

Facebook Research 1.2k Jan 09, 2023
PyTorch implementation of Spiking Neural Networks trained on surrogate gradient & BPTT using snntorch.

snn-localization repo PyTorch implementation of Spiking Neural Networks trained on surrogate gradient & BPTT using snntorch. Install Dependencies Orig

Sami BARCHID 1 Jan 06, 2022
MINOS: Multimodal Indoor Simulator

MINOS Simulator MINOS is a simulator designed to support the development of multisensory models for goal-directed navigation in complex indoor environ

194 Dec 27, 2022
TyXe: Pyro-based BNNs for Pytorch users

TyXe: Pyro-based BNNs for Pytorch users TyXe aims to simplify the process of turning Pytorch neural networks into Bayesian neural networks by leveragi

87 Jan 03, 2023
Recursive Bayesian Networks

Recursive Bayesian Networks This repository contains the code to reproduce the results from the NeurIPS 2021 paper Lieck R, Rohrmeier M (2021) Recursi

Robert Lieck 11 Oct 18, 2022
Transformer in Vision

Transformer-in-Vision Recent Transformer-based CV and related works. Welcome to comment/contribute! Keep updated. Resource SCENIC: A JAX Library for C

Yong-Lu Li 1.1k Dec 30, 2022
An Object Oriented Programming (OOP) interface for Ontology Web language (OWL) ontologies.

Enabling a developer to use Ontology Web Language (OWL) along with its reasoning capabilities in an Object Oriented Programming (OOP) paradigm, by pro

TheEngineRoom-UniGe 7 Sep 23, 2022
Self-describing JSON-RPC services made easy

ReflectRPC Self-describing JSON-RPC services made easy Contents What is ReflectRPC? Installation Features Datatypes Custom Datatypes Returning Errors

Andreas Heck 31 Jul 16, 2022
A concise but complete implementation of CLIP with various experimental improvements from recent papers

x-clip (wip) A concise but complete implementation of CLIP with various experimental improvements from recent papers Install $ pip install x-clip Usag

Phil Wang 515 Dec 26, 2022
Implementation of Monocular Direct Sparse Localization in a Prior 3D Surfel Map (DSL)

DSL Project page: https://sites.google.com/view/dsl-ram-lab/ Monocular Direct Sparse Localization in a Prior 3D Surfel Map Authors: Haoyang Ye, Huaiya

Haoyang Ye 93 Nov 30, 2022
A study project using the AA-RMVSNet to reconstruct buildings from multiple images

3d-building-reconstruction This is part of a study project using the AA-RMVSNet to reconstruct buildings from multiple images. Introduction It is exci

17 Oct 17, 2022
ReGAN: Sequence GAN using RE[INFORCE|LAX|BAR] based PG estimators

Sequence Generation with GANs trained by Gradient Estimation Requirements: PyTorch v0.3 Python 3.6 CUDA 9.1 (For GPU) Origin The idea is from paper Se

40 Nov 03, 2022
Using modified BiSeNet for face parsing in PyTorch

face-parsing.PyTorch Contents Training Demo References Training Prepare training data: -- download CelebAMask-HQ dataset -- change file path in the pr

zll 1.6k Jan 08, 2023
FeTaQA: Free-form Table Question Answering

FeTaQA: Free-form Table Question Answering FeTaQA is a Free-form Table Question Answering dataset with 10K Wikipedia-based {table, question, free-form

Language, Information, and Learning at Yale 40 Dec 13, 2022
The description of FMFCC-A (audio track of FMFCC) dataset and Challenge resluts.

FMFCC-A This project is the description of FMFCC-A (audio track of FMFCC) dataset and Challenge resluts. The FMFCC-A dataset is shared through BaiduCl

18 Dec 24, 2022
Implementation for "Conditional entropy minimization principle for learning domain invariant representation features"

Implementation for "Conditional entropy minimization principle for learning domain invariant representation features". The code is reproduced from thi

1 Nov 02, 2022
A torch.Tensor-like DataFrame library supporting multiple execution runtimes and Arrow as a common memory format

TorchArrow (Warning: Unstable Prototype) This is a prototype library currently under heavy development. It does not currently have stable releases, an

Facebook Research 536 Jan 06, 2023
Robust fine-tuning of zero-shot models

Robust fine-tuning of zero-shot models This repository contains code for the paper Robust fine-tuning of zero-shot models by Mitchell Wortsman*, Gabri

224 Dec 29, 2022
A compendium of useful, interesting, inspirational usage of pandas functions, each example will be an ipynb file

Pandas_by_examples A compendium of useful/interesting/inspirational usage of pandas functions, each example will be an ipynb file What is this reposit

Guangyuan(Frank) Li 32 Nov 20, 2022
MassiveSumm: a very large-scale, very multilingual, news summarisation dataset

MassiveSumm: a very large-scale, very multilingual, news summarisation dataset This repository contains links to data and code to fetch and reproduce

Daniel Varab 19 Dec 16, 2022