This is an example implementation of the paper "Cross Domain Robot Imitation with Invariant Representation".

Last update: Jul 14, 2022


Overview

IR-GAIL


Dependency

The experiments depend on gym, mujoco-py, and torch. Make sure all three are installed properly.
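A quick way to confirm that the three packages are visible to your interpreter is sketched below. It is not part of the repository; `missing_packages` is a hypothetical helper, and it only checks that each package can be located, not that MuJoCo itself is configured correctly.

```python
import importlib.util

# Import names of the three dependencies; mujoco-py is imported as mujoco_py.
REQUIRED = ["gym", "mujoco_py", "torch"]


def missing_packages(names=REQUIRED):
    """Return the names that cannot be located on the current Python path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All dependencies found.")
```

An empty result only means the packages were found; a full import can still fail if the MuJoCo binaries are missing.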

Get Started

Train the agent with pretrained representation model

The example folder contains a quick demo on the InvertedDoublePendulum environment.

To train a CDIL agent with a pretrained representation network (in extrapolation mode), run the following command:

python ./example/idp.py --cuda --c1 1.3 --c2 1.5 --c3 1.4 --rollout_length 5000 --eval_interval 5000 --num_steps 5000000 --buffer ./assets/idp_expert_buffer.pth --embedding ./assets/idp_pretrained.pth --seed 0

In the above command, --c1 1.3 --c2 1.5 --c3 1.4 sets the environment parameters to (1.3, 1.5, 1.4). The experts' parameters are around (0.9, 0.9, 0.9), so this setting lies outside the range covered by the demonstrations. You can change these values to run interpolation experiments as well.

--buffer ./assets/idp_expert_buffer.pth specifies the expert demonstration.

--embedding ./assets/idp_pretrained.pth specifies the pretrained representation network.
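For orientation, the flags used in the command can be mirrored with a small argparse sketch. This is an illustration, not the parser in ./example/idp.py: the flag names come from the command above, while the types and the absence of defaults are assumptions.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags shown in the command above."""
    parser = argparse.ArgumentParser(description="Sketch of the idp.py flags")
    parser.add_argument("--cuda", action="store_true")   # run on GPU
    parser.add_argument("--c1", type=float)              # environment parameter 1
    parser.add_argument("--c2", type=float)              # environment parameter 2
    parser.add_argument("--c3", type=float)              # environment parameter 3
    parser.add_argument("--rollout_length", type=int)
    parser.add_argument("--eval_interval", type=int)
    parser.add_argument("--num_steps", type=int)
    parser.add_argument("--buffer", type=str)            # expert demonstration file
    parser.add_argument("--embedding", type=str)         # pretrained representation file
    parser.add_argument("--seed", type=int)
    return parser


if __name__ == "__main__":
    example = (
        "--cuda --c1 1.3 --c2 1.5 --c3 1.4 --rollout_length 5000 "
        "--eval_interval 5000 --num_steps 5000000 "
        "--buffer ./assets/idp_expert_buffer.pth "
        "--embedding ./assets/idp_pretrained.pth --seed 0"
    )
    print(build_parser().parse_args(example.split()))
```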

The return in this example converges after about 600k steps.
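To put the 600k figure in context, the arithmetic below relates it to the values passed in the command. It assumes that --rollout_length and --eval_interval are both counted in environment steps, which the source does not state explicitly.

```python
# Values taken from the command above; the interpretation in steps is an assumption.
num_steps = 5_000_000
rollout_length = 5_000
eval_interval = 5_000
converge_step = 600_000

rollouts_total = num_steps // rollout_length               # rollouts in the full run
evals_total = num_steps // eval_interval                   # evaluations in the full run
rollouts_to_converge = converge_step // rollout_length     # rollouts before convergence
budget_fraction = converge_step / num_steps                # share of the step budget used

print(rollouts_total, evals_total, rollouts_to_converge, budget_fraction)
```

Under these assumptions, convergence is reached after 120 of the 1000 rollouts, about 12% of the step budget.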

Train the representation model

The random and expert experience used to train the representation network is stored in ./assets/idp_expert_random_buffer.pkl.
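If you want to look inside the buffer before training, a generic inspection helper like the one below may help. The internal layout of the .pkl file is not documented here, so the sketch only reports the top-level type and, where available, keys or length; `summarize_pickle` is a hypothetical name. Unpickling may also require the repository's own classes to be importable.

```python
import pickle


def summarize_pickle(path):
    """Load a pickle file and describe its top-level object without assuming its layout."""
    with open(path, "rb") as f:
        obj = pickle.load(f)  # only unpickle files you trust
    summary = {"type": type(obj).__name__}
    if isinstance(obj, dict):
        summary["keys"] = sorted(str(k) for k in obj)
    elif hasattr(obj, "__len__"):
        summary["length"] = len(obj)
    return summary
```

For example, `summarize_pickle("./assets/idp_expert_random_buffer.pkl")` run from the repository root would print what kind of object the file contains.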

If you want to train the representation network yourself, run the following command:

python ./example/idp_train_representation.py

This creates a representation_logs folder, where you can find the latest model as training progresses.
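The naming scheme of the saved models is not described here, so one way to pick up the newest checkpoint is by modification time. The helper below is a sketch under that assumption; `latest_file` is a hypothetical name and it treats every file in the folder as a candidate.

```python
from pathlib import Path


def latest_file(folder="representation_logs"):
    """Return the most recently modified file under `folder`, or None if there is none."""
    files = [p for p in Path(folder).rglob("*") if p.is_file()]
    if not files:
        return None
    return max(files, key=lambda p: p.stat().st_mtime)
```

The returned path could then be passed to the training script through --embedding.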

You can then use the trained model for imitation learning.

Ablation

Open ./example/idp_train_representation.py and disable the dynamics loss by setting c_f to 0.0. Then train the representation model and the agent again.

This time, you will find that the agent fails in the extrapolation experiment!
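The effect of setting c_f to 0.0 can be illustrated with a toy weighted sum. Only the name c_f and its role as the dynamics-loss coefficient come from the text above; the way the script actually combines its loss terms is not shown here, so `total_loss` and `other_loss` are hypothetical.

```python
def total_loss(dynamics_loss, other_loss, c_f=1.0):
    """Toy objective: the remaining loss terms plus the dynamics loss weighted by c_f."""
    return other_loss + c_f * dynamics_loss


# With c_f = 0.0 the dynamics term no longer contributes to the objective.
print(total_loss(dynamics_loss=2.0, other_loss=3.0, c_f=1.0))
print(total_loss(dynamics_loss=2.0, other_loss=3.0, c_f=0.0))
```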

Acknowledgement

We gratefully thank ku2482 for a neat imitation learning framework. 🙂


Owner

Zhao-Heng Yin
