Code for NeurIPS 2021 paper: Invariant Causal Imitation Learning for Generalizable Policies

Last update: Dec 01, 2022

Overview

Invariant Causal Imitation Learning for Generalizable Policies

Ioana Bica, Daniel Jarrett, Mihaela van der Schaar

Neural Information Processing Systems (NeurIPS) 2021

Dependencies

The code was implemented in Python 3.6 and requires the following packages:

gym==0.17.2
numpy==1.18.2
pandas==1.0.4
tensorflow==1.15.0
torch==1.6.0
tqdm==4.32.1
scipy==1.1.0
scikit-learn==0.22.2
stable-baselines==2.10.1
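
As a quick sanity check before running anything, the pins above can be compared against whatever is installed. The following is a minimal sketch and not part of the repository: the helper names `parse_pins` and `find_mismatches` are hypothetical, and the installed versions are passed in as a plain dictionary so the check does not depend on any particular packaging tool.

```python
# Hypothetical helpers, not part of the ICIL repository.
PINNED = """
gym==0.17.2
numpy==1.18.2
pandas==1.0.4
tensorflow==1.15.0
torch==1.6.0
tqdm==4.32.1
scipy==1.1.0
scikit-learn==0.22.2
stable-baselines==2.10.1
"""


def parse_pins(text):
    """Turn 'name==version' lines into a {name: version} dict."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, version = line.partition("==")
        pins[name.strip()] = version.strip()
    return pins


def find_mismatches(pins, installed):
    """Return {name: (wanted, found)} for packages whose version differs.

    `installed` maps package names to version strings; a missing package
    is reported with found=None.
    """
    mismatches = {}
    for name, wanted in pins.items():
        found = installed.get(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    # Made-up installed versions, for illustration only.
    installed = {"gym": "0.17.2", "numpy": "1.19.0"}
    report = find_mismatches(parse_pins(PINNED), installed)
    for name, (wanted, found) in sorted(report.items()):
        print(f"{name}: want {wanted}, found {found}")
```

In practice the `installed` dictionary could be filled from the output of `pip freeze`, which uses the same `name==version` format and can therefore be fed through `parse_pins` as well.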

Running and evaluating the model

The control tasks used for the experiments are from OpenAI Gym [1]. Each control task is associated with a true reward function, which is unknown to the imitation algorithm. In each case, the “expert” demonstrator can be obtained from a pre-trained, hyperparameter-optimized agent in the RL Baselines Zoo [2], which is built on Stable Baselines [3].

This implementation provides expert demonstrations from 2 environments of CartPole-v1 in 'volume/CartPole-v1'. Note that the code in 'contrib/baselines_zoo' was taken from [2].

To train and evaluate ICIL on CartPole-v1, run the following command with the chosen command line arguments. For reference, the expert performance is 500.

python testing/il.py

Options:
   --env                  # Environment name.
   --num_trajectories     # Number of expert trajectories used for training the imitation learning algorithm.
   --trial                # Trial number.
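
For readers who want to see how such options are typically declared, here is a minimal `argparse` sketch. The flag names come from the list above; the types and default values are assumptions made for illustration and are not read from 'testing/il.py'. The function name `build_parser` is hypothetical.

```python
import argparse


def build_parser():
    # Sketch only: flag names follow the option list above; the types and
    # defaults are assumptions, not taken from testing/il.py.
    parser = argparse.ArgumentParser(description="Train and evaluate ICIL (sketch).")
    parser.add_argument("--env", type=str, default="CartPole-v1",
                        help="Environment name.")
    parser.add_argument("--num_trajectories", type=int, default=20,
                        help="Number of expert trajectories used for training.")
    parser.add_argument("--trial", type=int, default=0,
                        help="Trial number.")
    return parser


if __name__ == "__main__":
    # Parse an explicit argument list so the sketch runs without a shell.
    args = build_parser().parse_args(
        ["--env=CartPole-v1", "--num_trajectories=20", "--trial=0"])
    print(args.env, args.num_trajectories, args.trial)
```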

Outputs:

Average reward for 10 repetitions of running ICIL.
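
The reported number is an average over repetitions. The sketch below shows one way such a summary could be computed; it is not the repository's evaluation code. The function name `summarize_returns` is hypothetical, the reward values are placeholders rather than real results, and the standard deviation is an extra that the script itself is not documented to print.

```python
from statistics import mean, stdev


def summarize_returns(returns):
    """Return (mean, sample standard deviation) of per-repetition rewards."""
    if not returns:
        raise ValueError("need at least one repetition")
    # A single repetition has no spread; stdev() needs two or more values.
    spread = stdev(returns) if len(returns) > 1 else 0.0
    return mean(returns), spread


if __name__ == "__main__":
    # Ten placeholder rewards, one per repetition (not real results).
    placeholder_returns = [500, 480, 500, 495, 500, 470, 500, 500, 490, 465]
    avg, spread = summarize_returns(placeholder_returns)
    n = len(placeholder_returns)
    print(f"average reward over {n} repetitions: {avg:.1f} +/- {spread:.1f}")
```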

Example usage

python testing/il.py  --env='CartPole-v1' --num_trajectories=20 --trial=0
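
To repeat the example above for several trial numbers, the command can be assembled programmatically. This is a small sketch under stated assumptions: the helper `build_command` is hypothetical, the choice of three trials is arbitrary, and the commands are only printed, not executed.

```python
import sys


def build_command(env, num_trajectories, trial, script="testing/il.py"):
    """Assemble the argument list for one run, mirroring the example above."""
    return [
        sys.executable, script,
        f"--env={env}",
        f"--num_trajectories={num_trajectories}",
        f"--trial={trial}",
    ]


if __name__ == "__main__":
    # Print the commands for three trials; the trial count is arbitrary.
    for trial in range(3):
        print(" ".join(build_command("CartPole-v1", 20, trial)))
        # To launch a run from the repository root, one could instead call
        # subprocess.run(build_command("CartPole-v1", 20, trial), check=True)
```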

References

[1] Greg Brockman, Vicki Cheung, Ludwig Pettersson, Jonas Schneider, John Schulman, Jie Tang, and Wojciech Zaremba. OpenAI Gym. OpenAI, 2016.

[2] Antonin Raffin. RL Baselines Zoo. https://github.com/araffin/rl-baselines-zoo, 2018.

[3] Ashley Hill, Antonin Raffin, Maximilian Ernestus, Adam Gleave, Anssi Kanervisto, Rene Traore, Prafulla Dhariwal, Christopher Hesse, Oleg Klimov, Alex Nichol, Matthias Plappert, Alec Radford, John Schulman, Szymon Sidor, and Yuhuai Wu. Stable Baselines. https://github.com/hill-a/stable-baselines, 2018.

Citation

If you use this code, please cite:

@inproceedings{bica2021invariant,
  title={Invariant Causal Imitation Learning for Generalizable Policies},
  author={Bica, Ioana and Jarrett, Daniel and van der Schaar, Mihaela},
  booktitle={Thirty-Fifth Conference on Neural Information Processing Systems},
  year={2021}
}
