OREO: Object-Aware Regularization for Addressing Causal Confusion in Imitation Learning (NeurIPS 2021)

Last update: Nov 25, 2022

Related tags

Overview

OREO: Object-Aware Regularization for Addressing Causal Confusion in Imitation Learning (NeurIPS 2021)

Video demo

We here provide a video demo from confounded Enduro environment (see Figure 8 of the main draft). We also visualize the spatial attention map from a convolutional encoder trained with BC (medium) and OREO (right).

Installation

OREO requires CUDA 10.1 to run.

Install the dependencies:

conda install pytorch torchvision torchaudio cudatoolkit=10.1 -c pytorch
pip install dopamine_rl sklearn tqdm kornia dropblock atari-py==0.2.6 gsutil

Download DQN Replay dataset for expert demonstrations on Atari environments:

mkdir DATAPATH
cp download.sh DATAPATH
cd DATAPATH
sh download.sh

Pre-training

We here provide beta-VAE (for CCIL) and VQ-VAE (for CRLR and OREO) pretraining scripts. For other datasets, change the --env option.

beta-VAE

CUDA_VISIBLE_DEVICES=0,1,2,3 python atari_beta_vae.py --env=KungFuMaster --datapath DATAPATH --num_episodes 20 --seed 1 --ch_div 4 --lmd 10

VQ-VAE

CUDA_VISIBLE_DEVICES=0,1,2,3 python atari_vqvae.py --env=KungFuMaster --datapath DATAPATH --num_episodes 20 --seed 1

Training BC policy

We here provide training scripts for baselines and OREO. For other datasets, change the --env, --beta_vae_path, and --vqvae_path options.

Behavioral cloning

CUDA_VISIBLE_DEVICES=0 python atari_cnn_actor.py --env=KungFuMaster --datapath DATAPATH --seed 1 --eval_interval 1000 --num_episodes 20 --num_eval_episodes 100

Dropout

CUDA_VISIBLE_DEVICES=0 python atari_cnn_actor.py --env=KungFuMaster --datapath DATAPATH --seed 1 --eval_interval 1000 --original_dropout --prob 0.5 --num_episodes 20 --num_eval_episodes 100

DropBlock

CUDA_VISIBLE_DEVICES=0 python atari_cnn_actor.py --env=KungFuMaster --datapath DATAPATH --seed 1 --eval_interval 1000 --dropblock --prob 0.3 --num_episodes 20 --num_eval_episodes 100

Cutout

CUDA_VISIBLE_DEVICES=0 python atari_cnn_actor.py --env=KungFuMaster --datapath DATAPATH --seed 1 --eval_interval 1000 --input_cutout --num_episodes 20 --num_eval_episodes 100

RandomShift

CUDA_VISIBLE_DEVICES=0 python atari_cnn_actor.py --env=KungFuMaster --datapath DATAPATH --seed 1 --eval_interval 1000 --random_shift --num_episodes 20 --num_eval_episodes 100

CCIL (w/o interaction)

CUDA_VISIBLE_DEVICES=0 python atari_beta_vae_actor.py --env=KungFuMaster --datapath DATAPATH --num_episodes 20 --num_eval_episodes 100 --seed 1 --eval_interval 1000 --prob 0.5 --ch_div 4 --beta_vae_path models_beta_vae_coord_conv_chdiv4_actor_lmd10.0/KungFuMaster_s1_epi20_con1_seed1_zdim50_beta4_kltol0_ep1000_beta_vae.pth

CRLR

CUDA_VISIBLE_DEVICES=0 python atari_cnn_actor_crlr.py --fixed_size 15000 --num_sub_iters 10 --eval_interval 10 --save_interval 10 --n_epochs 10 --env=KungFuMaster --datapath DATAPATH --num_episodes 20 --num_eval_episodes 100 --seed 1 --vqvae_path models_vqvae/KungFuMaster_s1_epi20_con1_seed1_ne512_c0.25_ep1000_vqvae.pth

OREO

CUDA_VISIBLE_DEVICES=0 python atari_vqvae_oreo.py --env=KungFuMaster --datapath DATAPATH --num_mask 5 --num_episodes 20 --num_eval_episodes 100 --seed 1 --eval_interval 1000 --prob 0.5 --vqvae_path models_vqvae/KungFuMaster_s1_epi20_con1_seed1_ne512_c0.25_ep1000_vqvae.pth

OREO: Object-Aware Regularization for Addressing Causal Confusion in Imitation Learning (NeurIPS 2021)

Related tags

Overview

OREO: Object-Aware Regularization for Addressing Causal Confusion in Imitation Learning (NeurIPS 2021)

Video demo

Installation

Pre-training

beta-VAE

VQ-VAE

Training BC policy

Behavioral cloning

Dropout

DropBlock

Cutout

RandomShift

CCIL (w/o interaction)

CRLR

OREO

Owner

The AWS Certified SysOps Administrator

Hamiltonian Dynamics with Non-Newtonian Momentum for Rapid Sampling

The project is an official implementation of our CVPR2019 paper "Deep High-Resolution Representation Learning for Human Pose Estimation"

[CVPRW 2022] Attentions Help CNNs See Better: Attention-based Hybrid Image Quality Assessment Network

The original implementation of TNDM used in the NeurIPS 2021 paper (no longer being updated)

Large-scale language modeling tutorials with PyTorch

Syllabic Quantity Patterns as Rhythmic Features for Latin Authorship Attribution

Implementation of Convolutional enhanced image Transformer

Author's PyTorch implementation of TD3+BC, a simple variant of TD3 for offline RL

This repo holds the code of TransFuse: Fusing Transformers and CNNs for Medical Image Segmentation

The official repo for OC-SORT: Observation-Centric SORT on video Multi-Object Tracking. OC-SORT is simple, online and robust to occlusion/non-linear motion.

CLIP (Contrastive Language–Image Pre-training) trained on Indonesian data

【ACMMM 2021】DSANet: Dynamic Segment Aggregation Network for Video-Level Representation Learning

Code for CoMatch: Semi-supervised Learning with Contrastive Graph Regularization

Groceries ARL: Association Rules (Birliktelik Kuralı)

A simple Rock-Paper-Scissors game using CV in python

NeurIPS'21 Tractable Density Estimation on Learned Manifolds with Conformal Embedding Flows

GCC: Graph Contrastive Coding for Graph Neural Network Pre-Training @ KDD 2020

Pytorch library for seismic data augmentation

A Flexible Generative Framework for Graph-based Semi-supervised Learning (NeurIPS 2019)