Offline Reinforcement Learning with Implicit Q-Learning

This repository contains the official implementation of "Offline Reinforcement Learning with Implicit Q-Learning" by Ilya Kostrikov, Ashvin Nair, and Sergey Levine.

If you use this code for your research, please consider citing the paper:

@article{kostrikov2021iql,
    title={Offline Reinforcement Learning with Implicit Q-Learning},
    author={Ilya Kostrikov and Ashvin Nair and Sergey Levine},
    year={2021},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

How to run the code

Install dependencies

pip install -r requirements.txt

For GPU support, see the JAX installation instructions for CUDA.

Run training

Locomotion

python train_offline.py --env_name=halfcheetah-medium-expert-v2 --config=configs/mujoco_config.py

AntMaze

python train_offline.py --env_name=antmaze-large-play-v0 --config=configs/antmaze_config.py --eval_episodes=100 --eval_interval=100000

Kitchen and Adroit

python train_offline.py --env_name=pen-human-v0 --config=configs/kitchen_config.py
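The three commands above share one pattern: an environment name, a config file, and optional evaluation flags. As a minimal sketch, the helper below collects them in one table and prints each command line without running anything. The names RUNS and build_command are hypothetical and not part of this repository; only the flags and values already shown above are used.

```python
import shlex

# Flags copied from the commands documented above, keyed by task family.
# RUNS and build_command are illustrative names, not part of the repository.
RUNS = {
    "locomotion": {
        "env_name": "halfcheetah-medium-expert-v2",
        "config": "configs/mujoco_config.py",
    },
    "antmaze": {
        "env_name": "antmaze-large-play-v0",
        "config": "configs/antmaze_config.py",
        "eval_episodes": 100,
        "eval_interval": 100000,
    },
    "kitchen_adroit": {
        "env_name": "pen-human-v0",
        "config": "configs/kitchen_config.py",
    },
}


def build_command(flags):
    """Return the argv list for train_offline.py with the given flags."""
    argv = ["python", "train_offline.py"]
    argv += [f"--{name}={value}" for name, value in flags.items()]
    return argv


def main():
    # Dry run: print each command instead of executing it.
    for family, flags in RUNS.items():
        print(f"# {family}")
        print(shlex.join(build_command(flags)))


if __name__ == "__main__":
    main()
```

To actually launch a run, the argv list returned by build_command could be passed to subprocess.run from the repository root; the sketch stops at printing so that it has no side effects.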

Misc

The implementation is based on JAXRL.
