Adaptively Calibrated Critic Estimates for Deep Reinforcement Learning
Official implementation of ACC, described in the paper "Adaptively Calibrated Critic Estimates for Deep Reinforcement Learning". The source code is based on the PyTorch implementation of TQC, which in turn is based on the implementation of TD3. We thank the authors for making their source code publicly available.
Requirements
Install MuJoCo
- Download and install MuJoCo 1.50 from the MuJoCo website. We assume that the MuJoCo files are extracted to the default location (~/.mujoco/mjpro150).
- Copy your MuJoCo license key (mjkey.txt) to ~/.mujoco/mjkey.txt.
Install
We recommend using an Anaconda environment. In our experiments we used Python 3.7 and the following dependencies:
pip install gym==0.17.2 mujoco-py==1.50.1.68 numpy==1.19.1 torch==1.6.0 torchvision==0.7.0
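As a quick sanity check (not part of the official setup) that the dependencies and the MuJoCo bindings import correctly, you can try creating one of the environments:

python -c "import torch, mujoco_py, gym; gym.make('HalfCheetah-v3'); print('setup ok')"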
Running ACC
You can run ACC for TQC on one of the gym continuous control environments by calling
python main.py --env "HalfCheetah-v3" --max_timesteps 5000000 --seed 0
To run the data-efficient variant with 4 critic update steps per environment step, call
python main.py --env "HalfCheetah-v3" --max_timesteps 1000000 --num_critic_updates 4 --seed 0
Example scripts that run the experiments for 10 seeds on all environments are provided in run_experiment.sh and run_experiment_data_efficient.sh; a sketch of such a loop is shown below.
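run_experiment.sh is included in the repository; as a rough illustration, such a script can be structured as a loop over seeds and environments (the environment list below is only an example and is not necessarily the one used in the paper):

for seed in $(seq 0 9); do
    for env in "HalfCheetah-v3" "Hopper-v3" "Walker2d-v3" "Ant-v3"; do
        # Full-length runs; for the data-efficient variant add
        # --num_critic_updates 4 and reduce --max_timesteps accordingly.
        python main.py --env "$env" --max_timesteps 5000000 --seed "$seed"
    done
done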
You can speed up the experiments by using fewer networks in the TQC ensemble. This trades a small amount of performance for a faster runtime (see the Appendix of the paper). The number of networks can be controlled with the flag --n_nets. For example:
python main.py --env "HalfCheetah-v3" --max_timesteps 5000000 --n_nets 2 --seed 0