Author's PyTorch implementation of TD3 for OpenAI gym tasks

Related tags

Deep LearningTD3
Overview

Addressing Function Approximation Error in Actor-Critic Methods

PyTorch implementation of Twin Delayed Deep Deterministic Policy Gradients (TD3). If you use our code or data please cite the paper.

Method is tested on MuJoCo continuous control tasks in OpenAI gym. Networks are trained using PyTorch 1.2 and Python 3.7.

Usage

The paper results can be reproduced by running:

./run_experiments.sh

Experiments on single environments can be run by calling:

python main.py --env HalfCheetah-v2

Hyper-parameters can be modified with different arguments to main.py. We include an implementation of DDPG (DDPG.py), which is not used in the paper, for easy comparison of hyper-parameters with TD3. This is not the implementation of "Our DDPG" as used in the paper (see OurDDPG.py).

Algorithms which TD3 compares against (PPO, TRPO, ACKTR, DDPG) can be found at OpenAI baselines repository.

Results

Code is no longer exactly representative of the code used in the paper. Minor adjustments to hyperparamters, etc, to improve performance. Learning curves are still the original results found in the paper.

Learning curves found in the paper are found under /learning_curves. Each learning curve are formatted as NumPy arrays of 201 evaluations (201,), where each evaluation corresponds to the average total reward from running the policy for 10 episodes with no exploration. The first evaluation is the randomly initialized policy network (unused in the paper). Evaluations are peformed every 5000 time steps, over a total of 1 million time steps.

Numerical results can be found in the paper, or from the learning curves. Video of the learned agent can be found here.

Bibtex

@inproceedings{fujimoto2018addressing,
  title={Addressing Function Approximation Error in Actor-Critic Methods},
  author={Fujimoto, Scott and Hoof, Herke and Meger, David},
  booktitle={International Conference on Machine Learning},
  pages={1582--1591},
  year={2018}
}
Owner
Scott Fujimoto
Scott Fujimoto
PyTorch implementations of algorithms for density estimation

pytorch-flows A PyTorch implementations of Masked Autoregressive Flow and some other invertible transformations from Glow: Generative Flow with Invert

Ilya Kostrikov 546 Dec 05, 2022
An open-access benchmark and toolbox for electricity price forecasting

epftoolbox The epftoolbox is the first open-access library for driving research in electricity price forecasting. Its main goal is to make available a

97 Dec 05, 2022
Code for reproducing our paper: LMSOC: An Approach for Socially Sensitive Pretraining

LMSOC: An Approach for Socially Sensitive Pretraining Code for reproducing the paper LMSOC: An Approach for Socially Sensitive Pretraining to appear a

Twitter Research 11 Dec 20, 2022
Official implementation of the paper WAV2CLIP: LEARNING ROBUST AUDIO REPRESENTATIONS FROM CLIP

Wav2CLIP 🚧 WIP 🚧 Official implementation of the paper WAV2CLIP: LEARNING ROBUST AUDIO REPRESENTATIONS FROM CLIP 📄 🔗 Ho-Hsiang Wu, Prem Seetharaman

Descript 240 Dec 13, 2022
CMT: Convolutional Neural Networks Meet Vision Transformers

CMT: Convolutional Neural Networks Meet Vision Transformers [arxiv] 1. Introduction This repo is the CMT model which impelement with pytorch, no refer

FlyEgle 83 Dec 30, 2022
Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network

Super Resolution Examples We run this script under TensorFlow 2.0 and the TensorLayer2.0+. For TensorLayer 1.4 version, please check release. 🚀 🚀 🚀

TensorLayer Community 2.9k Jan 08, 2023
Differentiable simulation for system identification and visuomotor control

gradsim gradSim: Differentiable simulation for system identification and visuomotor control gradSim is a unified differentiable rendering and multiphy

105 Dec 18, 2022
TransZero++: Cross Attribute-guided Transformer for Zero-Shot Learning

TransZero++ This repository contains the testing code for the paper "TransZero++: Cross Attribute-guided Transformer for Zero-Shot Learning" submitted

Shiming Chen 6 Aug 16, 2022
An End-to-End Machine Learning Library to Optimize AUC (AUROC, AUPRC).

Logo by Zhuoning Yuan LibAUC: A Machine Learning Library for AUC Optimization Website | Updates | Installation | Tutorial | Research | Github LibAUC a

Optimization for AI 176 Jan 07, 2023
Full-featured Decision Trees and Random Forests learner.

CID3 This is a full-featured Decision Trees and Random Forests learner. It can save trees or forests to disk for later use. It is possible to query tr

Alejandro Penate-Diaz 3 Aug 15, 2022
Medical-Image-Triage-and-Classification-System-Based-on-COVID-19-CT-and-X-ray-Scan-Dataset

Medical-Image-Triage-and-Classification-System-Based-on-COVID-19-CT-and-X-ray-Sc

2 Dec 26, 2021
LBBA-boosted WSOD

LBBA-boosted WSOD Summary Our code is based on ruotianluo/pytorch-faster-rcnn and WSCDN Sincerely thanks for your resources. Newer version of our code

Martin Dong 20 Sep 19, 2022
Gradient Step Denoiser for convergent Plug-and-Play

Source code for the paper "Gradient Step Denoiser for convergent Plug-and-Play"

Samuel Hurault 11 Sep 17, 2022
Official code for MPG2: Multi-attribute Pizza Generator: Cross-domain Attribute Control with Conditional StyleGAN

This is the official code for Multi-attribute Pizza Generator (MPG2): Cross-domain Attribute Control with Conditional StyleGAN. Paper Demo Setup Envir

Fangda Han 5 Sep 01, 2022
Code for Understanding Pooling in Graph Neural Networks

Select, Reduce, Connect This repository contains the code used for the experiments of: "Understanding Pooling in Graph Neural Networks" Setup Install

Daniele Grattarola 37 Dec 13, 2022
End-to-end image segmentation kit based on PaddlePaddle.

English | 简体中文 PaddleSeg PaddleSeg has released the new version including the following features: Our team won the 6.2k Jan 02, 2023

Code accompanying the paper Say As You Wish: Fine-grained Control of Image Caption Generation with Abstract Scene Graphs (Chen et al., CVPR 2020, Oral).

Say As You Wish: Fine-grained Control of Image Caption Generation with Abstract Scene Graphs This repository contains PyTorch implementation of our pa

Shizhe Chen 178 Dec 29, 2022
Implementation for "Conditional entropy minimization principle for learning domain invariant representation features"

Implementation for "Conditional entropy minimization principle for learning domain invariant representation features". The code is reproduced from thi

1 Nov 02, 2022
SciKit-Learn Laboratory (SKLL) makes it easy to run machine learning experiments.

SciKit-Learn Laboratory This Python package provides command-line utilities to make it easier to run machine learning experiments with scikit-learn. O

ETS 528 Nov 25, 2022
transfer attack; adversarial examples; black-box attack; unrestricted Adversarial Attacks on ImageNet; CVPR2021 天池黑盒竞赛

transfer_adv CVPR-2021 AIC-VI: unrestricted Adversarial Attacks on ImageNet CVPR2021 安全AI挑战者计划第六期赛道2:ImageNet无限制对抗攻击 介绍 : 深度神经网络已经在各种视觉识别问题上取得了最先进的性能。

25 Dec 08, 2022