Exploration-Exploitation Dilemma Solving Methods

Medium article for this repo - HERE

In this repo I implemented two techniques for tackling the exploration-exploitation tradeoff:

Epsilon Greedy (with different epsilons)
Thompson Sampling (also known as posterior sampling)

I chose these two to show the lower and upper ends of the range: epsilon-greedy is the usual starting point for dealing with this tradeoff, while Thompson Sampling is considered state of the art in this field.
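To make the idea of posterior sampling concrete, here is a minimal sketch of Thompson Sampling for Gaussian rewards. It is not necessarily the variant used in this repo: the class name `GaussianThompsonSampler`, the N(0, 1) prior on each arm's mean, and the unit-variance reward noise are all assumptions made for illustration.

```python
import numpy as np


class GaussianThompsonSampler:
    """Thompson Sampling sketch (hypothetical helper, not the repo's code).

    Assumes a N(0, 1) prior on each arm's mean and unit-variance
    Gaussian rewards, so the posterior stays Gaussian.
    """

    def __init__(self, k=10, seed=None):
        self.rng = np.random.default_rng(seed)
        self.counts = np.zeros(k)       # number of pulls per arm
        self.reward_sums = np.zeros(k)  # total reward per arm

    def select_action(self):
        # Posterior of each arm's mean: N(sum / (n + 1), 1 / (n + 1)).
        precision = self.counts + 1.0
        samples = self.rng.normal(self.reward_sums / precision,
                                  1.0 / np.sqrt(precision))
        # Act greedily with respect to the sampled means.
        return int(np.argmax(samples))

    def update(self, action, reward):
        self.counts[action] += 1
        self.reward_sums[action] += reward
```

Arms that have been pulled rarely keep a wide posterior, so they are still sampled as the best arm now and then. As evidence accumulates the posteriors narrow and the agent settles on the arm with the highest estimated mean.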

ENV SPECIFICATIONS - A 10-armed testbed is simulated, the same as the one demonstrated in the Sutton and Barto book.
True reward distribution (here Action-2 is best)
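The testbed described above can be sketched as follows. This is a rough illustration and not the repo's own code: the class name `TenArmedTestbed` and its methods are hypothetical, and the N(0, 1) true values with unit-variance reward noise follow the usual Sutton and Barto setup.

```python
import numpy as np


class TenArmedTestbed:
    """Stationary k-armed bandit in the style of Sutton and Barto (sketch)."""

    def __init__(self, k=10, seed=None):
        self.rng = np.random.default_rng(seed)
        # True action values, drawn once from N(0, 1) (assumed setup).
        self.q_true = self.rng.normal(0.0, 1.0, size=k)
        self.best_action = int(np.argmax(self.q_true))

    def pull(self, action):
        # Reward is the arm's true value plus unit-variance Gaussian noise.
        return self.q_true[action] + self.rng.normal(0.0, 1.0)
```

An agent only ever sees the noisy rewards returned by `pull`; `q_true` and `best_action` are kept for evaluation, for example to measure how often the optimal arm was chosen.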

Comparison of Greedy (or Epsilon-Greedy) and TS

We used three different epsilons for testing:

epsilon = 0 => Greedy Agent
epsilon = 0.01 => exploration with 1% probability
epsilon = 0.1 => exploration with 10% probability

and compared them with TS.
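The role of epsilon listed above can be sketched as a small action-selection routine. The function names `epsilon_greedy_action` and `update_estimate` are hypothetical, and the sample-average update is an assumption about how the value estimates are maintained.

```python
import numpy as np


def epsilon_greedy_action(q_est, epsilon, rng):
    """Pick a random arm with probability epsilon, otherwise a greedy arm."""
    if rng.random() < epsilon:
        return int(rng.integers(len(q_est)))
    # Break ties at random so the first arm is not favoured at the start.
    best = np.flatnonzero(q_est == q_est.max())
    return int(rng.choice(best))


def update_estimate(q_est, counts, action, reward):
    """Incremental sample-average update of the chosen arm's estimate."""
    counts[action] += 1
    q_est[action] += (reward - q_est[action]) / counts[action]
```

With `epsilon = 0` the random branch is never taken, which gives the purely greedy agent. With `epsilon = 0.01` or `0.1` the agent picks a uniformly random arm on roughly 1% or 10% of the steps.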

Results are averaged over 2500 independent runs of 1500 timesteps each.
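One way such an average could be computed for an epsilon-greedy agent is sketched below. The function `optimal_action_rate` is hypothetical, and drawing a fresh set of true values for every run is an assumption borrowed from the Sutton and Barto setup; the repo may instead reuse one fixed reward distribution.

```python
import numpy as np


def optimal_action_rate(epsilon, n_runs=2500, n_steps=1500, k=10, seed=0):
    """Fraction of runs that pick the best arm at each timestep."""
    rng = np.random.default_rng(seed)
    optimal = np.zeros(n_steps)
    for _ in range(n_runs):
        # Assumption: a new testbed is drawn for every independent run.
        q_true = rng.normal(0.0, 1.0, size=k)
        best = np.argmax(q_true)
        q_est = np.zeros(k)
        counts = np.zeros(k)
        for t in range(n_steps):
            if rng.random() < epsilon:
                a = rng.integers(k)  # explore
            else:
                # Exploit, breaking ties at random.
                a = rng.choice(np.flatnonzero(q_est == q_est.max()))
            r = q_true[a] + rng.normal()
            counts[a] += 1
            q_est[a] += (r - q_est[a]) / counts[a]
            optimal[t] += (a == best)
    return optimal / n_runs
```

The default sizes match the numbers quoted above but take a while in pure Python; a call such as `optimal_action_rate(0.1, n_runs=50, n_steps=200)` is enough for a quick check of the curve's shape.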

Comparison

Percentage of actions selected for epsilon = 0.01 and TS

Conclusion -> epsilon = 0.01 can be considered the best of the epsilon-greedy agents: its performance keeps increasing, though quite slowly, and its percentage of optimal actions is around 80% in the later stages. Thompson Sampling, on the other hand, shows a significant improvement over these results: it explores quickly and then exploits the optimal action, with the percentage of optimal actions reaching almost 100% very early.

In case you want to know more about TS visit this Reference.


Owner

Aman Mishra
