Doubly Robust Off-Policy Evaluation for Ranking Policies under the Cascade Behavior Model

Last update: Dec 07, 2022

About

This repository contains the code to replicate the synthetic experiment in the paper "Doubly Robust Off-Policy Evaluation for Ranking Policies under the Cascade Behavior Model" by Haruka Kiyohara, Yuta Saito, Tatsuya Matsuhiro, Yusuke Narita, Nobuyuki Shimizu, and Yasuo Yamamoto, which has been accepted to WSDM 2022.

If you find this code useful in your research, please cite:

@inproceedings{kiyohara2022doubly,
  author = {Kiyohara, Haruka and Saito, Yuta and Matsuhiro, Tatsuya and Narita, Yusuke and Shimizu, Nobuyuki and Yamamoto, Yasuo},
  title = {Doubly Robust Off-Policy Evaluation for Ranking Policies under the Cascade Behavior Model},
  booktitle = {Proceedings of the 15th International Conference on Web Search and Data Mining},
  pages = {xxx--xxx},
  year = {2022},
}

Dependencies

This repository supports Python 3.7 or newer.

numpy==1.20.0
pandas==1.2.1
scikit-learn==0.24.1
matplotlib==3.4.3
obp==0.5.2
hydra-core==1.0.6

Note that the proposed Cascade-DR estimator is implemented in Open Bandit Pipeline (obp.ope.SlateCascadeDoublyRobust).
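For readers who want to see the shape of the computation without installing obp, the following is a minimal NumPy sketch of a cascade-style doubly robust estimate. It is not the obp implementation: the function name, argument names, and array layouts are hypothetical, position weights are assumed to be 1, and the reward model outputs (q_hat_taken, q_hat_all) are assumed to be supplied by the caller.

```python
import numpy as np


def cascade_dr_estimate(reward, pscore_cascade, eval_pscore_cascade,
                        q_hat_taken, q_hat_all, eval_action_dist):
    """Illustrative cascade-style doubly robust estimate (hypothetical API).

    Shapes: n slates, L positions per slate, A candidate actions per position.
    reward:              (n, L)    observed reward at each position
    pscore_cascade:      (n, L)    behavior-policy probability of the first l actions
    eval_pscore_cascade: (n, L)    evaluation-policy probability of the first l actions
    q_hat_taken:         (n, L)    reward-model value for the logged action at position l
    q_hat_all:           (n, L, A) reward-model value for every candidate at position l
    eval_action_dist:    (n, L, A) evaluation-policy distribution over candidates
    """
    reward = np.asarray(reward, dtype=float)
    # Cumulative importance weight over positions 1..l.
    w = np.asarray(eval_pscore_cascade, dtype=float) / np.asarray(pscore_cascade, dtype=float)
    # Weight over positions 1..l-1 (equal to 1 for the first position).
    w_prev = np.concatenate([np.ones((w.shape[0], 1)), w[:, :-1]], axis=1)
    # Expected reward-model value under the evaluation policy at each position.
    baseline = (np.asarray(eval_action_dist) * np.asarray(q_hat_all)).sum(axis=2)
    # Model-based baseline plus an importance-weighted correction of its residual.
    per_position = w_prev * baseline + w * (reward - np.asarray(q_hat_taken))
    return float(per_position.sum(axis=1).mean())
```

In this sketch, passing all-zero reward-model arrays leaves only the importance-weighted rewards, which is a convenient sanity check. For actual experiments, use obp.ope.SlateCascadeDoublyRobust as noted above.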

Running the code

To conduct the synthetic experiment, run the following commands.

(i) Run OPE simulations with varying data size and a fixed slate size.

python src/main.py setting=n_rounds

(ii), (iii) Run OPE simulations with varying slate size and varying evaluation policy similarity, with a fixed data size.

python src/main.py

Once the code has finished executing, you can find the results (squared_error.csv, relative_ee.csv, configuration.csv) in the ./logs/ directory. Lower values are better for both the squared error and the relative estimation error (relative-ee).
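This README does not define the two metrics, so the sketch below uses their conventional definitions. Treat it as an assumption about what the CSV files contain rather than a description of src/main.py. The function names are illustrative.

```python
def squared_error(estimate, ground_truth):
    """Squared difference between an estimated and a true policy value."""
    return (estimate - ground_truth) ** 2


def relative_ee(estimate, ground_truth):
    """Absolute estimation error relative to the true policy value.

    Assumes ground_truth is nonzero.
    """
    return abs(estimate - ground_truth) / abs(ground_truth)
```

For example, with a true policy value of 2.0 and an estimate of 3.0, squared_error returns 1.0 and relative_ee returns 0.5. Both are zero for a perfect estimate, which is why lower is better.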

Visualize the results

To visualize the results, run the following command. Make sure that you have executed both experiments above (by running python src/main.py setting=n_rounds and python src/main.py) before visualizing the results.

python src/visualize.py

You will then find the following figures in the ./logs/ directory: slate size (standard/cascade/independent).png, evaluation policy similarity (standard/cascade/independent).png, and data size (standard/cascade/independent).png. Lower values are better for the relative-MSE (y-axis).

Figure grid (images not reproduced here). Columns: reward structure (Standard, Cascade, Independent). Rows: varying data size (n), varying slate size (L), varying evaluation policy similarity (λ).
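Since the visualization step depends on both experiments having been run, it can be convenient to chain the three commands from this README in one script. The following is a small sketch, not part of the repository: the PIPELINE list and the run_pipeline function are hypothetical names, and the script assumes it is launched from the repository root.

```python
import subprocess
import sys

# Script paths and arguments copied from the commands in this README.
PIPELINE = [
    ["src/main.py", "setting=n_rounds"],  # (i) varying data size
    ["src/main.py"],                      # (ii), (iii) varying slate size and policy similarity
    ["src/visualize.py"],                 # figures are written to ./logs/
]


def run_pipeline(dry_run=False):
    """Run each step in order and stop at the first failure.

    With dry_run=True the commands are only printed. Returns the command lists.
    """
    # Use the current interpreter so the steps share one environment.
    commands = [[sys.executable, *args] for args in PIPELINE]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True raises CalledProcessError if a step exits with an error.
            subprocess.run(cmd, check=True)
    return commands
```

Calling run_pipeline(dry_run=True) prints the three commands without executing them. Calling run_pipeline() runs them in sequence.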

Owner

Haruka Kiyohara
