Inferring Lexicographically-Ordered Rewards from Preferences

Code author: Alihan Hüyük ([email protected])

This repository contains the source code necessary to replicate the main experimental results in the AAAI 2022 paper "Inferring Lexicographically-Ordered Rewards from Preferences." Our proposed method, LORI, is implemented in src/main-lori.py and src/main-lori-liver.py, which cover the two problem settings considered in the paper: cancer treatment and organ transplantation, respectively.
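For readers new to the term, the sketch below illustrates what a lexicographic ordering over reward vectors means: the first reward component decides the preference, and later components only break ties. It is a minimal illustration and not the LORI implementation in src/main-lori.py. The function name `lex_prefers` and the tolerance parameter `tol` are hypothetical and introduced here only for the example.

```python
from typing import Sequence


def lex_prefers(r_a: Sequence[float], r_b: Sequence[float], tol: float = 0.0) -> int:
    """Compare two reward vectors lexicographically.

    Returns 1 if A is preferred, -1 if B is preferred, 0 if indifferent.
    `tol` is an illustrative assumption: differences of at most `tol` in a
    component are treated as ties, so the next component decides.
    """
    if len(r_a) != len(r_b):
        raise ValueError("reward vectors must have the same length")
    for a, b in zip(r_a, r_b):
        if a - b > tol:   # A is clearly better on this component
            return 1
        if b - a > tol:   # B is clearly better on this component
            return -1
        # otherwise: tie on this component, fall through to the next one
    return 0


# The primary reward dominates, however large the secondary reward is.
print(lex_prefers([1.0, 0.0], [0.5, 9.0]))
# A tie on the primary reward is broken by the secondary reward.
print(lex_prefers([1.0, 0.0], [1.0, 2.0]))
# With a tolerance, a small primary difference counts as a tie.
print(lex_prefers([1.0, 0.0], [0.95, 2.0], tol=0.1))
```

With `tol=0.0` this is a strict lexicographic comparison. A positive tolerance treats near-equal components as ties and passes the decision to the next component.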

Usage

First, install the required Python packages by running:

    python -m pip install -r requirements.txt

Then, the experiments in the paper can be replicated by running:

    ./src/run.sh        # generates the results in Tables 2 and 3
    ./src/run-liver.sh  # generates the reward functions in (10) and (11)

Note that running the experiments for the transplantation setting requires access to the Organ Procurement and Transplantation Network (OPTN) dataset for liver transplantations as of December 4, 2020.

Citing

If you use this software, please cite it as follows:

@inproceedings{huyuk2022inferring,
  author={Alihan H\"uy\"uk and William R. Zame and Mihaela van der Schaar},
  title={Inferring lexicographically-ordered rewards from preferences},
  booktitle={Proceedings of the 36th AAAI Conference on Artificial Intelligence},
  year={2022}
}
