(SIGIR2020) “Asymmetric Tri-training for Debiasing Missing-Not-At-Random Explicit Feedback”

Last update: Dec 01, 2022

About

This repository accompanies the real-world experiments conducted in the paper "Asymmetric Tri-training for Debiasing Missing-Not-At-Random Explicit Feedback" by Yuta Saito, which has been accepted at SIGIR2020 as a full paper.

If you find this code useful in your research then please cite:

@inproceedings{saito2020asymmetric,
  title={Asymmetric tri-training for debiasing missing-not-at-random explicit feedback},
  author={Saito, Yuta},
  booktitle={Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval},
  year={2020}
}

Dependencies

numpy==1.17.2
pandas==0.25.1
scikit-learn==0.22.1
tensorflow==1.15.2
optuna==0.17.0
pyyaml==5.1.2

Running the code

To run the experiments with the real-world datasets, first prepare the data:

Download the Coat dataset from https://www.cs.cornell.edu/~schnabts/mnar/ and put the train.ascii and test.ascii files into the ./data/coat/ directory.
Download the Yahoo! R3 dataset from https://webscope.sandbox.yahoo.com/catalog.php?datatype=r and put the train.txt and test.txt files into the ./data/yahoo/ directory.
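Before launching the experiments, it can help to confirm that the four files are where the instructions above say they should be. The following is a minimal sketch, not part of the repository: the helper name `find_missing_files` and the `EXPECTED_FILES` mapping are hypothetical, and only the directory and file names are taken from the text above.

```python
from pathlib import Path

# Expected layout, copied from the download instructions.
# The mapping itself is a hypothetical helper, not part of the repository.
EXPECTED_FILES = {
    "coat": ["train.ascii", "test.ascii"],
    "yahoo": ["train.txt", "test.txt"],
}


def find_missing_files(data_root="./data"):
    """Return the expected dataset files that do not exist under data_root."""
    root = Path(data_root)
    return [
        root / dataset / name
        for dataset, names in EXPECTED_FILES.items()
        for name in names
        if not (root / dataset / name).is_file()
    ]


if __name__ == "__main__":
    missing = find_missing_files()
    if missing:
        print("Missing dataset files:")
        for path in missing:
            print(f"  {path}")
    else:
        print("All dataset files are in place.")
```

Run it from the repository root, so that `./data` resolves to the directory used above.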

Then, run the following commands in the ./src/ directory:

For the MF-IPS models without asymmetric tri-training:

for data in yahoo coat
do
  for model in uniform user item both nb nb_true
  do
    python main.py -d $data -m $model
  done
done

For the MF-IPS models with asymmetric tri-training (our proposal):

for data in coat yahoo
do
  for model in uniform-at user-at item-at both-at nb-at nb_true-at
  do
    python main.py -d $data -m $model
  done
done

Here, (uniform, user, item, both, nb, nb_true) correspond to (uniform propensity, user propensity, item propensity, user-item propensity, NB (uniform), NB (true)), respectively. The -at suffix selects the same model trained with asymmetric tri-training.
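To make the propensity names above more concrete, here is an illustrative sketch of how such estimators are commonly computed from a binary observation matrix. It is not the repository's implementation: the function names are hypothetical, NB is assumed to mean a naive Bayes estimator, and reading "NB (uniform)" and "NB (true)" as two different choices of rating prior is also an assumption, so the prior is simply passed in by the caller. The user-item ("both") variant is omitted because the text does not say how the two propensities are combined.

```python
# Illustrative propensity estimators (assumed forms, not the repository's code).
# `observed` is a list of rows (users) with 1 where a rating was observed, else 0.


def uniform_propensity(observed):
    """One propensity shared by all pairs: the overall observation rate."""
    n_cells = sum(len(row) for row in observed)
    return sum(sum(row) for row in observed) / n_cells


def user_propensity(observed):
    """One propensity per user: the fraction of items that user rated."""
    return [sum(row) / len(row) for row in observed]


def item_propensity(observed):
    """One propensity per item: the fraction of users who rated it."""
    n_users = len(observed)
    return [sum(col) / n_users for col in zip(*observed)]


def naive_bayes_propensity(ratings, observed, rating_prior):
    """Naive Bayes: P(O=1 | R=r) = P(R=r | O=1) * P(O=1) / P(R=r).

    `rating_prior` maps each rating value r to an assumed P(R=r).
    Returns one propensity per rating value in the prior.
    """
    p_obs = uniform_propensity(observed)
    seen = [
        r
        for r_row, o_row in zip(ratings, observed)
        for r, o in zip(r_row, o_row)
        if o == 1
    ]
    return {
        r: (seen.count(r) / len(seen)) * p_obs / p_r
        for r, p_r in rating_prior.items()
    }


if __name__ == "__main__":
    observed = [[1, 0, 1, 0], [1, 1, 1, 1]]
    ratings = [[5, 0, 1, 0], [5, 5, 1, 5]]  # 0 marks an unobserved entry
    print(uniform_propensity(observed))
    print(user_propensity(observed))
    print(item_propensity(observed))
    print(naive_bayes_propensity(ratings, observed, {1: 0.5, 5: 0.5}))
```

In an inverse-propensity-scored loss, each observed rating's error is divided by its estimated propensity, so the estimators differ only in how finely they resolve the observation probability.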

These commands reproduce the real-world experiments reported in Section 5 of the paper. The tuned hyperparameters for all models can be found in ./hyper_params.yaml.
(Adding the -t option to the above commands re-runs the hyperparameter tuning procedure with Optuna.)
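The two shell loops and the -t option can also be driven from a single Python script. The sketch below is a hypothetical convenience wrapper, not part of the repository: `build_commands` and `run_all` are illustrative names, and only `main.py` and the `-d`, `-m`, and `-t` options come from the text above. It defaults to a dry run that prints the commands instead of executing them.

```python
import subprocess
import sys

# Model names as listed in the shell loops above.
BASE_MODELS = ["uniform", "user", "item", "both", "nb", "nb_true"]


def build_commands(datasets=("coat", "yahoo"), tune=False):
    """Build one main.py command per dataset and model, with and without -at."""
    commands = []
    for data in datasets:
        for base in BASE_MODELS:
            for model in (base, base + "-at"):
                cmd = [sys.executable, "main.py", "-d", data, "-m", model]
                if tune:
                    cmd.append("-t")  # re-run hyperparameter tuning
                commands.append(cmd)
    return commands


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # Must be started from the ./src/ directory, as in the instructions.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_all(build_commands(), dry_run=True)
```

Calling `run_all(build_commands(tune=True), dry_run=False)` from ./src/ would launch every run with tuning enabled, which is likely to take considerably longer than using the tuned values in ./hyper_params.yaml.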

Once the simulations have finished running, the summarized results can be obtained by running the following command in the ./src/ directory:

python summarize_results.py -d coat yahoo

This creates ./paper_results/.
