GAN-based Matrix Factorization for Recommender Systems

Overview

[Figure: GANMF architecture]

This repository contains the dataset splits, the source code of the experiments, and their results for the paper "GAN-based Matrix Factorization for Recommender Systems" (arXiv: https://arxiv.org/abs/2201.08042), accepted at the 37th ACM/SIGAPP Symposium on Applied Computing (SAC '22).

How to use this repo

This repo is based on a version of Recsys_Course_AT_PoliMi. In order to run the code and experiments, you first need to set up a Python environment. Any environment manager will work, but we suggest conda since it makes it easier to recreate our environment when using a GPU: conda can handle the installation of the CUDA toolkit and cuDNN needed to utilize available GPU(s). We highly recommend running this repo with a GPU since GAN-based recommenders require long training times.

Conda

Run the following command to create a new environment with Python 3.6.8 and install all requirements in file conda_requirements.txt:

conda create -n <name-env> python==3.6.8 --file conda_requirements.txt
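For example, with a hypothetical environment name ganmf:

conda create -n ganmf python==3.6.8 --file conda_requirements.txt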

The file conda_requirements.txt also contains the packages cudatoolkit==9.0 and cudnn==7.1.2, which are installed completely separately from any other versions you might already have and are managed by conda.

Install the following packages with pip inside the newly created environment, since they are not available in conda's main channel and the conda-forge channel only holds old versions of them:

pip install scikit-optimize==0.7.2 telegram-send==0.25

Activate the newly created environment:

conda activate <name-env>

Virtualenv & Pip

First download and install Python 3.6.8 from python.org. Then install virtualenv:

python -m pip install --user virtualenv

Now create a new environment with virtualenv (by default it will use the Python version virtualenv was installed with):

virtualenv <path-to-new-env>

Activate the new environment with:

source <path-to-new-env>/bin/activate

Install the required packages through the file pip_requirements.txt:

pip install -r pip_requirements.txt
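Putting the virtualenv steps together, a minimal end-to-end sketch (./venvs/ganmf is a hypothetical path; any directory works):

python -m virtualenv ./venvs/ganmf
source ./venvs/ganmf/bin/activate
pip install -r pip_requirements.txt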

Note that if you intend to use a GPU and install the required packages with virtualenv and pip, then you need to install cudatoolkit==9.0 and cudnn==7.1.2 separately, following the instructions for your GPU on nvidia.com.

Before running any experiment or algorithm you need to compile the Cython code used by some of the recommenders. You can compile them all with the following command:

python run_compile_all_cython.py

N.B. You need to have gcc and python3-dev installed before compiling.

N.B. Since the experiments can take a long time, the code sends notifications to your Telegram account when experiments start/end. Either configure telegram-send as described at https://pypi.org/project/telegram-send/#installation or delete the lines containing telegram-send inside RecSysExp.py.
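If you keep the notifications, a one-time interactive setup is usually enough (following the telegram-send documentation; prompts may differ between versions):

telegram-send --configure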


Running experiments

All results presented in the paper are already provided in this repository. In case you want to re-run the experiments, the steps for each of them are listed below.

Comparison with baselines [1]

In order to run all the comparisons with the baselines, use the file RecSysExp.py. First compute, for each dataset, the 5 mutually exclusive sets:

  • Training set: once the best hyperparameters of the recommender are found, it is trained one final time on this set.
  • Training set small: the recommender is first trained on this smaller training set with the aim of finding the best hyperparameters.
  • Early stopping set: validation set used to perform early stopping during hyperparameter tuning.
  • Validation set: the recommender with the current hyperparameter values is tested against this set.
  • Test set: once the best hyperparameters are found, the recommender is finally tested on this set. The results presented in the paper are the ones on this set.

Compute the splits for each dataset with the following command:

python RecSysExp.py --build-dataset <dataset-name>
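For example, to build the splits for the MovieLens 1M dataset:

python RecSysExp.py --build-dataset 1M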

To run the tuning of a recommender use the following command:

python RecSysExp.py <dataset-name> <recommender-name> [--user | --item] [<similarity-type>] 
  • dataset-name is a value among: 1M, hetrec2011, LastFM.
  • recommender-name is a value among: TopPop, PureSVD, ALS, SLIMBPR, ItemKNN, P3Alpha, CAAE, CFGAN, GANMF.
  • --user or --item is a flag used only for GAN-based recommenders. It denotes the user/item-based training procedure for the selected recommender.
  • similarity-type is a value among: cosine, jaccard, tversky, dice, euclidean, asymmetric. It is used only for the ItemKNN recommender.
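For example, assuming the argument order shown above, you can tune GANMF with the user-based procedure on 1M, or ItemKNN with cosine similarity on hetrec2011:

python RecSysExp.py 1M GANMF --user
python RecSysExp.py hetrec2011 ItemKNN cosine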

All results, best hyperparameters and dataset splits are saved in the experiments directory.


Testing on test set with best hyperparameters

In order to test each tuned recommender on the test set (which is created when tuning the hyperparameters) run the following command:

python RunBestParameters.py <dataset-name> <recommender-name> [--user | --item] [<similarity-type>] [--force] [--bp <best-params-dir>]
  • dataset-name is a value among: 1M, hetrec2011, LastFM.
  • recommender-name is a value among: TopPop, PureSVD, ALS, SLIMBPR, ItemKNN, P3Alpha, CAAE, CFGAN, GANMF.
  • --user or --item is a flag used only for GAN-based recommenders. It denotes the user/item-based training procedure for the selected recommender.
  • similarity-type is a value among: cosine, jaccard, tversky, dice, euclidean, asymmetric. It is used only for the ItemKNN recommender.
  • --force is a flag that forces the computation of the results on the test set. By default, if the result for the tuple (dataset, recommender) exists in the test_results directory, the computation is not performed.
  • --bp sets the directory where the best parameters (best_params.pkl) are located for this combination of (dataset, recommender); by default this is the experiments directory.
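For example, to evaluate the tuned item-based GANMF on LastFM, recomputing the results even if they already exist:

python RunBestParameters.py LastFM GANMF --item --force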

The results are saved in the test_results directory.


Ablation study

To run the ablation study, use the script AblationStudy.py as follows:

python AblationStudy.py <dataset-name> [binGANMF | feature-matching [--user | --item]]
  • dataset-name is a value among: 1M, hetrec2011, LastFM.
  • binGANMF runs the first ablation study: the GANMF model with a binary classifier discriminator. This tunes the recommender with RecSysExp.py and then evaluates it with RunBestParameters.py on the test set.
  • --user or --item is a flag that sets the training procedure for binGANMF recommender.
  • feature-matching runs the second ablation study: the effect of the feature matching loss and the user-user similarity heatmaps. The results are saved in the feature_matching directory.
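For example, following the options above:

python AblationStudy.py 1M binGANMF --user
python AblationStudy.py LastFM feature-matching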

MF model of GANMF

To run the qualitative study on the MF learned by GANMF, use the script MFLearned.py as follows:

python MFLearned.py

It executes both experiments and the results are saved in the latent_factors directory.

Footnotes

  [1] For the baselines Top Popular, PureSVD, ALS, SLIMBPR, ItemKNN, P3Alpha, and for model evaluation, we used implementations from Recsys_Course_AT_PoliMi.
