GAN-based Matrix Factorization for Recommender Systems

Overview

This repository contains the dataset splits, the source code of the experiments, and their results for the paper "GAN-based Matrix Factorization for Recommender Systems" (arXiv: https://arxiv.org/abs/2201.08042), accepted at the 37th ACM/SIGAPP Symposium on Applied Computing (SAC '22).

How to use this repo

This repo is based on a version of Recsys_Course_AT_PoliMi. To run the code and experiments, you first need to set up a Python environment. Any environment manager will work, but we suggest conda, since it makes it easier to recreate our environment when using a GPU: conda can help with installing the CUDA toolkit and cuDNN needed to use the available GPU(s). We highly recommend running this repo with a GPU, since GAN-based recommenders require long training times.

Conda

Run the following command to create a new environment with Python 3.6.8 and install all the requirements listed in conda_requirements.txt:

conda create -n <name-env> python==3.6.8 --file conda_requirements.txt

The file conda_requirements.txt also lists the packages cudatoolkit==9.0 and cudnn==7.1.2. These are managed by conda and are installed completely separately from any other versions you might already have installed.

Activate the newly created environment:

conda activate <name-env>

Then install the following packages with pip inside the activated environment, since they are not found in the main channel of conda and the conda-forge channel only holds old versions of them:

pip install scikit-optimize==0.7.2 telegram-send==0.25

Virtualenv & Pip

First download and install Python 3.6.8 from python.org. Then install virtualenv:

python -m pip install --user virtualenv

Now create a new environment with virtualenv (by default it will use the Python version it was installed with):

virtualenv <name-env> <path-to-new-env>

Activate the new environment with:

source <path-to-new-env>/bin/activate

Install the required packages through the file pip_requirements.txt:

pip install -r pip_requirements.txt

Note that if you intend to use a GPU and install the required packages with virtualenv and pip, you need to install cudatoolkit==9.0 and cudnn==7.1.2 separately, following the instructions for your GPU on nvidia.com.

Before running any experiment or algorithm, you need to compile the Cython code that is part of some of the recommenders. You can compile all of it with the following command:

python run_compile_all_cython.py

N.B. You need the following packages installed before compiling: gcc and python3-dev.

N.B. Since the experiments can take a long time, the code notifies you on your Telegram account when the experiments start and end. Either configure telegram-send as described at https://pypi.org/project/telegram-send/#installation or delete the lines containing telegram-send inside RecSysExp.py.


Running experiments

All results presented in the paper are already provided in this repository. If you want to re-run the experiments, the steps for each of them are given below.

Comparison with baselines [1]

To run all the comparisons with the baselines, use the script RecSysExp.py. First, compute the 5 mutually exclusive sets for each dataset:

  • Training set: once the best hyperparameters of the recommender are found, it is trained one final time on this set.
    • Training set small: the recommender is first trained on this smaller training set in order to find the best hyperparameters.
    • Early stopping set: validation set used to apply early stopping during hyperparameter tuning.
    • Validation set: the recommender with the current hyperparameter values is evaluated against this set.
  • Test set: once the best hyperparameters are found, the recommender is finally tested on this set. The results presented are the ones on this set.
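
As a rough illustration of how such sets relate to each other, the sketch below splits a list of (user, item) interactions. It follows one plausible reading of the nested list above, namely that the three tuning sets are carved out of the full training set. The function name, the fractions, and the purely random split are assumptions made for this example; they are not taken from RecSysExp.py.

```python
import random


def split_interactions(interactions, test_frac=0.2, es_frac=0.1,
                       val_frac=0.1, seed=42):
    """Split (user, item) pairs into the sets described above.

    All fractions and the random strategy are illustrative assumptions,
    not the values or the procedure used by the repository.
    """
    rng = random.Random(seed)
    shuffled = list(interactions)
    rng.shuffle(shuffled)

    # Hold out the test set first; everything else is the full training set.
    n_test = int(len(shuffled) * test_frac)
    test, train = shuffled[:n_test], shuffled[n_test:]

    # Divide the full training set further for hyperparameter tuning.
    n_es = int(len(train) * es_frac)
    n_val = int(len(train) * val_frac)
    early_stopping = train[:n_es]
    validation = train[n_es:n_es + n_val]
    train_small = train[n_es + n_val:]

    return {
        "train": train,
        "train_small": train_small,
        "early_stopping": early_stopping,
        "validation": validation,
        "test": test,
    }
```

With 100 interactions and the default fractions, this yields 20 test interactions and 80 training interactions, the latter divided into 64, 8, and 8 for the small training, early stopping, and validation sets.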

Compute the splits for each dataset with the following command:

python RecSysExp.py --build-dataset <dataset-name>

To run the tuning of a recommender use the following command:

python RecSysExp.py <dataset-name> <recommender-name> [--user | --item] [<similarity-type>] 
  • dataset-name is a value among: 1M, hetrec2011, LastFM.
  • recommender-name is a value among: TopPop, PureSVD, ALS, SLIMBPR, ItemKNN, P3Alpha, CAAE, CFGAN, GANMF.
  • --user or --item is a flag used only for GAN-based recommenders. It denotes the user/item-based training procedure for the selected recommender.
  • similarity-type is a value among: cosine, jaccard, tversky, dice, euclidean, asymmetric. It is used only for ItemKNN recommender.
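
To tune every combination in one go, the command lines can be generated programmatically. The sketch below only builds the argument lists from the values listed above. It assumes that CAAE, CFGAN, and GANMF are the GAN-based recommenders that take --user or --item, and that the similarity type is passed as a plain positional argument; the helper name tuning_commands is hypothetical.

```python
from itertools import product

DATASETS = ["1M", "hetrec2011", "LastFM"]
# Recommenders that take neither a training-mode flag nor a similarity type.
PLAIN = ["TopPop", "PureSVD", "ALS", "SLIMBPR", "P3Alpha"]
# Assumed to be the GAN-based recommenders that accept --user / --item.
GAN_BASED = ["CAAE", "CFGAN", "GANMF"]
SIMILARITIES = ["cosine", "jaccard", "tversky", "dice", "euclidean", "asymmetric"]


def tuning_commands():
    """Return one argument list per (dataset, recommender, option) combination."""
    base = ["python", "RecSysExp.py"]
    commands = []
    for dataset in DATASETS:
        for rec in PLAIN:
            commands.append(base + [dataset, rec])
        for rec, mode in product(GAN_BASED, ["--user", "--item"]):
            commands.append(base + [dataset, rec, mode])
        for sim in SIMILARITIES:
            commands.append(base + [dataset, "ItemKNN", sim])
    return commands
```

Each returned list can be passed to subprocess.run, or joined with spaces and pasted into a shell script.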

All results, best hyperparameters and dataset splits are saved in the experiments directory.


Testing on test set with best hyperparameters

In order to test each tuned recommender on the test set (which is created when tuning the hyperparameters) run the following command:

python RunBestParameters.py <dataset-name> <recommender-name> [--user | --item] [<similarity-type>] [--force] [--bp <best-params-dir>]
  • dataset-name is a value among: 1M, hetrec2011, LastFM.
  • recommender-name is a value among: TopPop, PureSVD, ALS, SLIMBPR, ItemKNN, P3Alpha, CAAE, CFGAN, GANMF.
  • --user or --item is a flag used only for GAN-based recommenders. It denotes the user/item-based training procedure for the selected recommender.
  • similarity-type is a value among: cosine, jaccard, tversky, dice, euclidean, asymmetric. It is used only for ItemKNN recommender.
  • --force is a flag that forces the computation of the results on the test set. By default, if the result for the tuple (dataset, recommender) already exists in the test_results directory, the computation is not performed.
  • --bp sets the directory where the best parameters (best_params.pkl) are located for this combination of (dataset, recommender). By default, this is the experiments directory.
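
The skip-unless-forced behaviour of --force can be pictured with the minimal sketch below. Only the behaviour itself comes from the description above; the function name and the one-file-per-(dataset, recommender) naming scheme are hypothetical and do not reflect how RunBestParameters.py actually stores its results.

```python
import os


def needs_evaluation(results_dir, dataset, recommender, force=False):
    """Return True when the test-set evaluation should be (re)computed.

    The file layout below is an assumption made for illustration only.
    """
    result_file = os.path.join(
        results_dir, "{}_{}.txt".format(dataset, recommender)
    )
    # Recompute when explicitly forced, or when no stored result exists yet.
    return force or not os.path.exists(result_file)
```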

The results are saved in the test_results directory.


Ablation study

To run the ablation study, use the script AblationStudy.py as follows:

python AblationStudy.py <dataset-name> [binGANMF | feature-matching [--user | --item]]
  • dataset-name is a value among: 1M, hetrec2011, LastFM.
  • binGANMF runs the first ablation study: the GANMF model with a binary classifier discriminator. This tunes the recommender with RecSysExp.py and then evaluates it with RunBestParameters.py on the test set.
  • --user or --item is a flag that sets the training procedure for binGANMF recommender.
  • feature-matching runs the second ablation study, the effect of the feature matching loss and the user-user similarity heatmaps. The results are saved in the feature_matching directory.
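
For readers unfamiliar with user-user similarity heatmaps, the sketch below computes a pairwise similarity matrix between user profiles, which is the kind of matrix such a heatmap displays. Cosine similarity is an assumption chosen for this example, and the function names are hypothetical; AblationStudy.py may compute and plot the similarities differently.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equally long numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # Convention for empty profiles: no similarity.
    return dot / (norm_u * norm_v)


def user_user_similarity(profiles):
    """Return the square matrix of pairwise similarities between user profiles."""
    return [[cosine_similarity(u, v) for v in profiles] for u in profiles]
```

Passing the resulting matrix to a plotting function such as matplotlib's imshow would render it as a heatmap.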

MF model of GANMF

To run the qualitative study on the MF learned by GANMF, use the script MFLearned.py as follows:

python MFLearned.py

It executes both experiments and the results are saved in the latent_factors directory.

Footnotes

  1. For the baselines Top Popular, PureSVD, ALS, SLIMBPR, ItemKNN, P3Alpha and model evaluation we have used implementations from Recsys_Course_AT_PoliMi.

Owner
Ervin Dervishaj