Official repository for "Exploiting Session Information in BERT-based Session-aware Sequential Recommendation", SIGIR 2022 short.

Last update: Dec 13, 2022

Overview

Session-aware BERT4Rec

Everything in the paper is implemented (including vanilla BERT4Rec and SASRec), and can be reproduced.

Usage

1. Build Docker

./scripts/build.sh

2. Download dataset

Download the datasets into a directory of your choice, such as ./roughs.

For the Steam dataset, use version 2.

Rename the datasets: 'ml1m' for MovieLens-1M, 'ml20m' for MovieLens-20M, 'steam2' for Steam.

3. Preprocess

--rough_root: for original dataset files
--data_root: for processed data files

python preprocess.py prepare ml1m --data_root ./data --rough_root ./roughs
python preprocess.py prepare ml20m --data_root ./data --rough_root ./roughs
python preprocess.py prepare steam2 --data_root ./data --rough_root ./roughs

For some stats:

python preprocess.py count stats --data_root ./data --rough_root ./roughs > dstats.tsv

4. Run

See the default configuration settings in entry.py.

To override them, create a directory under runs/, such as ./runs/ml1m/bert4rec/vanilla/, and put a config.json file in it.
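
As a rough illustration of that step, the sketch below creates a run directory and writes a config.json into it. The helper name write_run_config and the key "some_option" are hypothetical; the real option names are the ones defined in entry.py, and the demo writes into a temporary directory rather than the repository's runs/ folder.

```python
import json
import tempfile
from pathlib import Path


def write_run_config(runs_root, run_name, overrides):
    """Create runs_root/run_name/ and write the overrides to config.json."""
    run_dir = Path(runs_root) / run_name
    run_dir.mkdir(parents=True, exist_ok=True)
    config_path = run_dir / "config.json"
    with config_path.open("w", encoding="utf-8") as f:
        json.dump(overrides, f, indent=2)
    return config_path


# Hypothetical key: replace it with option names that entry.py actually defines.
example_overrides = {"some_option": 123}

# Demo only: use a temporary directory instead of the repository's ./runs.
demo_root = Path(tempfile.mkdtemp()) / "runs"
config_path = write_run_config(demo_root, "ml1m/bert4rec/vanilla", example_overrides)
print(config_path)
```

In the repository itself, runs_root would be ./runs, and the run name is the same path that is later passed to entry.py.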

Sample Run Script

My x0.sh file that uses GPU No. 0:

runpy () {
    docker run \
        -it \
        --rm \
        --init \
        --gpus '"device=0"' \
        --shm-size 16G \
        --volume="$HOME/.cache/torch:/root/.cache/torch" \
        --volume="$PWD:/workspace" \
        session-aware-bert4rec \
        python "$@"
}

runpy entry.py ml1m/bert4rec/vanilla

Terminology

The df_ prefix always denotes a pandas DataFrame.

uid (str|int): User ID (unique).
iid (str|int): Item ID (unique).
sid (str|int): Session ID (unique), used only for session separation.
uindex (int): integer index mapped from the User ID, ranging from 1 to n.
iindex (int): integer index mapped from the Item ID, ranging from 1 to m.
timestamp (int): UNIX timestamp.
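
The relationship between raw IDs and mapped indices can be sketched as follows. This is a minimal illustration, not the code in preprocess.py: the function name build_index_map is made up, and assigning indices in first-seen order is an assumption, since the ordering used by the actual preprocessing is not stated here.

```python
def build_index_map(ids):
    """Map each distinct ID to an integer index starting at 1, in first-seen order."""
    mapping = {}
    for raw_id in ids:
        if raw_id not in mapping:
            # Indices start at 1, matching the 1 ~ n range described above.
            mapping[raw_id] = len(mapping) + 1
    return mapping


# Toy user IDs; real uids may be strings or integers.
uids = ["u42", "u7", "u42", "u13"]
uid2uindex = build_index_map(uids)
print(uid2uindex)  # {'u42': 1, 'u7': 2, 'u13': 3}
```

The same function would apply to item IDs to produce an iid-to-iindex mapping.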

Data Files

After preprocessing, each data/:dataset_name/ directory contains the following files.

uid2uindex.pkl (dict): {uid → uindex}.
iid2iindex.pkl (dict): {iid → iindex}.
df_rows.pkl (df): DataFrame with columns (uindex, iindex, sid, timestamp) and no index.
train.pkl (dict): {uindex → [list of (iindex, sid, timestamp)]}.
valid.pkl (dict): {uindex → [list of (iindex, sid, timestamp)]}.
test.pkl (dict): {uindex → [list of (iindex, sid, timestamp)]}.
ns_random.pkl (dict): {uindex → [list of iindex]}.
ns_popular.pkl (dict): {uindex → [list of iindex]}.
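
To show how these files might be consumed, here is a small sketch that loads a train.pkl-shaped dictionary and splits each user's interactions into sessions by sid. It uses a toy stand-in file written to a temporary directory. The helper names load_pickle and split_sessions are illustrative, and the sketch assumes each user's rows are already in chronological order so that rows of the same session are adjacent.

```python
import pickle
import tempfile
from itertools import groupby
from pathlib import Path


def load_pickle(path):
    """Load one of the preprocessed .pkl files."""
    with open(path, "rb") as f:
        return pickle.load(f)


def split_sessions(rows):
    """Group consecutive (iindex, sid, timestamp) rows that share a sid."""
    return [
        [iindex for iindex, _, _ in group]
        for _, group in groupby(rows, key=lambda row: row[1])
    ]


# Toy stand-in for data/<dataset_name>/train.pkl: {uindex: [(iindex, sid, timestamp)]}.
toy_train = {1: [(10, "s1", 100), (11, "s1", 105), (12, "s2", 900)]}
data_dir = Path(tempfile.mkdtemp())
with open(data_dir / "train.pkl", "wb") as f:
    pickle.dump(toy_train, f)

train = load_pickle(data_dir / "train.pkl")
sessions = {uindex: split_sessions(rows) for uindex, rows in train.items()}
print(sessions)  # {1: [[10, 11], [12]]}
```

Only load pickle files that you generated yourself, since unpickling untrusted data can execute arbitrary code.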

Owner

Jamie J. Seol