Graph Neural Network based Social Recommendation Model. SIGIR2019.

Overview

Basic Information:

This code is released for the papers:

Le Wu, Peijie Sun, Yanjie Fu, Richang Hong, Xiting Wang and Meng Wang. A Neural Influence Diffusion Model for Social Recommendation. Accepted by SIGIR 2019.
Le Wu, Junwei Li, Peijie Sun, Richang Hong, Yong Ge, and Meng Wang. DiffNet++: A Neural Influence and Interest Diffusion Network for Social Recommendation. Accepted by IEEE Transactions on Knowledge and Data Engineering in Dec 2020.

Usage:

  1. Environment: this code has been tested with Python 2.7 and tensorflow-gpu 1.12.0.
  2. Run DiffNet:
    1. Download the yelp data from this link, and unzip its directories into the diffnet sub-directory of your local clone of the repository.
    2. cd into the diffnet sub-directory and run: python entry.py --data_name=<data_name> --model_name=diffnet --gpu=<gpu id>
  3. Run DiffNet++:
    1. Download the datasets from this link, and put the downloaded 'data' folder in the diffnet++ sub-directory of your local clone of the repository.
    2. cd into the diffnet++ sub-directory and run: python entry.py --data_name=<data_name> --model_name=diffnetplus --gpu=<gpu id>
  4. If a GPU device is available, you can specify its id with --gpu; otherwise, simply omit the --gpu option.

The following are example commands:
python entry.py --data_name=yelp --model_name=diffnet
python entry.py --data_name=yelp --model_name=diffnetplus
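
For readers porting the scripts, here is a minimal sketch of how the three flags used in the commands above could be parsed. The parser below is hypothetical: the flag names come from the commands, but the defaults, the allowed choices, and the function names `build_parser` and `parse_args` are assumptions, and the repository's entry.py may handle them differently.

```python
import argparse


def build_parser():
    # Hypothetical parser that mirrors the flags shown in the usage commands.
    # The real entry.py may define these options differently.
    parser = argparse.ArgumentParser(description="Run DiffNet or DiffNet++")
    parser.add_argument("--data_name", required=True,
                        help="dataset folder name, for example yelp")
    parser.add_argument("--model_name", required=True,
                        choices=["diffnet", "diffnetplus"])
    # Assumption: leaving out --gpu means no specific device is requested.
    parser.add_argument("--gpu", default=None, help="GPU id (optional)")
    return parser


def parse_args(argv):
    return build_parser().parse_args(argv)


args = parse_args(["--data_name=yelp", "--model_name=diffnet"])
print(args.data_name, args.model_name, args.gpu)
```

Keeping `--gpu` optional matches the note in the usage list that the GPU id can be omitted.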

Citation:

The flickr dataset we use is from this paper:
 @article{HASC2019,
  title={A Hierarchical Attention Model for Social Contextual Image Recommendation},
  author={Le, Wu and Lei, Chen and Richang, Hong and Yanjie, Fu and Xing, Xie and Meng, Wang},
  journal={IEEE Transactions on Knowledge and Data Engineering},
  year={2019}
 }

 The algorithm is from DiffNet and DiffNet++:
 @inproceedings{DiffNet2019,
 title={A Neural Influence Diffusion Model for Social Recommendation},
 author={Le Wu and Peijie Sun and Yanjie Fu and Richang Hong and Xiting Wang and Meng Wang},
 booktitle={42nd International ACM SIGIR Conference on Research and Development in Information Retrieval},
 year={2019}
 }

 @article{wu2020diffnet++,
  title={DiffNet++: A Neural Influence and Interest Diffusion Network for Social Recommendation},
  author={Wu, Le and Li, Junwei and Sun, Peijie and Hong, Richang and Ge, Yong and Wang, Meng},
  journal={arXiv preprint arXiv:2002.00844},
  year={2020}
 }
 
 We use the key technique from the following paper to tackle the graph oversmoothing issue, and have annotated
 the change at line 114 of diffnet/diffnet.py. If you want to know more details, please refer to:
 @inproceedings{
 title={Revisiting Graph based Collaborative Filtering: A Linear Residual Graph Convolutional Network Approach},
 author={Lei Chen, Le Wu, Richang Hong, Kun Zhang, Meng Wang},
 conference={The 34th AAAI Conference on Artificial Intelligence (AAAI 2020)},
 year={2020}
 }

Author contact:

Email: [email protected], [email protected]

Comments
  • How do you set the parameter for SVD++?

    Hi, when I run SVD++ on yelp with self.final_user_embedding = self.user_embedding + user_embedding_from_consumed_items, the best performance over 300 epochs only reaches hr: 0.1282, ndcg: 0.0770.

    opened by ll0ruc 7
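
To make the expression in this question concrete, here is a minimal sketch of an SVD++-style user representation: the user's free embedding plus a pooled embedding of the items the user consumed. Mean pooling is an assumption made for illustration (the original SVD++ formulation scales the sum by the inverse square root of the number of consumed items), and the helper names are hypothetical, not taken from the repository.

```python
def mean_rows(rows, dim):
    """Average a list of equal-length vectors; return zeros if the list is empty."""
    if not rows:
        return [0.0] * dim
    return [sum(col) / len(rows) for col in zip(*rows)]


def svdpp_user_embedding(user_vec, consumed_item_vecs):
    # final_user_embedding = user_embedding + user_embedding_from_consumed_items
    # Assumption: the consumed items are mean-pooled.
    pooled = mean_rows(consumed_item_vecs, len(user_vec))
    return [u + p for u, p in zip(user_vec, pooled)]


def predict(user_vec, item_vec):
    # Preference score as the inner product of user and item embeddings.
    return sum(u * i for u, i in zip(user_vec, item_vec))


final_user = svdpp_user_embedding([1.0, 0.0], [[1.0, 2.0], [3.0, 4.0]])
print(final_user, predict(final_user, [1.0, 1.0]))
```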
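  • Social diffusion process

    Hello! I don't fully understand the social diffusion process. In the code, first_gcn_user_embedding = self.generateUserEmbeddingFromSocialNeighbors(self.fusion_user_embedding) and second_gcn_user_embedding = self.generateUserEmbeddingFromSocialNeighbors(first_gcn_user_embedding) should represent a two-layer diffusion process. Can this be understood as each layer multiplying the previous layer's output by the social relation matrix? If so, doesn't that weaken the influence of social relations?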

    opened by lin171110 5
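
The layer-by-layer reading in this question can be sketched as repeated multiplication by a row-normalized social matrix. This is an illustrative simplification under stated assumptions: each layer is a plain mean over followed users, and the function names are hypothetical. The repository's generateUserEmbeddingFromSocialNeighbors may include additional terms.

```python
def row_normalize(adj):
    """Divide each row by its sum so that a layer averages over neighbors."""
    out = []
    for row in adj:
        total = sum(row)
        out.append([v / total if total else 0.0 for v in row])
    return out


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def diffuse(social, user_emb, num_layers):
    # Layer k+1 = normalized social matrix times layer k (assumed form).
    s = row_normalize(social)
    layers = [user_emb]
    for _ in range(num_layers):
        layers.append(matmul(s, layers[-1]))
    return layers


# User 0 follows user 1, user 1 follows user 2, user 2 follows nobody.
social = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
emb = [[1.0], [2.0], [4.0]]
layers = diffuse(social, emb, 2)
print(layers[1])  # one-hop aggregation
print(layers[2])  # two-hop aggregation: user 0 now sees user 2's embedding
```

In this toy chain, the second layer carries user 2's embedding to user 0, which is how stacking layers reaches friends of friends. Setting `num_layers` to 0 returns only the input embeddings.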
  • About the model performance of diffnetplus

    Hi, I ran the diffnetplus code on the yelp data and got HR=0.3676, NDCG=0.2274, which is consistent with the performance in the paper. When I remove the user and item features by setting self.fusion_user_embedding = self.user_embedding and self.fusion_item_embedding = self.item_embedding, I get HR=0.3104, NDCG=0.1877, which is lower than DiffNet++-nf in the paper. Are any parameters adjusted when the features are removed?

    opened by ll0ruc 3
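
The HR and NDCG figures quoted in these reports are top-K ranking metrics. The sketch below shows one common way to compute them, assuming a single held-out positive item ranked against sampled negatives. The repository's evaluation code may differ, for example by handling several positives per user, and the function name `hr_ndcg_at_k` is hypothetical.

```python
import math


def hr_ndcg_at_k(pos_score, neg_scores, k):
    """Return (hit ratio, NDCG) at cutoff k for one positive among negatives."""
    # Zero-based rank of the positive: negatives scored strictly higher.
    rank = sum(1 for s in neg_scores if s > pos_score)
    if rank >= k:
        return 0.0, 0.0
    # With a single positive, the ideal DCG is 1, so NDCG = 1 / log2(rank + 2).
    return 1.0, 1.0 / math.log2(rank + 2)


hr, ndcg = hr_ndcg_at_k(0.9, [0.1, 0.95, 0.3], k=2)
print(hr, round(ndcg, 4))
```

Averaging these per-user values over all test users gives dataset-level HR@K and NDCG@K.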
  • Pretraining details

    Hi Peijie, thank you for open sourcing the models. I am trying to re-implement diffnet in pytorch and it would be a great help if you could let me know the details of model pretraining. Specifically, how was the model by the name "diffnet_hr_0.3437_ndcg_0.2092_epoch_98.ckpt" trained? Thanks in advance.

    opened by YH-UtMSB 3
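  • The way negative samples are generated for the test set seems to have a bug

    Hello!

    In the generateEvaNegative function in DataModule.py, the negative samples randomly generated for a user in the test set should avoid that user's positive samples in both the training set and the test set. However, hash_data in generateEvaNegative only indicates whether the current sample is a positive sample in the test set. As a result, a positive sample from the training set may be sampled as a negative sample in the test set, and the model's actual performance will be underestimated. Is this a bug?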

    opened by ZikaiGuo 3
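
The behavior this report asks for can be sketched as follows: evaluation negatives are drawn uniformly from items that appear in neither the user's training positives nor the user's test positives. This is a hypothetical stand-alone helper for illustration, not the repository's generateEvaNegative, and its name and arguments are assumptions.

```python
import random


def sample_eval_negatives(num_items, train_pos, test_pos, num_neg, rng):
    """Sample num_neg distinct item ids outside the user's known positives."""
    excluded = set(train_pos) | set(test_pos)
    if num_neg > num_items - len(excluded):
        raise ValueError("not enough candidate items to sample from")
    negatives = set()
    while len(negatives) < num_neg:
        item = rng.randrange(num_items)
        # Skip items the user interacted with in either split.
        if item not in excluded:
            negatives.add(item)
    return sorted(negatives)


rng = random.Random(0)
negs = sample_eval_negatives(10, train_pos={0, 1, 2}, test_pos={3},
                             num_neg=5, rng=rng)
print(negs)
```

Passing a seeded `random.Random` keeps the sampled negatives reproducible across evaluation runs.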
  • About Yelp Dataset

    Hello! Can you provide the code for processing the Yelp dataset? I want to know the relationship between the id in the data set you provided and the original data set. I really appreciate your help

    opened by Archerxzs 2
  • Some implementation problems

    Thanks for providing the source code for your work diffnet and diffnet++.

    Since I am a pytorch user, I want to reimplement your works with pytorch and more recent python version, so that more researchers can compare their works with yours. However, I was confused by some of your implementation details:

    1. In DataModule.py, you use generateConsumedItemsSparseMatrix() to build the user-item graph. As I understand it, the train/valid/test splits each correspond to a data file, and the user-item graph is built from that file. This causes a severe problem: during testing, the model has access to all the true test interactions, so it answers questions for which it already has the ground truth. The training process has the same data leakage problem.
    2. In your paper 'A Neural Influence Diffusion Model for Social Recommendation', eq-4, you say you use a regularization parameter to control the complexity of the user and item free embedding matrices. But in diffnet.py, you seem to compute only the MSE loss between the ground truth and your predictions.
    opened by ty4b112 2
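
For the second point, here is a minimal sketch of what adding an L2 penalty on the free embedding matrices to a squared-error loss could look like. The exact objective in the paper is not reproduced here. The weight `lam`, the function names, and the choice of a plain sum of squares are assumptions for illustration.

```python
def mse(preds, labels):
    return sum((p - y) ** 2 for p, y in zip(preds, labels)) / len(preds)


def l2_penalty(matrix):
    # Sum of squared entries of an embedding matrix (list of rows).
    return sum(v * v for row in matrix for v in row)


def regularized_loss(preds, labels, user_emb, item_emb, lam):
    # Prediction error plus lam times the size of the free embeddings.
    return mse(preds, labels) + lam * (l2_penalty(user_emb) + l2_penalty(item_emb))


loss = regularized_loss([1.0, 0.0], [1.0, 1.0],
                        user_emb=[[1.0, 2.0]], item_emb=[[3.0]], lam=0.1)
print(loss)
```

With `lam` set to 0 this reduces to the plain MSE loss described in the report.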
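  • Are user features and item features required?

    Hello!

    I want to use diffnet on datasets that have no user features or item features. I changed the feature of every user and item to a very short zero vector (for example [0,0,0,0,0]). According to the formulas in the paper, after this change the user and item features should have no effect, while the social information should still take effect. However, with this change the train loss, val loss, and test loss all become nan during training, and the model cannot be trained properly.

    Is this a feasible way to remove the extra feature information? Have you tried modifying diffnet for datasets without user features and item features? Thank you!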

    opened by ZikaiGuo 2
  • about effect of validation dataset

    In the training phase, why do you calculate the loss on the validation set and the test set in each epoch? Isn't gradient descent performed only on the training set?

    opened by Cinderella1001 1
  • Consultation on the dataset preprocessing part

    I wonder how the datasets item_vector.npy (dimension (38342, 150)) and user_vector.npy (dimension (17237, 150)) were obtained. Looking forward to your reply; I would be very grateful.

    opened by Cinderella1001 1
  • Bump paddlepaddle from 2.1.3 to 2.4.0 in /diffnet-paddlepaddle

    Bumps paddlepaddle from 2.1.3 to 2.4.0.

    Release notes

    Sourced from paddlepaddle's releases.

    PaddlePaddle 2.4.0 Release Note

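    1. Important updates

    • The new dynamic graph architecture officially takes effect: the new dynamic graph framework greatly improves scheduling performance. More than 90% of APIs see scheduling performance improve by over 50%, and more than 50% of suite models see performance improve by over 5%. The functional architecture is clearer, and secondary development capability and experience are significantly enhanced.

    • Comprehensively improved dynamic-static unification in PaddlePaddle: dynamic-to-static conversion supports richer Python syntax, with Python syntax coverage reaching 90%. The syntax transcription logic has been optimized, control-flow syntax is fully supported, and one-click conversion to a static graph is smoother. With the newly upgraded static graph executor, dynamic-to-static training gains better acceleration, and tests on key models show it approaching the best static-graph level. Dynamic-to-static extensibility is improved: multi-function merged export and inference are newly supported, and users can use the PHI operator library for secondary development and flexible deployment, which effectively supports custom decoding for the U2++ model in the speech domain.

    • New sparse computing APIs: 55 new sparse APIs under paddle.sparse.* support mainstream sparse computing scenarios and have been applied to sparse training and inference deployment for tasks such as 3D point cloud object detection and Sparse Transformers. In high-sparsity scenarios they are 105.75% faster than using DenseTensor, and 4.01%~58.55% faster than sparse computation in comparable products. Computation on multiple sparse Tensor types (SparseCoo, SparseCsr, etc.) is supported, which saves GPU memory, while the usage stays consistent with the dense Tensor APIs.

    • Large-scale graph neural network GPU training engine: heterogeneous hierarchical storage across SSD, memory, and GPU memory breaks the GPU memory bottleneck and supports all-GPU storage and training of ultra-large graphs. It implements an all-GPU integrated solution for walking, sampling, and training; compared with traditional distributed CPU solutions, training is 10+ times faster at the same cost.

    • Environment adaptation: added pre-compiled installation packages for CUDA 11.7, and added support for running on Ubuntu 22.04 and above.

    Forward-looking notice

    • PaddlePaddle will drop support for python 3.6 in version 2.5.
    • PaddlePaddle will gradually deprecate the APIs under the python-side paddle.fluid namespace; in version 2.5, some APIs under this namespace will be removed directly.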

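    2. Incompatible upgrades

    • Removed the pre-compiled installation packages for CUDA 10.1.
    • The Tensor.clear_gradient(bool set_to_zero) interface no longer accepts values passed via kwargs; the set_to_zero bool can only be passed via args.
    • To improve GPU memory efficiency, dynamic graph mode by default retains only the gradients of forward leaf node variables, such as the gradients of network parameters during training, and no longer retains the gradients of non-leaf nodes by default. To retain the gradient of a specific Tensor, call the Tensor.retain_grads() interface before running backward.
    • paddle.autograd.PyLayer no longer supports tuple inputs; to pass a group of Tensors, pass a list of Tensor.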

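    3. Training framework (including distributed)

    (1) New APIs and enhanced API functionality

    • New sparse computing APIs: paddle.sparse

    • New speech-domain APIs: paddle.audio

      • Added feature extraction APIs such as MFCC, Spectrogram, and LogMelSpectrogram with GPU support; processing is more than 15x faster than the CPU implementation, which can greatly improve GPU utilization in speech model training. #45424
      • Added basic feature extraction APIs such as window functions and the discrete cosine transform, making it easier for users to customize speech feature extraction. #45424
      • Added a speech IO module that provides 2 audio I/O backends and supports 6 codecs, for convenient loading of speech data. #45939
      • Added the TESS and ESC50 speech classification datasets, making it easier for users to build classic speech classification models. #45939
    • New graph learning APIs: paddle.geometric

      • Graph learning is gradually becoming a key technology in machine learning; the new paddle.geometric module provides a better experience for graph learning modeling and training.
        • Message passing: the message passing mechanism is the foundation of graph modeling, so 7 new graph learning message passing APIs were added to make graph learning modeling more convenient. Among them, 3 new fused message passing operators can greatly reduce GPU memory usage in graph model training; in dense graph scenarios, GCN-family models can save 50%+ GPU memory and train 20%+ faster. #44848, #44580, #43174, #44970
        • Graph sampling: graph sampling is the performance bottleneck of graph model training, so high-performance graph sampling operators were added with support for high-concurrency graph sampling. GraphSage sampling can be more than 32x faster, and model training more than 12x faster. #44970
    • New vision-domain APIs

      • paddle.vision adds the object detection operators paddle.vision.distribute_fpn_proposals(#43736), paddle.vision.generate_proposals(#43611), paddle.vision.matrix_nms(#44357), paddle.vision.prior_box and paddle.vision.box_coder(#47282).
    • Enhanced API functionality

      • Added large batch_size computation for BatchNorm1D #43072
    • Improved collective communication distributed training APIs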

    ... (truncated)

    dependencies 
    opened by dependabot[bot] 0
  • About diffnet with zero gcn layers

    Hi, I ran diffnet with different numbers K of gcn layers. With K = 0, self.final_user_embedding = self.fusion_user_embedding + user_embedding_from_consumed_items, with the other components fixed. On yelp, over 300 epochs, the best performance I get is HR=0.2630, NDCG=0.1415, which looks very poor.

    opened by ll0ruc 0
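  • About the computation of attention scores

    Hello! I have some questions about how the attention scores are computed when the social network and the interest network are used to update the user representation. First, [three code screenshots]. From the code above, gamma^(k+1)(a1) = 1/2 * self.consumed_items_attention and gamma^(k+1)(a2) = 1/2 * self.social_neighbors_attention. gamma^(k+1)(a1) and gamma^(k+1)(a2) are indeed implemented with a GAT that uses an MLP.

    Your paper states, [three paper screenshots], that alpha^(k+1)(ab) and beta^(k+1)(ai) are computed in a way similar to gamma, using an MLP over two embeddings. However, when I looked at how beta^(k+1)_(ai) is computed in the source code, I found that it differs from the computation of gamma: it does not seem to use the two embeddings and looks as if it were randomly generated. [code screenshot]

    Looking forward to your reply!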

    opened by Cinderella1001 2
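
The kind of computation this question refers to, an attention weight obtained by passing two embeddings through an MLP and normalizing over neighbors, can be sketched as below. This is a generic illustration under stated assumptions: one hidden ReLU layer, concatenated inputs, and softmax normalization. The weights and function names are hypothetical and are not taken from the DiffNet++ source.

```python
import math


def mlp_score(vec_a, vec_b, w_hidden, w_out):
    """Unnormalized attention score from the two concatenated embeddings."""
    x = vec_a + vec_b  # list concatenation
    hidden = [max(0.0, sum(w * v for w, v in zip(row, x))) for row in w_hidden]
    return sum(w * h for w, h in zip(w_out, hidden))


def softmax(scores):
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def neighbor_attention(center, neighbors, w_hidden, w_out):
    # One weight per neighbor; the weights sum to 1.
    return softmax([mlp_score(center, n, w_hidden, w_out) for n in neighbors])


w_hidden = [[0.5, -0.2, 0.1, 0.3], [0.0, 0.4, -0.1, 0.2]]  # 2 hidden units
w_out = [1.0, -0.5]
weights = neighbor_attention([1.0, 0.0], [[0.2, 0.9], [0.7, 0.1]], w_hidden, w_out)
print(weights)
```

In this form, each weight depends on both embeddings, which is the property the question checks for in the beta computation.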
Releases(v1.0)
  • v1.0(Jul 12, 2020)

    This code was used for the SIGIR 2019 paper, A Neural Influence Diffusion Model for Social Recommendation. You can download the paper from http://arxiv.org/abs/1904.10322. This work is mainly based on graph neural network techniques.

Owner
PeijieSun
Knowledge-aware Coupled Graph Neural Network for Social Recommendation

KCGN AAAI-2021 《Knowledge-aware Coupled Graph Neural Network for Social Recommendation》 Environments python 3.8 pytorch-1.6 DGL 0.5.3 (https://github.

xhc 22 Nov 18, 2022
Dual Graph Attention Networks for Deep Latent Representation of Multifaceted Social Effects in Recommender Systems

DANSER-WWW-19 This repository holds the codes for Dual Graph Attention Networks for Deep Latent Representation of Multifaceted Social Effects in Recom

Qitian Wu 78 Dec 10, 2022
reXmeX is recommender system evaluation metric library.

A general purpose recommender metrics library for fair evaluation.

AstraZeneca 258 Dec 22, 2022
Persine is an automated tool to study and reverse-engineer algorithmic recommendation systems.

Persine, the Persona Engine Persine is an automated tool to study and reverse-engineer algorithmic recommendation systems. It has a simple interface a

Jonathan Soma 87 Nov 29, 2022
Books Recommendation With Python

Books-Recommendation Business Problem During the last few decades, with the rise

Çağrı Karadeniz 7 Mar 12, 2022
Temporal Meta-path Guided Explainable Recommendation (WSDM2021)

Temporal Meta-path Guided Explainable Recommendation (WSDM2021) TMER Code of paper "Temporal Meta-path Guided Explainable Recommendation". Requirement

Yicong Li 13 Nov 30, 2022
A Python implementation of LightFM, a hybrid recommendation algorithm.

LightFM Build status Linux OSX (OpenMP disabled) Windows (OpenMP disabled) LightFM is a Python implementation of a number of popular recommendation al

Lyst 4.2k Jan 02, 2023
Jointly Learning Explainable Rules for Recommendation with Knowledge Graph

Jointly Learning Explainable Rules for Recommendation with Knowledge Graph

57 Nov 03, 2022
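A music player system based on personalized recommendation

MusicPlayer: a music player system based on personalized recommendation. Hi, this is the graduation project I did in my senior year, and I am now open-sourcing it. The project is built with Python's tkinter and pygame. Overall, the code is fairly poor (my skills were weak at the time). To run it, installing a few basic libraries is enough, although the data has not yet been uploaded to GitHub. First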

Cedric Niu 6 Nov 19, 2022
Code for ICML2019 Paper "Compositional Invariance Constraints for Graph Embeddings"

Dependencies NOTE: This code has been updated, if you were using this repo earlier and experienced issues that was due to an outdated codebase. Please

Avishek (Joey) Bose 43 Nov 25, 2022
Graph Neural Network based Social Recommendation Model. SIGIR2019.

Basic Information: This code is released for the papers: Le Wu, Peijie Sun, Yanjie Fu, Richang Hong, Xiting Wang and Meng Wang. A Neural Influence Dif

PeijieSun 144 Dec 29, 2022
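A large-scale recommendation algorithm library covering classic and state-of-the-art recommender system algorithms: LR, Wide&Deep, DSSM, TDM, MIND, Word2Vec, DeepWalk, SSR, GRU4Rec, Youtube_dnn, NCF, GNN, FM, FFM, DeepFM, DCN, DIN, DIEN, DLRM, MMOE, PLE, ESMM, MAML, xDeepFM, DeepFEFM, NFM, AFM, RALM, Deep Crossing, PNN, BST, AutoInt, FGCNN, FLEN, ListWise, etc.

(Chinese documentation | Simplified Chinese | English) What is a recommender system? In an era of explosive growth of information on the internet, a recommender system is the key to helping users efficiently find the information they are interested in. It is also a silver bullet for helping products attract users, retain them, increase user stickiness, and improve conversion rates. Countless excellent products have built a good reputation on recommender systems that users can perceive, and countless companies rely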

3.6k Dec 30, 2022
Accuracy-Diversity Trade-off in Recommender Systems via Graph Convolutions

Accuracy-Diversity Trade-off in Recommender Systems via Graph Convolutions This repository contains the code of the paper "Accuracy-Diversity Trade-of

2 Sep 16, 2022
An Efficient and Effective Framework for Session-based Social Recommendation

SEFrame This repository contains the code for the paper "An Efficient and Effective Framework for Session-based Social Recommendation". Requirements P

Tianwen CHEN 23 Oct 26, 2022
A Python scikit for building and analyzing recommender systems

Overview Surprise is a Python scikit for building and analyzing recommender systems that deal with explicit rating data. Surprise was designed with th

Nicolas Hug 5.7k Jan 01, 2023
Code for my ORSUM, ACM RecSys 2020, HeroGRAPH: A Heterogeneous Graph Framework for Multi-Target Cross-Domain Recommendation

HeroGRAPH Code for my ORSUM @ RecSys 2020, HeroGRAPH: A Heterogeneous Graph Framework for Multi-Target Cross-Domain Recommendation Paper, workshop pro

Qiang Cui 9 Sep 14, 2022
Group-Buying Recommendation for Social E-Commerce

Group-Buying Recommendation for Social E-Commerce This is the official implementation of the paper Group-Buying Recommendation for Social E-Commerce (

Jun Zhang 37 Nov 28, 2022
Spotify API Recommender System

This project will access your last listened songs on Spotify using its API, then it will request the user to select 5 favorite songs in that list, on which the API will proceed to make 50 recommendat

Kevin Luke 1 Dec 14, 2021
Pytorch domain library for recommendation systems

TorchRec (Experimental Release) TorchRec is a PyTorch domain library built to provide common sparsity & parallelism primitives needed for large-scale

Meta Research 1.3k Jan 05, 2023
Recommender System Papers

Included Conferences: SIGIR 2020, SIGKDD 2020, RecSys 2020, CIKM 2020, AAAI 2021, WSDM 2021, WWW 2021

RUCAIBox 704 Jan 06, 2023